Java Tutorial

The Document Object Model (DOM)

As we saw in the previous chapter, a DOM parser presents you with an object encapsulating the entire XML structure. You can then call methods belonging to this object to navigate through the document tree and process the elements and attributes in the document in whatever way you want. This is quite different to SAX as we have already noted, but nonetheless there is quite a close relationship between DOM and SAX.

The mechanism for getting access to a DOM parser is very similar to what we used to obtain a SAX parser. You start with a factory object that you obtain like this:

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();

The newInstance() method is a static method in the factory class for creating factory objects. As with SAX, this approach of dynamically creating a factory object that you then use to create a parser allows you to change the parser you are using without modifying or recompiling your code. You use the factory object to create a DocumentBuilder object that encapsulates a DOM parser:

DocumentBuilder builder = null;
try {
  builder = builderFactory.newDocumentBuilder();
} catch(ParserConfigurationException e) {
  e.printStackTrace();
}

As we shall see, when a DOM parser reads an XML document, it makes it available in its entirety as an object of type Document. The name of the class that encapsulates a DOM parser has obviously been chosen to indicate that it can also build new Document objects. A DOM parser can throw exceptions of type SAXException and parsing errors are handled in essentially the same way in DOM and SAX2. The DocumentBuilderFactory, DocumentBuilder, and ParserConfigurationException classes are all defined in the javax.xml.parsers package. Let's jump straight in and try this out for real.

Try It Out – Creating an XML Document Builder

Here's the code to create a document builder object:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class TryDOM {
  public static void main(String args[]) {
     DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
     DocumentBuilder builder = null;
     try {
       builder = builderFactory.newDocumentBuilder();
     }
     catch(ParserConfigurationException e) {
       e.printStackTrace();
       System.exit(1);
     }
     System.out.println("Builder Factory = " + builderFactory +"\nBuilder = "
                                                                      + builder);
  }
}

I got the output:

Builder Factory = org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@430b5c
Builder = org.apache.xerces.jaxp.DocumentBuilderImpl@9ed927

How It Works

The static newInstance() method in the DocumentBuilderFactory class returns a reference to a factory object. We call the newDocumentBuilder() method for the factory object to obtain a reference to a DocumentBuilder object that encapsulates a DOM parser. This will be the default parser. If we want the parser to validate the XML or provide other capabilities, we need to set the parser features before we create the DocumentBuilder object by calling methods for the DocumentBuilderFactory object.

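The class names in the output identify the parser implementation that the factory supplied, which in the output above is a version of the Xerces parser. With a different Java installation you may get another implementation, such as Crimson. Many DOM parsers are built on top of SAX parsers and this is the case with both the Crimson and Xerces parsers.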

Setting DOM Parser Features

The idea of a feature for a DOM parser is the same as with SAX – a parser option that can be either on or off. The DocumentBuilderFactory object has the following methods for setting DOM parser features:

setNamespaceAware(boolean aware)	Calling this method with a true argument sets the parser to be namespace aware. The default setting is false.
setValidating(boolean validating)	Calling this method with a true argument sets the parser to validate the XML in a document as it is parsed. The default setting is false.
setIgnoringElementContentWhitespace(boolean ignore)	Calling this method with a true argument sets the parser to remove ignorable whitespace in element content so the Document object produced by a parser will not contain ignorable whitespace. The default setting is false.
setIgnoringComments(boolean ignore)	Calling this method with a true argument sets the parser to remove comments as the document is parsed. The default setting is false.
setExpandEntityReferences(boolean expand)	Calling this method with a true argument sets the parser to expand entity references. The default setting is true.
setCoalescing(boolean coalesce)	Calling this method with a true argument sets the parser to convert CDATA sections to text and append it to any adjacent text. The default setting is false.

As you see, by default the parser that is produced is neither namespace aware nor validating. We should at least set these two features before creating our parser. This is quite simple:

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true);
builderFactory.setValidating(true);

If you add the last two statements to the example, the newDocumentBuilder() method for the factory object should now return a validating and namespace aware parser. With a validating parser, we should define an ErrorHandler object that will deal with parsing errors. You identify the ErrorHandler object to the parser by calling the setErrorHandler() method for the DocumentBuilder object:

builder.setErrorHandler(handler);

Here handler refers to an object that implements the three methods declared in the org.xml.sax.ErrorHandler interface. We discussed these in the previous chapter in the context of SAX parser error handling, and the same applies here. If you do create a validating parser, you should implement and register an ErrorHandler object. Otherwise the parser may not work properly.
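As a minimal sketch of such a handler, the class below records each problem in a list instead of printing it. The class name CollectingErrorHandler, its validate() helper, and the sample document are hypothetical and are not part of the TryDOM example; the document is read from a string through an InputSource so that no file is needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical handler: records validation problems instead of printing them.
class CollectingErrorHandler implements ErrorHandler {
  final List<String> problems = new ArrayList<>();

  @Override
  public void warning(SAXParseException e) {
    problems.add("warning: " + e.getMessage());
  }

  @Override
  public void error(SAXParseException e) {
    problems.add("error: " + e.getMessage());
  }

  @Override
  public void fatalError(SAXParseException e) throws SAXParseException {
    problems.add("fatal: " + e.getMessage());
    throw e;                        // Parsing cannot sensibly continue
  }

  // Validates xml against the DTD it carries and returns the problems found.
  static List<String> validate(String xml) {
    CollectingErrorHandler handler = new CollectingErrorHandler();
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setValidating(true);
      DocumentBuilder builder = factory.newDocumentBuilder();
      builder.setErrorHandler(handler);
      builder.parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      handler.problems.add("parse stopped: " + e.getMessage());
    }
    return handler.problems;
  }

  public static void main(String[] args) {
    // The DTD allows only a city element, so town is a validity error.
    String xml = "<!DOCTYPE address [<!ELEMENT address (city)>"
               + "<!ELEMENT city (#PCDATA)>]>"
               + "<address><town>Chicago</town></address>";
    for (String problem : validate(xml)) {
      System.out.println(problem);
    }
  }
}
```

Collecting the messages rather than printing them makes it easy for the calling code to decide whether a document with validity errors should still be processed.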

The factory object has methods to check the status of parser features corresponding to each of the setXXX() methods above. The checking methods all have corresponding names of the form isXXX(), so to check whether a parser will be namespace aware, you call the isNamespaceAware() method. Each method returns true if the parser to be created will have the feature set, and false otherwise.
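The checking methods can be exercised with a short sketch like the following. The class name FeatureCheck and the describe() helper are made up for illustration; the is...() calls are the ones that correspond to the setter methods listed earlier.

```java
import javax.xml.parsers.DocumentBuilderFactory;

class FeatureCheck {
  // Returns a one-line summary of the factory's current feature settings.
  static String describe(DocumentBuilderFactory factory) {
    return "namespaceAware=" + factory.isNamespaceAware()
         + " validating=" + factory.isValidating()
         + " ignoringWhitespace=" + factory.isIgnoringElementContentWhitespace()
         + " ignoringComments=" + factory.isIgnoringComments()
         + " expandEntityReferences=" + factory.isExpandEntityReferences()
         + " coalescing=" + factory.isCoalescing();
  }

  public static void main(String[] args) {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    System.out.println("Defaults: " + describe(factory));

    factory.setNamespaceAware(true);
    factory.setValidating(true);
    System.out.println("Changed:  " + describe(factory));
  }
}
```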

Parsing a Document

Once you have created a DocumentBuilder object, you just call its parse() method with a document source as an argument to parse a document. The parse() method will return a reference of type Document to an object that encapsulates the entire XML document. The Document interface is defined in the org.w3c.dom package.

There are five overloaded versions of the parse() method that provide various options for you to identify the source of the XML document. They all return a reference to a Document object:

parse(File file)	Parses the document in the file identified by file.
parse(String uri)	Parses the document at the URI, uri.
parse(InputSource source)	Parses the document from source.
parse(InputStream stream)	Parses the document read from the input stream, stream.
parse(InputStream stream, String systemID)	Parses the document read from the input stream, stream. The second argument, systemID, is used to resolve relative URIs.

All five versions of the parse method can throw three types of exception. An exception of type IllegalArgumentException will be thrown if you pass null to the method for the parameter that identifies the document source. The method will throw an exception of type IOException if any I/O error occurs, and of type SAXException in the event of a parsing error. Both these last exceptions must be caught. Note that it is a SAXException that can be thrown here. Exceptions of type DOMException only arise when you are navigating the tree for a Document object.
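To try the exception behavior without creating a file, you can use the parse(InputSource) version with a document held in a string. The sketch below is illustrative only: the class name ParseFromString and the rootName() helper are assumptions, and it reports each kind of failure as a short string rather than handling it properly.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ParseFromString {
  // Returns the root element name, or a short description of what went wrong.
  static String rootName(String xml) {
    try {
      DocumentBuilder builder =
                      DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getNodeName();
    } catch (SAXException e) {                 // The XML could not be parsed
      return "parsing error";
    } catch (IOException e) {                  // The source could not be read
      return "I/O error";
    } catch (ParserConfigurationException e) { // No parser could be created
      return "configuration error";
    }
  }

  public static void main(String[] args) {
    System.out.println(rootName("<address><city>Chicago</city></address>"));
    System.out.println(rootName("<address>"));   // No end tag: not well-formed
  }
}
```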

You could parse a document using the DocumentBuilder object, builder, like this:

File xmlFile = new File("D:/Beg Java Stuff/Address.xml");
Document xmlDoc = null;
try {
  xmlDoc = builder.parse(xmlFile);
} catch(SAXException e) {
  e.printStackTrace();
  System.exit(1);
} catch(IOException e) {
  e.printStackTrace();
  System.exit(1);
}

This code fragment requires imports for the File and IOException classes in the java.io package as well as the org.w3c.dom.Document class name. We can now call methods for the xmlDoc object to navigate through the elements in the document tree structure. Let's look at what the possibilities are.

Navigating a Document Object Tree

The Node interface that is defined in the org.w3c.dom package is fundamental to all objects that encapsulate components of an XML document, and this includes the Document object itself. The subinterfaces of Node that identify components of a document are:

Element	Represents an XML element.
Attr	Represents an attribute for an element.
Text	Represents text that is part of element content. This interface is a subinterface of CharacterData, which in turn is a subinterface of Node. References of type Text will therefore have methods from all three interfaces.
CDATASection	Represents a CDATA section – unparsed character data.
Comment	Represents a document comment. This interface also extends the CharacterData interface.
DocumentType	Represents the contents of a DOCTYPE declaration.
Entity	Represents an entity that may be parsed or unparsed.
EntityReference	Represents a reference to an entity.
Notation	Represents a notation declared in the DTD for a document. A notation is a definition of an unparsed entity type.
ProcessingInstruction	Represents a processing instruction for an application.

Each of these interfaces declares its own set of methods and inherits the methods declared in the Node interface. Every XML document will be modeled as a hierarchy of nodes that will be accessible as one or other of the interface types in the table above. At the top of the node hierarchy for a document will be the Document node that is returned by the parse() method. Each type of node may or may not have child nodes in the document hierarchy, and those that do can only have certain types of child node. The types of nodes in a document that can have children are as follows:

Node Type	Possible Children
Document	Element (only 1), DocumentType (only 1), Comment, ProcessingInstruction
Element	Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction
Attr	Text, EntityReference
Entity	Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction
EntityReference	Element, Text, Comment, CDATASection, EntityReference, ProcessingInstruction

Of course, what each node may have as children follows from the XML specification, not just the DOM specification. There is one other type of node that extends the Node interface – DocumentFragment. This is not formally part of a document in the sense that a node of this type is a programming convenience. It is used to house a fragment of a document – a sub-tree of elements – for use when moving fragments of a document around, for instance, so it provides a similar function to a Document node but with less overhead. A DocumentFragment node can have the same range of child nodes as an Element node.
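A brief sketch of how a DocumentFragment might be used follows. It builds a new document in memory with the newDocument() method of DocumentBuilder rather than parsing one, and the class name FragmentDemo and the element names are chosen only for illustration. The point to notice is that appending the fragment to an element moves the fragment's children into that element; the fragment node itself never becomes part of the tree.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class FragmentDemo {
  // Builds an address element whose children arrive via a DocumentFragment.
  static Element buildAddress() {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException("No DOM parser available", e);
    }
    Element root = doc.createElement("address");
    doc.appendChild(root);

    // Assemble two elements in a fragment, outside the document tree
    DocumentFragment fragment = doc.createDocumentFragment();
    Element city = doc.createElement("city");
    city.appendChild(doc.createTextNode("Chicago"));
    Element state = doc.createElement("state");
    state.appendChild(doc.createTextNode("Illinois"));
    fragment.appendChild(city);
    fragment.appendChild(state);

    root.appendChild(fragment);      // Moves city and state under address
    return root;
  }

  public static void main(String[] args) {
    Element root = buildAddress();
    for (Node child = root.getFirstChild(); child != null;
                                            child = child.getNextSibling()) {
      System.out.println(child.getNodeName());
    }
  }
}
```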

The starting point for exploring the entire document tree is the root element for the document. We can obtain a reference to an object that encapsulates the root element by calling the getDocumentElement() method for the Document object:

Element root = xmlDoc.getDocumentElement();

This method returns the root element for the document as type Element. You can also get the node corresponding to the DOCTYPE declaration as type DocumentType like this:

DocumentType doctype = xmlDoc.getDoctype();

The next step is to obtain the child nodes for the root element. We can use the getChildNodes() method that is defined in the Node interface for this. This method returns a NodeList reference that encapsulates all the child nodes for that element. You can call this method for any node that has children, including the Document node if you wish. We can therefore obtain the child nodes for the root element with the following statement:

NodeList children = root.getChildNodes();

A NodeList reference encapsulates an ordered collection of Node references, each of which may be one or other of the possible node types for the current node. So with an Element node, any of the Node references in the list that is returned can be of type Element, Text, Comment, CDATASection, EntityReference, or ProcessingInstruction. Note that if there are no child nodes, the getChildNodes() method will return a NodeList reference that is empty, not null. You call the getChildNodes() method to obtain a list of child nodes for any node type that can have them.

The NodeList interface declares just two methods:

getLength()	Returns the number of nodes in the list as type int.
item(int index)	Returns a reference of type Node to the object at position index in the list.

We can use these methods to iterate through the child elements of the root element, perhaps like this:

Node[] nodes = new Node[children.getLength()];
for(int i = 0 ; i<nodes.length ; i++)
  nodes[i] = children.item(i);

Of course, we will normally be interested in the specific types of nodes that are returned so we will want to extract them as specific types or at least determine what they are before processing them. This is not difficult. You can test the type of any node using the instanceof operator. Here's one way we could extract just the child nodes that are of type Element:

java.util.Vector elements = new java.util.Vector();
Node node = null;
for(int i = 0 ; i<children.getLength() ; i++) {
  node = children.item(i);
  if(node instanceof Element)
    elements.add(node);
}
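For readers who want to run this filtering step on its own, here is a self-contained sketch of the same idea. It is not the book's code: the class name ChildElements and the childElementNames() helper are hypothetical, the document comes from a string, and a typed List is used in place of the raw Vector.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildElements {
  // Returns the names of the root element's direct children that are elements.
  static List<String> childElementNames(String xml) {
    Element root;
    try {
      root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
               .parse(new InputSource(new StringReader(xml)))
               .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse document", e);
    }
    List<String> names = new ArrayList<>();
    NodeList children = root.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node node = children.item(i);
      if (node instanceof Element) {       // Skip text, comments and so on
        names.add(node.getNodeName());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(childElementNames(
        "<address><street>South Lasalle Street</street> <city>Chicago</city></address>"));
  }
}
```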

A simple loop like this is not a very practical approach to navigating a document. In general we have no idea of the level to which elements are nested in a document and this loop only examines one level. We need an approach that will allow any level of nesting. This is a job for recursion. Let's put together a working example to illustrate this.

Try It Out – Listing a Document

We can extend the previous example to list the nodes in a document. We will add a static method to the TryDOM class to list child elements recursively. We will output details of each node followed by its children. Here's the code:

import javax.xml.parsers.*;
import org.xml.sax.*;
import org.w3c.dom.*;

import java.io.File;
import java.io.IOException;

public class TryDOM implements ErrorHandler {
  public static void main(String args[]) {
    if(args.length == 0) {
      System.out.println("No file to process. "+
                           "Usage is:\njava TryDOM \"filename\"");
      System.exit(1);
    }
    File xmlFile = new File(args[0]);
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    builderFactory.setNamespaceAware(true);       // Set namespace aware
    builderFactory.setValidating(true);           // and validating parser features

    DocumentBuilder builder = null;
    try {
      builder = builderFactory.newDocumentBuilder();  // Create the parser
      builder.setErrorHandler(new TryDOM()); //Error handler is instance of TryDOM

    } catch(ParserConfigurationException e) {
      e.printStackTrace();
      System.exit(1);
    }
    Document xmlDoc = null;

    try {
      xmlDoc = builder.parse(xmlFile);

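    } catch(SAXException e) {
      e.printStackTrace();
      System.exit(1);                 // No document, so we cannot continue

    } catch(IOException e) {
      e.printStackTrace();
      System.exit(1);                 // No document, so we cannot continue
    }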
    DocumentType doctype = xmlDoc.getDoctype();       // Get the DOCTYPE node
    System.out.println("DOCTYPE node:\n" + doctype);  // and output it
    System.out.println("\nDocument body contents are:");
    listNodes(xmlDoc.getDocumentElement(),"");         // Root element & children
  }
  
  // output a node and all its child nodes
  static void listNodes(Node node, String indent) {
    String nodeName = node.getNodeName();
    System.out.println(indent+nodeName+" Node, type is "
                                             +node.getClass().getName()+":");
    System.out.println(indent+" "+node);

    NodeList list = node.getChildNodes();       // Get the list of child nodes
    if(list.getLength() > 0) {                  // As long as there are some...
      System.out.println(indent+"Child Nodes of "+nodeName+" are:");
      for(int i = 0 ; i<list.getLength() ; i++) //...list them & their children...
        listNodes(list.item(i),indent+"  ");    // by calling listNodes() for each
    }         
  }

  public void fatalError(SAXParseException spe) throws SAXException {
    System.out.println("Fatal error at line "+spe.getLineNumber());
    System.out.println(spe.getMessage());
    throw spe;
  }

  public void warning(SAXParseException spe) {
    System.out.println("Warning at line "+spe.getLineNumber());
    System.out.println(spe.getMessage());
  }

  public void error(SAXParseException spe) {
    System.out.println("Error at line "+spe.getLineNumber());
    System.out.println(spe.getMessage());
  }
}

I have removed the statement outputting details of the parser to reduce the output a little. Run this with the version of Address.xml that includes a DOCTYPE declaration. The program produces quite a lot of output starting with:

DOCTYPE node:
org.apache.crimson.tree.Doctype@decdec

Document body contents are:
address Node, type is org.apache.crimson.tree.ElementNode2:
 <address>
  <buildingnumber> 29 </buildingnumber>
  <street> South Lasalle Street</street>
  <city>Chicago</city>
  <state>Illinois</state>
  <zip>60603</zip>
 </address>
Child Nodes of address are:
 #text Node, type is org.apache.crimson.tree.TextNode:
   
  
 buildingnumber Node, type is org.apache.crimson.tree.ElementNode2:
   <buildingnumber> 29 </buildingnumber>
 Child Nodes of buildingnumber are:
   #text Node, type is org.apache.crimson.tree.TextNode:
     29 
 #text Node, type is org.apache.crimson.tree.TextNode:

and so on down to the last few lines:

zip Node, type is org.apache.crimson.tree.ElementNode2:
   <zip>60603</zip>
Child Nodes of zip are:
   #text Node, type is org.apache.crimson.tree.TextNode:
    60603
 #text Node, type is org.apache.crimson.tree.TextNode:

How It Works

Since we have set the parser configuration in the factory object to include validating the XML, we have to provide an org.xml.sax.ErrorHandler object for the parser. The TryDOM class implements the warning(), error(), and fatalError() methods declared by the ErrorHandler interface so an instance of this class takes care of it.

We call the getDoctype() method for the Document object to obtain the node corresponding to the DOCTYPE declaration:

DocumentType doctype = xmlDoc.getDoctype();       // Get the DOCTYPE node
System.out.println("DOCTYPE node:\n" + doctype);  // and output it

You can see from the output that we only get the class name with a hash code for the object appended. We will see how we can get more detail a little later.

After outputting a header line showing where the document body starts, we output the contents starting with the root element. The listNodes() method does all the work. We pass a reference to the root element that we obtain from the Document object with the statement:

listNodes(xmlDoc.getDocumentElement(),"");         // Root element & children

The first argument to listNodes() is the node to be listed and the second argument is the current indent for output. On each recursive call of the method, we will append a couple of spaces. This will result in each nested level of nodes being indented in the output by two spaces relative to the parent node output.

The first step in the listNodes() method is to get the name of the current node by calling its getNodeName() method:

String nodeName = node.getNodeName();       // Get name of this node

We then output the name of the current node followed by its class name with the statement:

System.out.println(indent + nodeName + " Node, type is " 
                 + node.getClass().getName()+":");

The indent parameter defines the indentation for the current node. Calling getClass() for the node object returns a Class object encapsulating its class type. We then call the getName() method for the Class object to obtain the class type name for the node.

The next statement outputs the node itself:

System.out.println(indent+" "+node);

This will automatically output the string produced by the toString() method for the node. Take a look at the output that corresponds to this as it is quite revealing. The node corresponding to the root element is first, and for this we get the entire document contents generated by the toString() method:

<address>
  <buildingnumber> 29 </buildingnumber>
  <street> South Lasalle Street</street>
  <city>Chicago</city>
  <state>Illinois</state>
  <zip>60603</zip>
</address>

On the basis of this, when you create a new Document object and want to write it as a document to a file, you might be tempted to use the toString() method for the root element to provide all the text for the document body. This would be unwise. It would work for this particular parser but you cannot be sure that another parser will do the same. There is no prescribed string returned by toString() so what you get will depend entirely on the parser and maybe on the particular release of the parser. When you want to write a document to a file, extract the data from the Document object and assemble the text yourself.
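One standard alternative to relying on toString(), which this chapter does not otherwise cover, is the identity transform provided by the javax.xml.transform package. The following is only a sketch: the class name WriteDocument and the roundTrip() helper are hypothetical, and it writes to a string where a real program would probably pass a File or an output stream to StreamResult.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class WriteDocument {
  // Serializes a DOM tree as XML text without depending on toString().
  static String toXml(Document doc) {
    try {
      // A Transformer created without a stylesheet copies input to output
      Transformer identity = TransformerFactory.newInstance().newTransformer();
      identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
      StringWriter out = new StringWriter();
      identity.transform(new DOMSource(doc), new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException("Could not write document", e);
    }
  }

  // Parses xml into a Document and writes it straight back out as text.
  static String roundTrip(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                       .parse(new InputSource(new StringReader(xml)));
      return toXml(doc);
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse document", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("<address><city>Chicago</city></address>"));
  }
}
```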

The remainder of the listNodes() code iterates through the child nodes of the current node if it has any:

NodeList list = node.getChildNodes();       // Get the list of child nodes
if(list.getLength() > 0) {                  // As long as there are some...
  System.out.println(indent+"Child Nodes of "+nodeName+" are:");
  for(int i = 0 ; i<list.getLength() ; i++) //...list them & their children...
    listNodes(list.item(i),indent+"  ");    // by calling listNodes() for each
}

The for loop simply iterates through the list of child nodes obtained by calling the getChildNodes() method. Each child is passed as an argument to the listNodes() method, which will list the node and iterate through its children. In this way the method will work through all the nodes in the document. You can see that we append an extra couple of spaces to indent in the second argument to the listNodes() call for a child node. The indent parameter in the next level down will reference a string that is two spaces longer. This ensures that the output for the next level of nodes will be indented relative to the current node.

You can see from the output that the string produced by the toString() method for each node by the Crimson parser encompasses its child nodes too. Of course, we get all of them explicitly as nodes in their own right, so there is a lot of duplication in the output. You may have noticed that the output is strange in some ways. We seem to have picked up some extra #text nodes that contain just whitespace. Each block of text or whitespace is returned as a node with the name #text, and that includes ignorable whitespace here. The newline characters at the end of each line in the original document, for instance, will contribute text nodes that are ignorable whitespace.

If you don't want to see it, getting rid of the ignorable whitespace is very simple. We just need to set another parser feature in the factory object:

builderFactory.setNamespaceAware(true);        // Set namespace aware
builderFactory.setValidating(true);            // and validating parser features
builderFactory.setIgnoringElementContentWhitespace(true);

Calling this method will result in a parser that will not report ignorable whitespace as a node, so you won't see it in the Document object. If you run the example again with this change, the #text nodes arising from ignorable whitespace will no longer be there. Of course, this is not necessarily a plus since now the output produced by the toString() method is not as readable as it was before because everything appears on a single line.
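The effect can be checked by counting the root element's child nodes with the feature off and then on. The sketch below assumes a small sample document with an internal DTD, and the class name WhitespaceDemo is made up. Note that the parser is validating in both cases: the parser can only recognize whitespace as ignorable when it knows from the DTD that an element has element-only content, and the DocumentBuilderFactory documentation states that this setting requires validating mode.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class WhitespaceDemo {
  // Sample document: address has element-only content, so the line breaks
  // and indentation between its children are ignorable whitespace.
  static final String XML =
      "<!DOCTYPE address [\n"
    + "<!ELEMENT address (street, city)>\n"
    + "<!ELEMENT street (#PCDATA)>\n"
    + "<!ELEMENT city (#PCDATA)>\n"
    + "]>\n"
    + "<address>\n"
    + "  <street>South Lasalle Street</street>\n"
    + "  <city>Chicago</city>\n"
    + "</address>\n";

  // Returns the number of child nodes of the root element.
  static int rootChildCount(boolean ignoreWhitespace) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setValidating(true);         // Needed for the feature to work
      factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
      Document doc = factory.newDocumentBuilder()
                            .parse(new InputSource(new StringReader(XML)));
      return doc.getDocumentElement().getChildNodes().getLength();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse sample document", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("Keeping whitespace:  " + rootChildCount(false));
    System.out.println("Ignoring whitespace: " + rootChildCount(true));
  }
}
```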

Node Types

We saw earlier that the subinterfaces of Node identify nodes of different types. Type Element corresponds to an element and type Text identifies element content that is text. We also saw how we could determine the type of a given Node reference using the instanceof operator. There's another way of figuring out what a Node reference is that is often more convenient. The getNodeType() method in the Node interface returns a value of type int that identifies the type of node. It can be any of the following constant values that are defined in the Node interface:

DOCUMENT_NODE	DOCUMENT_TYPE_NODE
ELEMENT_NODE	ATTRIBUTE_NODE
TEXT_NODE	CDATA_SECTION_NODE
DOCUMENT_FRAGMENT_NODE	COMMENT_NODE
ENTITY_NODE	ENTITY_REFERENCE_NODE
NOTATION_NODE	PROCESSING_INSTRUCTION_NODE

In the main it is obvious what type each of these represents. The DOCUMENT_FRAGMENT_NODE represents a collection of elements that form part of a document. The advantage of having the type of node as an integer is that we can sort out what type a given Node reference is by using a switch statement:

switch(node.getNodeType()) {
  case Node.DOCUMENT_NODE:
    // Code to process a document node
    break;
  case Node.DOCUMENT_TYPE_NODE:
    // Code to process a DOCTYPE node
    break;
  case Node.ELEMENT_NODE:
    // Code to process an element node
    break;
  // ... and so on for the rest of the type values...
  default:
    assert false;
}

We can include code to process any given node type following the corresponding case label.
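As a runnable illustration of the switch approach, the sketch below turns each child of the root element into a short description. The class name NodeTypeNames and the wording of the descriptions are assumptions made for this example, and only a few of the node types are handled explicitly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeTypeNames {
  // Describes a node according to the value returned by getNodeType().
  static String describe(Node node) {
    switch (node.getNodeType()) {
      case Node.ELEMENT_NODE:
        return "element " + node.getNodeName();
      case Node.TEXT_NODE:
        return "text";
      case Node.CDATA_SECTION_NODE:
        return "CDATA section";
      case Node.COMMENT_NODE:
        return "comment";
      case Node.PROCESSING_INSTRUCTION_NODE:
        return "processing instruction";
      default:
        return "other node, type value " + node.getNodeType();
    }
  }

  // Returns a description of each child node of the root element of xml.
  static List<String> describeChildren(String xml) {
    Element root;
    try {
      root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
               .parse(new InputSource(new StringReader(xml)))
               .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse document", e);
    }
    List<String> result = new ArrayList<>();
    for (int i = 0; i < root.getChildNodes().getLength(); i++) {
      result.add(describe(root.getChildNodes().item(i)));
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(describeChildren("<a><b/>some text<!-- a comment --></a>"));
  }
}
```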

There is also an alternative to using getChildNodes() for working through the children of a node. Calling the getFirstChild() method for a Node object returns a reference to its first child node, and the getNextSibling() method returns the next sibling node – the next node on the same level in other words. Both of these methods return null if the child requested does not exist. You can use these in combination to iterate through all the child nodes of a given node. We can illustrate how this works by writing a new version of our listNodes() method:

  static void listNodes(Node node, String indent) {
    String nodeName = node.getNodeName();
    System.out.println(indent+nodeName+" Node, type is "
                                                 +node.getClass().getName()+":");
    System.out.println(indent+" "+node);
    Node childNode = node.getFirstChild();           // Get first child
    while(childNode != null) {                       // While we have a child...
        listNodes(childNode, indent+"  ");              // ...list it, then...
        childNode = childNode.getNextSibling();         // ...get next child
    }
  }

As long as childNode is not null the while loop will continue to execute. Within the loop we call listNodes() for the current child then store a reference to the next sibling node in childNode. Eventually getNextSibling() will return null when there are no more child nodes and the loop will end. You can plug this code back into the example if you want to see it in action.
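The same sibling-walking pattern works for tasks other than listing. As a small sketch, the hypothetical SiblingWalk class below collects the names of all the elements in a document in document order; the document is again read from a string so that the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SiblingWalk {
  // Adds the name of node, if it is an element, then visits its children.
  static void collect(Node node, List<String> names) {
    if (node.getNodeType() == Node.ELEMENT_NODE) {
      names.add(node.getNodeName());
    }
    for (Node child = node.getFirstChild();       // Start with the first child
         child != null;                           // Stop when there are no more
         child = child.getNextSibling()) {        // Move along the same level
      collect(child, names);
    }
  }

  // Returns the names of all elements in xml, parents before children.
  static List<String> elementNames(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                       .parse(new InputSource(new StringReader(xml)));
      List<String> names = new ArrayList<>();
      collect(doc, names);
      return names;
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse document", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(elementNames(
        "<address><street>South Lasalle Street</street><city>Chicago</city></address>"));
  }
}
```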

Accessing Attributes

You will usually want to access the attributes for an element, but only if it has some. You can test whether an element has attributes by calling its hasAttributes() method. This will return true if the element has attributes and false otherwise, so you might use it like this:

if(node instanceof Element && node.hasAttributes()) {
    // Process the element with its attributes

} else {
    // Process the element without attributes
}

The getAttributes() method for an element returns a NamedNodeMap reference that contains the attributes, the NamedNodeMap interface being defined in the org.w3c.dom package. In general, a NamedNodeMap object is a collection of Node references that can be accessed by name, or serially by iterating through the collection. Since the nodes are attributes in this instance, the nodes will actually be of type Attr.

The NamedNodeMap interface declares the following methods for retrieving nodes from the collection:

item(int index)	Returns the Node reference at index position index.
getLength()	Returns the number of Node references in the collection as type int.
getNamedItem(String name)	Returns the Node reference with the node name name.
getNamedItemNS(String uri, String localName)	Returns the Node reference with the name localName in the namespace at uri.

Obviously the last two methods apply when you know what attributes to expect. We can apply the first two methods to iterate through the collection of attributes in a NamedNodeMap:

if(node instanceof Element && node.hasAttributes()) {
  NamedNodeMap attrs = node.getAttributes();
  for(int i = 0 ; i<attrs.getLength() ; i++) {
    Attr attribute = (Attr)attrs.item(i);
    // Process the attribute...
  }

} else {
  // Process the element without attributes
}

We now are in a position to obtain each of the attributes for an element as a reference of type Attr. To get at the attribute name and value we call the getName() and getValue() methods declared in the Attr interface respectively, both of which return a value of type String.
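When you do know which attribute you want, getNamedItem() avoids the loop. Here is a minimal sketch; the class name AttributeLookup, the rootAttribute() helper, and the circle element with its radius attribute are all chosen for illustration. Since getNamedItem() returns null when there is no attribute with the given name, the result must be checked before it is used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AttributeLookup {
  // Returns the value of the named attribute of the root element of xml,
  // or null if the root element has no such attribute.
  static String rootAttribute(String xml, String name) {
    Element root;
    try {
      root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
               .parse(new InputSource(new StringReader(xml)))
               .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse document", e);
    }
    NamedNodeMap attrs = root.getAttributes();
    Node attribute = attrs.getNamedItem(name);    // null if not present
    return attribute == null ? null : attribute.getNodeValue();
  }

  public static void main(String[] args) {
    String xml = "<circle radius=\"15\"/>";
    System.out.println("radius = " + rootAttribute(xml, "radius"));
    System.out.println("color  = " + rootAttribute(xml, "color"));
  }
}
```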

Try It Out – Listing Elements with Attributes

We can modify the listNodes() method in the previous example to include attributes with the elements. Here's the revised version:

static void listNodes(Node node, String indent) {
    String nodeName = node.getNodeName();
    System.out.println(indent+nodeName+" Node, type is "
                                                  +node.getClass().getName()+":");
    System.out.println(indent+" "+node);

    if(node instanceof Element && node.hasAttributes()) {
      System.out.println(indent+"Element Attributes are:");
      NamedNodeMap attrs = node.getAttributes();//...get the attributes
      for(int i = 0 ; i<attrs.getLength() ; i++) {
        Attr attribute = (Attr)attrs.item(i);      // Get an attribute
        System.out.println(indent+attribute.getName()+"="+attribute.getValue());
      }
    }
  
    NodeList list = node.getChildNodes();      // Get the list of child nodes
    if(list.getLength() > 0) {                 // As long as there are some...
      System.out.println(indent+"Child Nodes of "+nodeName+" are:");
      for(int i = 0 ; i<list.getLength() ; i++)// ...list them & their children...
        listNodes(list.item(i),indent+"  ");   // by calling listNodes()  
    }         
}

Don't forget to update the import statements in the example. The complete set will now be:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.ErrorHandler;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import org.w3c.dom.Attr;
import org.w3c.dom.NamedNodeMap;
import java.io.File;
import java.io.IOException;

You can recompile the code with these changes and run the example with the circle with DTD.xml file that we created back when we were discussing DTDs. You might want to comment out the call to setIgnoringElementContentWhitespace() to get the ignorable whitespace back in the output.

How It Works

All the new code to handle attributes is in the listNodes() method. After verifying that the current node is an Element node and does have attributes, we get the collection of attributes as a NamedNodeMap object. We then iterate through the collection extracting each node in turn. Nodes are indexed from zero and we obtain the number of nodes in the collection by calling its getLength() method. Since an attribute node is returned by the item() method as type Node, we have to cast the return value to type Attr to call the methods in this interface. We output the attribute and its value, making use of the getName() and getValue() methods for the Attr object in the process of assembling the output string.

The Attr interface also declares a getSpecified() method that returns true if the attribute value was explicitly set in the document rather than being a default value from the DTD. It also declares a getOwnerElement() method that returns an Element reference to the element to which this attribute applies.
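As a minimal sketch of these two methods, the following parses a small document held in a string and reports on each attribute. The XML text, the units attribute with its DTD default, and the class name AttrInfoDemo are all hypothetical and chosen just for illustration. The sketch also assumes the parser applies attribute defaults declared in an internal DTD subset, which is what the parser bundled with the JDK does.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

class AttrInfoDemo {
  // Hypothetical document: the units attribute has a default value in the DTD
  static final String XML =
      "<?xml version=\"1.0\"?>\n"
    + "<!DOCTYPE circle [\n"
    + "  <!ELEMENT circle EMPTY>\n"
    + "  <!ATTLIST circle radius CDATA #REQUIRED\n"
    + "                   units  CDATA \"pixels\">\n"
    + "]>\n"
    + "<circle radius=\"15\"/>";

  // Parse the XML text and return the root element
  static Element parseRoot(String xml) {
    try {
      DocumentBuilder builder =
                     DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement();
    } catch(Exception e) {
      throw new IllegalStateException(e);     // Parsing failed
    }
  }

  // Describe one attribute: where its value came from and who owns it
  static String describe(Attr attr) {
    return attr.getName() + "=" + attr.getValue()
         + (attr.getSpecified() ? " (set in the document)"
                                : " (default from the DTD)")
         + ", owner is " + attr.getOwnerElement().getTagName();
  }

  public static void main(String[] args) {
    NamedNodeMap attributes = parseRoot(XML).getAttributes();
    for(int i = 0 ; i<attributes.getLength() ; i++)
      System.out.println(describe((Attr)attributes.item(i)));
  }
}
```

Only radius appears in the document body here, so it is the only attribute for which getSpecified() should return true.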

You can see from the output that the toString() method for the root element node results in a string containing the entire document body, including the attributes.

Accessing the DOCTYPE Declaration

A node of type DocumentType encapsulates a DOCTYPE declaration and we have already obtained this in the example by calling the getDoctype() method for the Document object. The output we have obtained up to now is a little spartan so now we will remedy that. A DOCTYPE declaration is a little complicated as it may have an internal set of definitions, an external subset specified by a system ID, an external subset specified by public ID, or all three. We need to use the methods declared in the DocumentType interface to figure out what we have in any particular instance.

getName()	Returns the name of the DTD as a String object. This must be the same as the root node name if the document is to be valid.
getSystemId()	Returns a reference to a String object that contains the URI that is the system ID for the external subset, or null if there is none.
getPublicId()	Returns a reference to a String object that contains the public ID for the external subset, or null if there is none.
getInternalSubset()	Returns a reference to a String object that contains the declarations in the internal subset. If there is no internal subset, the method returns null.
getEntities()	Returns a reference to a NamedNodeMap object that contains the general entities declared in the DTD. This will include general entities declared in both the internal and external subset but it does not include parameter entities. A general entity is an entity that is used within the document body. A parameter entity can only be used within the DTD.
getNotations()	Returns a reference to a NamedNodeMap object that contains the notations declared in the DTD. Notations are format declarations for external content such as images.
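To make the last two methods a little more concrete, here is a minimal sketch that lists the names of the general entities and notations in a DOCTYPE declaration. The document text, the author entity, the gif notation, and the class name DoctypeContentsDemo are hypothetical. The sketch assumes a parser, such as the one bundled with the JDK, that records the declarations from the internal subset in the DocumentType node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

class DoctypeContentsDemo {
  // Hypothetical document with one notation, one general entity,
  // and one parameter entity declared in the internal subset
  static final String XML =
      "<?xml version=\"1.0\"?>\n"
    + "<!DOCTYPE sketch [\n"
    + "  <!ELEMENT sketch EMPTY>\n"
    + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
    + "  <!ENTITY author \"A. N. Other\">\n"
    + "  <!ENTITY % shapes \"circle | line\">\n"
    + "]>\n"
    + "<sketch/>";

  // Parse the XML text and return the DocumentType node
  static DocumentType parseDoctype(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
                                           .newDocumentBuilder()
                         .parse(new InputSource(new StringReader(xml)));
      return doc.getDoctype();
    } catch(Exception e) {
      throw new IllegalStateException(e);     // Parsing failed
    }
  }

  // Collect the node names from a NamedNodeMap
  static List<String> names(NamedNodeMap map) {
    List<String> result = new ArrayList<String>();
    for(int i = 0 ; i<map.getLength() ; i++)
      result.add(map.item(i).getNodeName());
    return result;
  }

  public static void main(String[] args) {
    DocumentType doctype = parseDoctype(XML);
    System.out.println("DTD name:  " + doctype.getName());
    System.out.println("Entities:  " + names(doctype.getEntities()));
    System.out.println("Notations: " + names(doctype.getNotations()));
  }
}
```

The parameter entity shapes is declared only so you can check that it does not turn up in the collection returned by getEntities().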

All these are quite straightforward. Let's see how we can use some of these to include DOCTYPE declarations in the output from the previous example.

Try It Out – Accessing Document Type Nodes

We identify the DocumentType node in our main() method but to avoid bulking main() up any further let's package the processing of a DocumentType node in another method, getDoctypeString(). This method will accept a DocumentType reference as an argument and return a complete DOCTYPE declaration as a String object. Here's the code for that:

  private static String getDoctypeString(DocumentType doctype) {
    // Create the opening string for the DOCTYPE declaration with its name
    String str = doctype.getName();
    StringBuffer doctypeStr = new StringBuffer("<!DOCTYPE ").append(str);        

    // Check for a system ID or a public ID
    final char QUOTE = '\"'; 
    if((str = doctype.getSystemId()) != null) 
      doctypeStr.append(" SYSTEM ").append(QUOTE).append(str).append(QUOTE);
    else if((str = doctype.getPublicId()) != null) // Check for a public ID
      doctypeStr.append(" PUBLIC ").append(QUOTE).append(str).append(QUOTE);

    // Check for an internal subset
    if((str = doctype.getInternalSubset()) != null)
      doctypeStr.append('[').append(str).append(']');           

    return doctypeStr.append('>').toString();  // Append '>', return the string
  }

Now we can amend main() to call this method:

DocumentType doctype = xmlDoc.getDoctype();                    // Get DOCTYPE node
System.out.println("DOCTYPE node:\n" + getDoctypeString(doctype));  // & output it

Here we are replacing the output statement we had previously with this one that calls our new method.

You can try this out with the circle with DTD.xml file or maybe some other XML files with DTD declarations. At the time of writing, the version of Crimson that is distributed with SDK 1.4 does not return the internal subset when the getInternalSubset() method is called. If you want to see this output, try installing the Xerces parser. With Xerces 1.4.2 I got the output:

DOCTYPE node:
<!DOCTYPE circle[
<!ELEMENT circle (position)>
<!ELEMENT position EMPTY>
<!ATTLIST circle 
 radius CDATA #REQUIRED
 >
<!ATTLIST position 
 x CDATA #REQUIRED
  y CDATA #REQUIRED
  >
]>

Document body contents are:
circle Node, type is org.apache.xerces.dom.DeferredElementImpl:
 [circle: null]
Element Attributes are:
radius=15
Child Nodes of circle are:
#text Node, type is org.apache.xerces.dom.DeferredTextImpl:
[#text: 
]
position Node, type is org.apache.xerces.dom.DeferredElementImpl:
[position: null]
Element Attributes are:
x=30
y=50
#text Node, type is org.apache.xerces.dom.DeferredTextImpl:
[#text: 
]

You can see that the output from the toString() method for a node is rather different here from that produced by the Crimson parser.

How It Works

All the work is done in the getDoctypeString() method. It starts out by forming a basic string in a StringBuffer object:

String str = doctype.getName();
StringBuffer doctypeStr = new StringBuffer("<!DOCTYPE ").append(str);

This will produce a string of the form "<!DOCTYPE rootname". We can now append any further bits of the declaration to this string and close it off with a '>' character at the end.

We have defined the char constant, QUOTE, to make the code a little easier to read. We use this when we check for a system ID or a public ID:

if((str = doctype.getSystemId()) != null) 
  doctypeStr.append(" SYSTEM ").append(QUOTE).append(str).append(QUOTE);
else if((str = doctype.getPublicId()) != null) // Check for a public ID
  doctypeStr.append(" PUBLIC ").append(QUOTE).append(str).append(QUOTE);

This reuses the str variable to store the reference returned by the getSystemId() method. If this is not null, we append the SYSTEM keyword followed by the system ID itself, inserting the necessary spaces and double quotes at the appropriate points. Otherwise we look for a public ID and if it exists we append that to the string in a similar fashion.

Next we check for an internal subset of definitions:

if((str = doctype.getInternalSubset()) != null)
  doctypeStr.append('[').append(str).append(']');

If there is an internal subset string we append that too, topped and tailed with square brackets. The final step is to append a closing '>' character and create a String object from the StringBuffer object before returning it.

return doctypeStr.append('>').toString();  // Append '>' and return the string

I'll bet that was a whole lot easier than you expected. We will now put DOM into reverse and look into how we can synthesize XML documents.

Creating XML Documents

You can create an XML document in a file programmatically by a two-step process. You can first create a Document object that encapsulates what you want in your XML document. Then you can use the Document object to create the hierarchy of elements that has to be written to the file. We will first look at how we create a suitable Document object.

The simplest way to create a Document object programmatically is to call the newDocument() method for a DocumentBuilder object and it will return a reference to a new empty Document object:

Document newDoc = builder.newDocument();

This is rather limited, especially since there's no way with DOM2 to modify the DocumentType node to reflect a suitable DOCTYPE declaration.

There's an alternative approach that provides a bit more flexibility but it is not quite so direct. You first call the getDOMImplementation() method for the DocumentBuilder object:

DOMImplementation domImpl = builder.getDOMImplementation();

This returns a reference of type DOMImplementation to an object that encapsulates the underlying DOM implementation. This interface type is defined in the org.w3c.dom package.

There are three methods you can call for a DOMImplementation object:

createDocument(String namespaceURI, String qualifiedName, DocumentType doctype)

Creates a Document object with the root element having the name qualifiedName that is defined in the namespace specified by namespaceURI. The third argument specifies the DOCTYPE node to be added to the document. If you don't want to declare a DOCTYPE, then doctype can be specified as null.

This method will throw an exception of type DOMException if the second argument is incorrect in some way.

createDocumentType(String qualifiedName, String publicID, String systemID)

Creates a node of type DocumentType that represents a DOCTYPE declaration. The first argument is the qualified name of the root element, the second argument is the public ID of the external subset of the DTD, and the third argument is its system ID. This method will also throw an exception of type DOMException if the first argument contains an illegal character or is not of the correct form.

hasFeature(String feature, String version)

Returns true if the DOM implementation has the feature with the name feature. The second argument specifies the DOM version number of the feature and can be either "1.0" or "2.0" with DOM level 2.
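As a minimal sketch of how you might call hasFeature(), the following asks the default DOM implementation about a few features. "Core" and "XML" are feature names defined by the DOM specification, while "NoSuchFeature" is a made-up name that no implementation should recognize. The class name FeatureCheckDemo is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

class FeatureCheckDemo {
  // Obtain the DOM implementation behind the default document builder
  static DOMImplementation getImplementation() {
    try {
      return DocumentBuilderFactory.newInstance()
                                   .newDocumentBuilder()
                                   .getDOMImplementation();
    } catch(ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    DOMImplementation domImpl = getImplementation();
    String[] features = { "Core", "XML", "NoSuchFeature" };
    for(int i = 0 ; i<features.length ; i++)
      System.out.println(features[i] + " 2.0 supported: "
                                     + domImpl.hasFeature(features[i], "2.0"));
  }
}
```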

You can see from the first two methods here that there is a big advantage to using a DOMImplementation object to create a document. First of all, you can create a DocumentType object by calling the createDocumentType() method:

DocumentType doctype = null;
try {     
  doctype = domImpl.createDocumentType("sketch", null, "sketcher.dtd");

} catch(DOMException e) {
  // Handle the exception
}

This code fragment creates a DocumentType node for an external DOCTYPE declaration with the name sketch, with the system ID sketcher.dtd. There is no public ID in this case since we specified the second argument as null. You can now use the DocumentType object in the creation of a document:

Document newDoc = null;
try {     
  doctype = domImpl.createDocumentType("sketch", null, "sketcher.dtd");
  newDoc = domImpl.createDocument(null, "sketch", doctype);
}
catch(DOMException e) {
  // Handle the exception
}
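If you want to try the two fragments above as a complete program, here is one way they might be put together. This is just a sketch: the class name NewDocumentDemo and the helper method createSketchDocument() are hypothetical, and the ParserConfigurationException is simply rethrown as an unchecked exception.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

class NewDocumentDemo {
  // Create a document with a <sketch> root and an external DOCTYPE
  static Document createSketchDocument() {
    try {
      DOMImplementation domImpl = DocumentBuilderFactory.newInstance()
                                                        .newDocumentBuilder()
                                                        .getDOMImplementation();
      DocumentType doctype =
                  domImpl.createDocumentType("sketch", null, "sketcher.dtd");
      return domImpl.createDocument(null, "sketch", doctype);
    } catch(ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Document newDoc = createSketchDocument();
    DocumentType doctype = newDoc.getDoctype();
    System.out.println("Root element: " + newDoc.getDocumentElement().getTagName());
    System.out.println("DOCTYPE name: " + doctype.getName());
    System.out.println("System ID:    " + doctype.getSystemId());
    System.out.println("Public ID:    " + doctype.getPublicId());
  }
}
```

The new document already contains its root element, so you can obtain it with getDocumentElement() straight away.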

The DOMException object that may be thrown by either of these two methods has a public field with the name code that is of type short. This stores an error code identifying the type of error that caused the exception so you can check the value of code to determine the cause of the error. This exception can be thrown by a number of different methods that you use to create nodes in a document, so the values that code can have are not limited to those arising from the two methods we have just used. There are fifteen possible values for code that are defined in the DOMException class but obviously you would only check for those that apply to the code in the try block where the exception may arise.

The possible values for code in a DOMException object thrown by the createDocument() method are:

INVALID_CHARACTER_ERR	The second argument specifying the root element name contains an invalid character.
NAMESPACE_ERR	The qualified name of the root element is malformed in some way, or the first argument specifying the namespace URI is null but the root element name has a prefix, or the root element name has the prefix "xml" but the namespace URI is not http://www.w3.org/XML/1998/namespace.
WRONG_DOCUMENT_ERR	The DocumentType node specified has already been used with a different document or was created by a different DOM implementation.

The createDocumentType() method can also throw an exception of type DOMException with code set to either of the first two values above.

We therefore might code the catch block in the previous fragment like this:

catch(DOMException e) {
  switch(e.code) {
    case DOMException.INVALID_CHARACTER_ERR:
      System.err.println("Qualified name contains an invalid character");
      break;
    case DOMException.NAMESPACE_ERR:
      System.err.println("Qualified name is malformed or invalid");
      break;
    case DOMException.WRONG_DOCUMENT_ERR:
      System.err.println("Document does not support this doctype");
      break;
    default:
      assert false;
      System.err.println(e.getMessage());
  }
}

Of course, you can also output the stack trace, return from the method, or even end the program here if you want.
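To see these error codes in action, you could deliberately pass some bad names to createDocument(). The following is a minimal sketch that does this. The class name DomErrorCodeDemo, the method codeFor(), and the sample names are hypothetical, and the sketch assumes the implementation has its error checking enabled, which is the default for the parser bundled with the JDK.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.DOMImplementation;

class DomErrorCodeDemo {
  // Obtain the DOM implementation behind the default document builder
  static DOMImplementation getImplementation() {
    try {
      return DocumentBuilderFactory.newInstance()
                                   .newDocumentBuilder()
                                   .getDOMImplementation();
    } catch(ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Try to create a document and return the error code, or 0 if it succeeds
  static short codeFor(String namespaceURI, String qualifiedName) {
    try {
      getImplementation().createDocument(namespaceURI, qualifiedName, null);
      return 0;
    } catch(DOMException e) {
      return e.code;
    }
  }

  public static void main(String[] args) {
    System.out.println("sketch:    " + codeFor(null, "sketch"));
    System.out.println("bad name:  " + codeFor(null, "bad name"));  // Has a space
    System.out.println("sk:sketch: " + codeFor(null, "sk:sketch")); // Prefix, no URI
  }
}
```

The value returned for "bad name" should correspond to INVALID_CHARACTER_ERR and the value for "sk:sketch" to NAMESPACE_ERR.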

Adding to a Document

The Document interface declares methods for adding nodes to a Document object. You can create nodes encapsulating elements, attributes, text, entity references, comments, CDATA sections, and processing instructions so you can assemble a Document object representing a complete XML document. The methods declared by the Document interface are:

createElement(String name)	Returns a reference to an Element object encapsulating an element with name as the tag name. The method will throw an exception of type DOMException with INVALID_CHARACTER_ERR set if name contains an invalid character.
createElementNS(String nsURI, String qualifiedName)	Returns a reference to an Element object encapsulating an element with qualifiedName as the tag name in the namespace nsURI. The method will throw an exception of type DOMException with INVALID_CHARACTER_ERR set if qualifiedName contains an invalid character or NAMESPACE_ERR if it has a prefix "xml" and nsURI is not http://www.w3.org/XML/1998/namespace.
createAttribute(String name)	Returns a reference to an Attr object encapsulating an attribute with name as the attribute name and its value as "". The method will throw an exception of type DOMException with INVALID_CHARACTER_ERR set if name contains an invalid character.
createAttributeNS(String nsURI, String qualifiedName)	Returns a reference to an Attr object encapsulating an attribute with qualifiedName as the attribute name in the namespace nsURI and its value as "". The method will throw an exception of type DOMException with INVALID_CHARACTER_ERR set if the name contains an invalid character or NAMESPACE_ERR if the name conflicts with the namespace.
createTextNode(String text)	Returns a reference to a Text node containing the string text.
createComment(String comment)	Returns a reference to a Comment node containing the string comment.
createCDATASection(String data)	Returns a reference to a CDATASection node with the value data. Throws a DOMException with the code NOT_SUPPORTED_ERR if the Document object encapsulates an HTML document.
createEntityReference(String name)	Returns a reference to an EntityReference node with the name specified. Throws a DOMException with the code INVALID_CHARACTER_ERR if name contains invalid characters and NOT_SUPPORTED_ERR if the Document object is an HTML document.
createProcessingInstruction(String target, String data)	Returns a reference to a ProcessingInstruction node with the specified target and data. Throws a DOMException with the code INVALID_CHARACTER_ERR if target contains illegal characters and NOT_SUPPORTED_ERR if the Document object is an HTML document.
createDocumentFragment()	Creates an empty DocumentFragment object. You can insert a DocumentFragment object into a Document object using methods that the Document interface (and the DocumentFragment interface) inherits from the Node interface. You can use the same methods to insert nodes into a DocumentFragment object.

The references to HTML in the table above arise because a Document object can be used to encapsulate an HTML document. Our interest is purely XML so we won't be discussing this aspect further.
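Here is a minimal sketch that uses several of these methods to create nodes of different kinds in a new document. The element names, the text, and the class name NodeFactoryDemo are hypothetical. The nodes are attached with the appendChild() method that every node inherits from the Node interface, since a node created this way is not part of the document tree until you insert it.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NodeFactoryDemo {
  // Build a small document containing four different kinds of node
  static Document build() {
    Document doc = null;
    try {
      doc = DocumentBuilderFactory.newInstance()
                                  .newDocumentBuilder()
                                  .newDocument();
    } catch(ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
    // A comment ahead of the root element, then the root itself
    doc.appendChild(doc.createComment(" A hypothetical sketch "));
    Element root = doc.createElement("sketch");
    doc.appendChild(root);

    // A <note> element with text content
    Element note = doc.createElement("note");
    note.appendChild(doc.createTextNode("Drawn by hand"));
    root.appendChild(note);

    // A CDATA section can hold characters such as '<' unescaped
    root.appendChild(doc.createCDATASection("x < y"));
    return doc;
  }

  public static void main(String[] args) {
    Element root = build().getDocumentElement();
    System.out.println("Root is " + root.getTagName() + " with "
                     + root.getChildNodes().getLength() + " child nodes");
  }
}
```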

Of course, having a collection of nodes within a document does not define any structure. In order to establish the structure of a document you have to associate each attribute node that you have created with the appropriate element, and you must also make sure that each element other than the root is a child of some element. Along with all the other types of node, the Element interface inherits two methods from the Node interface that enable you to make one node a child of another:

appendChild(Node child)	Appends the node child to the end of the list of existing child nodes. This method throws a DOMException with the code HIERARCHY_REQUEST_ERR if the current node does not allow children, the code WRONG_DOCUMENT_ERR if child belongs to another document, or the code NO_MODIFICATION_ALLOWED_ERR if the current node is read-only.
insertBefore(Node child, Node existing)	Inserts child as a child node immediately before existing in the current list of child nodes. This method can throw DOMException with the same error codes as above, plus NOT_FOUND_ERR if existing is not a child of the current node.
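The following minimal sketch shows how these two methods determine the order of child nodes, and what happens if you try to append a node that was created by a different Document object. The element names and the class name ChildOrderDemo are hypothetical, and the sketch assumes the implementation has its error checking enabled, which is the default for the parser bundled with the JDK.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ChildOrderDemo {
  static Document newDocument() {
    try {
      return DocumentBuilderFactory.newInstance()
                                   .newDocumentBuilder()
                                   .newDocument();
    } catch(ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Create a <sketch> root with three children
  static Element buildRoot(Document doc) {
    Element root = doc.createElement("sketch");
    doc.appendChild(root);
    Element circle = doc.createElement("circle");
    root.appendChild(doc.createElement("line"));              // First child
    root.appendChild(circle);                                 // Second child
    root.insertBefore(doc.createElement("rectangle"), circle);// Goes between
    return root;
  }

  // List the names of the child nodes, separated by commas
  static String childNames(Node parent) {
    StringBuffer names = new StringBuffer();
    NodeList children = parent.getChildNodes();
    for(int i = 0 ; i<children.getLength() ; i++) {
      if(i > 0)
        names.append(',');
      names.append(children.item(i).getNodeName());
    }
    return names.toString();
  }

  // Try to append a node owned by another document; return the error code
  static short appendForeignNode(Element root) {
    try {
      root.appendChild(newDocument().createElement("curve"));
      return 0;
    } catch(DOMException e) {
      return e.code;
    }
  }

  public static void main(String[] args) {
    Element root = buildRoot(newDocument());
    System.out.println("Children: " + childNames(root));
    System.out.println("Wrong document: "
        + (appendForeignNode(root) == DOMException.WRONG_DOCUMENT_ERR));
  }
}
```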

The Element interface also declares four methods for adding attributes:

setAttributeNode(Attr attr)	Adds the node attr to the element. If an attribute node with the same name already exists, it will be replaced by attr. The method returns either a reference to an existing Attr node that has been replaced or null. The method can throw a DOMException with the following codes: WRONG_DOCUMENT_ERR if attr belongs to another document, NO_MODIFICATION_ALLOWED_ERR if the element is read-only, INUSE_ATTRIBUTE_ERR if attr already belongs to another element.
setAttributeNodeNS(Attr attr)	As above but for an attribute defined within a namespace, so an existing attribute is replaced only if it has the same local name and namespace URI as attr.
setAttribute(String name, String value)	Adds a new attribute node with the specified name and value. If the attribute has already been added, its value is changed to value. The method can throw DOMException with the codes: INVALID_CHARACTER_ERR if name contains an illegal character, NO_MODIFICATION_ALLOWED_ERR if the element is read-only.
setAttributeNS(String nsURI, String qualifiedName, String value)	As above but with the attribute within the namespace nsURI. In addition this method can throw a DOMException with the code NAMESPACE_ERR if qualifiedName is invalid or not within the namespace.
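As a minimal sketch of the difference between setAttribute() and setAttributeNode(), the following replaces an existing attribute and then tries to attach the same Attr node to a second element. The element and attribute names, the class name AttributeDemo, and its methods are hypothetical, and the sketch assumes the implementation has its error checking enabled.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class AttributeDemo {
  static Document newDocument() {
    try {
      return DocumentBuilderFactory.newInstance()
                                   .newDocumentBuilder()
                                   .newDocument();
    } catch(ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Replace the radius attribute; returns the node that was replaced, or null
  static Attr swapRadius(Element circle, String newValue) {
    Attr attr = circle.getOwnerDocument().createAttribute("radius");
    attr.setValue(newValue);
    return circle.setAttributeNode(attr);
  }

  // Try to attach attr to a new element; returns the error code, or 0
  static short reuseCode(Attr attr) {
    Element other = attr.getOwnerDocument().createElement("other");
    try {
      other.setAttributeNode(attr);
      return 0;
    } catch(DOMException e) {
      return e.code;
    }
  }

  public static void main(String[] args) {
    Element circle = newDocument().createElement("circle");
    circle.setAttribute("radius", "15");          // Name and value in one step

    Attr old = swapRadius(circle, "20");
    System.out.println("Replaced value: " + old.getValue());
    System.out.println("Current value:  " + circle.getAttribute("radius"));

    // The radius node already belongs to circle so it cannot be reused
    short code = reuseCode(circle.getAttributeNode("radius"));
    System.out.println("In use elsewhere: "
                                + (code == DOMException.INUSE_ATTRIBUTE_ERR));
  }
}
```

When you just have a name and a value, setAttribute() does in one call what takes three statements with createAttribute(), setValue(), and setAttributeNode().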

We now know enough about constructing a Document object to put together an object encapsulating a real XML document, so let's have a stab at it.

Storing a Sketch as XML

We have already defined a DTD in the previous chapter that is suitable for defining a sketch. We can see how we can put together the code to store a sketch as an XML document instead of as a serialized object. Obviously we'll use the DTD we already have, and we can create a Document object with a DocumentType node via a DOMImplementation object from a DocumentBuilder object. We can do this with two statements in a try block:

Document doc = null;
try {
  DOMImplementation domImpl = DocumentBuilderFactory.newInstance()
                                                    .newDocumentBuilder()
                                                    .getDOMImplementation();
  // The DOCTYPE name must match the root element name for a valid document
  doc = domImpl.createDocument(null, "sketch",
                    domImpl.createDocumentType("sketch", null, "sketcher.dtd"));

} catch(ParserConfigurationException e) {
  e.printStackTrace(System.err);
  // Display the error and terminate the current activity...

} catch(DOMException e) {
  e.printStackTrace(System.err);
  // Determine the kind of error from the error code, 
  // display the error, and terminate the current activity...
}

These are rather long statements since each accomplishes what we previously did in several steps. However, they are quite simple. The first statement creates a DocumentBuilderFactory object, uses it to create a DocumentBuilder object, and obtains from that a reference to a DOMImplementation object, which is stored in domImpl. This is used in the next statement to create the Document object for a sketch and its DocumentType object defining the DOCTYPE declaration for sketcher.dtd. Eventually we will add this code to the SketchModel class but let's leave that to one side for the moment while we look at how we can fill out the detail of the Document object from the objects representing elements in a sketch.

A sketch in XML is a simple two-level structure. The root node in an XML representation of a sketch will be a <sketch> element, so to define the structure we only need to add an Element node to the content for the root node for each element in the sketch. A good way to implement this would be to add a method to each of the sketch Element classes that adds its own org.w3c.dom.Element node to the Document object. This will make each object representing a sketch element able to create its own XML representation.

The Sketcher classes we have to modify are the inner classes to the Element class, plus the Element class itself. The inner classes are Element.Line, Element.Rectangle, Element.Circle, Element.Curve, and Element.Text. The nodes that have to be added for each kind of geometric element derive directly from the declaration in the DTD, so it will help if you have this to hand while we go through these classes. If you typed it in when we discussed it in the last chapter, maybe you can print a copy.

Adding Element Nodes

Polymorphism is going to be a big help in this so let's first define an abstract method in the Element base class to add an element node to a document. We can add the declaration immediately after the declaration for the abstract draw() method, like this:

  public abstract void draw(Graphics2D g2D);
  public abstract void addElementNode(Document document);

Each of the inner classes will need to implement this method since they are derived from the Element class.

We will need a couple of import statements at the beginning of our Element.java file in Sketcher:

import org.w3c.dom.Document;
import org.w3c.dom.Attr;

Note that we definitely don't want to use the * notation to import all of the names from this package. If we do, we will get our sketcher Element class confused with the Element interface in the org.w3c.dom package. We are going to have to use qualified names wherever there is a potential clash.

The XML elements that we will create for geometric elements in a sketch will all need <position> and <color> elements as children. If we define methods in the base class, Element, to create these, they will be inherited in each of the subclasses of Element. Here's how we can define a method in the Element class to create a <color> element:

protected org.w3c.dom.Element createColorElement(Document doc) {
    org.w3c.dom.Element colorElement = doc.createElement("color");
 
    Attr attr = doc.createAttribute("R");
    attr.setValue(String.valueOf(color.getRed()));
    colorElement.setAttributeNode(attr);  

    attr = doc.createAttribute("G");
    attr.setValue(String.valueOf(color.getGreen()));
    colorElement.setAttributeNode(attr);  

    attr = doc.createAttribute("B");
    attr.setValue(String.valueOf(color.getBlue()));
    colorElement.setAttributeNode(attr);
    return colorElement;      
}

The method for creating the node for a position element will use essentially the same process, but we have several nodes that represent points that are the same apart from their names. We can share the code by putting it into a method that we call with the appropriate type name:

protected org.w3c.dom.Element createPointTypeElement(Document doc,
                                                     String name,
                                                     String xValue,
                                                     String yValue) { 
    org.w3c.dom.Element element = doc.createElement(name);

    Attr attr = doc.createAttribute("x");         // Create attribute x
    attr.setValue(xValue);                        // and set its value
    element.setAttributeNode(attr);               // Insert the x attribute   

    attr = doc.createAttribute("y");              // Create attribute y
    attr.setValue(yValue);                        // and set its value
    element.setAttributeNode(attr);               // Insert the y attribute   
    return element;              
}

This will create an element with the name specified by the second argument so we can use this in a method in the Element class to create a node for a <position> element:

  protected org.w3c.dom.Element createPositionElement(Document doc) {
    return createPointTypeElement(doc, "position",
                                  String.valueOf(position.getX()),
                                  String.valueOf(position.getY()));
  }

We will be able to create <endpoint>, <bottomright>, or <point> nodes in the same way in methods in the subclasses of Element.

Adding a Line Node

The method to add a <line> node to the Document object will create a <line> element with an angle attribute, and then add three child elements: <color>, <position>, and <endpoint>. You can add the following implementation of the addElementNode() method to the Element.Line class:

public void addElementNode(Document doc) {
  org.w3c.dom.Element lineElement = doc.createElement("line");
 
  // Create the angle attribute and attach it to the <line> node
  Attr attr = doc.createAttribute("angle");
  attr.setValue(String.valueOf(angle));
  lineElement.setAttributeNode(attr);
  
  // Append the <color>, <position>, and <endpoint> nodes as children
  lineElement.appendChild(createColorElement(doc));
  lineElement.appendChild(createPositionElement(doc));
  lineElement.appendChild(createEndpointElement(doc));
  
  // Append the <line> node to the document root node
  doc.getDocumentElement().appendChild(lineElement); 
}

When we have an Element.Line object in a sketch, calling this method with a reference to a Document object as an argument will add a child node corresponding to the <line> element. To complete this we must add the createEndpointElement() method to the Element.Line class:

private org.w3c.dom.Element createEndpointElement(Document doc) {
  return createPointTypeElement(doc, "endpoint",
                                String.valueOf(line.x2+position.x),
                                String.valueOf(line.y2+position.y));
}

This calls the createPointTypeElement() method that is inherited from the base class. Since the position of a line is recorded in the base class and the end point of the line is relative to that point, we must add the coordinates of position in the base class to the coordinates of the end point of the line to get the original end point coordinates back.

Adding a Rectangle Node

The code to add a <rectangle> node to the Document object will be almost the same as adding a <line> node:

public void addElementNode(Document doc) {
  org.w3c.dom.Element rectElement = doc.createElement("rectangle");
 
  // Create the angle attribute and attach it to the <rectangle> node
  Attr attr = doc.createAttribute("angle");
  attr.setValue(String.valueOf(angle));
  rectElement.setAttributeNode(attr);
 
  // Append the <color>, <position>, and <bottomright> nodes as children
  rectElement.appendChild(createColorElement(doc));
  rectElement.appendChild(createPositionElement(doc));
  rectElement.appendChild(createBottomrightElement(doc));
 
  doc.getDocumentElement().appendChild(rectElement);
}

We also must define the createBottomrightElement() method in the Element.Rectangle class:

private org.w3c.dom.Element createBottomrightElement(Document doc) {
  return createPointTypeElement(doc, "bottomright",
                                String.valueOf(rectangle.width+position.x),
                                String.valueOf(rectangle.height+position.y));
}

A rectangle is defined relative to the origin so we have to adjust the coordinates of the bottom right corner by adding the corresponding position coordinates.

Adding a Circle Node

Creating the node for a <circle> element is not very different:

public void addElementNode(Document doc) {
  org.w3c.dom.Element circleElement = doc.createElement("circle");
 
  // Create the radius attribute and attach it to the <circle> node
  Attr attr = doc.createAttribute("radius");
  attr.setValue(String.valueOf(circle.width/2.0));
  circleElement.setAttributeNode(attr);
  
  // Create the angle attribute and attach it to the <circle> node
  attr = doc.createAttribute("angle");
  attr.setValue(String.valueOf(angle));
  circleElement.setAttributeNode(attr);

  // Append the <color> and <position> nodes as children
  circleElement.appendChild(createColorElement(doc));
  circleElement.appendChild(createPositionElement(doc));
 
  doc.getDocumentElement().appendChild(circleElement);
}

There's nothing new here. We can use either the width or the height member of the Ellipse2D.Double class object to get the diameter of the circle. We divide the width field for the circle object by 2.0 to get the radius.

Adding a Curve Node

Creating a <curve> node is a bit more long-winded as a GeneralPath object represents a curve, and we have to extract the arbitrary number of defining points from it. The code that does this is more or less what we used in the writeObject() method for a curve so it is nothing new:

public void addElementNode(Document doc) {
  org.w3c.dom.Element curveElement = doc.createElement("curve");
  
  // Create the angle attribute and attach it to the <curve> node
  Attr attr = doc.createAttribute("angle");
  attr.setValue(String.valueOf(angle));
  curveElement.setAttributeNode(attr);
   
  // Append the <color> and <position> nodes as children
  curveElement.appendChild(createColorElement(doc));
  curveElement.appendChild(createPositionElement(doc));
  
  // Get the defining points via a path iterator
  PathIterator iterator = curve.getPathIterator(new AffineTransform());
  int maxCoordCount = 6;                  // Maximum coordinates for a segment
  float[] temp = new float[maxCoordCount];           // Stores segment data
  
  int result = iterator.currentSegment(temp);        // Get first segment
  assert result == PathIterator.SEG_MOVETO;          // ... should be move to
  
  iterator.next();                                   // Next segment
  while(!iterator.isDone())   {                      // While we have segments
    result = iterator.currentSegment(temp);          // Get the segment data
    assert result == PathIterator.SEG_LINETO;        // Should all be lines
   
    // Create a <point> node and add it to the list of children
    curveElement.appendChild(createPointTypeElement(doc, "point",
                                  String.valueOf(temp[0]+position.x),
                                  String.valueOf(temp[1]+position.y)));
    iterator.next();                                   // Go to next segment
  }
  
  doc.getDocumentElement().appendChild(curveElement);
}

We add one <point> node as a child of the Element node for a curve for each defining point after the first. Since the defining points for the GeneralPath object were created relative to the origin, we have to add the corresponding coordinates of position to the coordinates of each defining point.
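If you want to experiment with the path iteration on its own, outside Sketcher, the following minimal sketch extracts the defining points from a GeneralPath object and adds the position offset to each of them. The class name CurvePointsDemo, the method absolutePoints(), and the sample coordinates are hypothetical. Each point is returned as a simple "x,y" string rather than as a <point> node.

```java
import java.awt.Point;
import java.awt.geom.AffineTransform;
import java.awt.geom.GeneralPath;
import java.awt.geom.PathIterator;
import java.util.ArrayList;
import java.util.List;

class CurvePointsDemo {
  // Returns an "x,y" string for each line-to point in the path,
  // with the coordinates of position added
  static List<String> absolutePoints(GeneralPath curve, Point position) {
    List<String> points = new ArrayList<String>();
    PathIterator iterator = curve.getPathIterator(new AffineTransform());
    float[] temp = new float[6];               // Room for any segment type

    if(iterator.isDone())                      // Nothing in the path
      return points;
    iterator.next();                           // Skip the initial move-to
    while(!iterator.isDone()) {
      if(iterator.currentSegment(temp) == PathIterator.SEG_LINETO)
        points.add(String.valueOf(temp[0]+position.x) + ","
                 + String.valueOf(temp[1]+position.y));
      iterator.next();                         // Go to next segment
    }
    return points;
  }

  public static void main(String[] args) {
    GeneralPath curve = new GeneralPath();     // Defined relative to the origin
    curve.moveTo(0f, 0f);
    curve.lineTo(10f, 5f);
    curve.lineTo(20f, 15f);
    System.out.println(absolutePoints(curve, new Point(100, 200)));
  }
}
```

Note that the addition is done before the conversion to a string. Applying the + operator after calling String.valueOf() would concatenate the two values as text instead of adding them.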

Adding a Text Node

A text node is a little different and involves quite a lot of code. As well as the usual <color> and <position> child nodes, we also have to append a <font> node to define the font and a <string> node. The <font> node has three attributes that define the font name, the font style, and the point size. The <string> node has the text as well as a <bounds> element that has two attributes defining the width and height of the text. Here's the code:

public void addElementNode(Document doc) {
  org.w3c.dom.Element textElement = doc.createElement("text");
 
  // Create the angle attribute and attach it to the <text> node
  Attr attr = doc.createAttribute("angle");
  attr.setValue(String.valueOf(angle));
  textElement.setAttributeNode(attr);
     
  // Append the <color> and <position> nodes as children
  textElement.appendChild(createColorElement(doc));
  textElement.appendChild(createPositionElement(doc));
   
  // Create and append the <font> node
  org.w3c.dom.Element fontElement = doc.createElement("font");
  attr = doc.createAttribute("fontname");
  attr.setValue(font.getName());
  fontElement.setAttributeNode(attr);
  
  attr = doc.createAttribute("fontstyle");
  String style = null;
  int styleCode = font.getStyle();
  if(styleCode == Font.PLAIN)
    style = "plain";
  else if(styleCode == Font.BOLD)
    style = "bold";
  else if(styleCode == Font.ITALIC)
    style = "italic";
  else if(styleCode == Font.ITALIC+Font.BOLD)
    style = "bold-italic";
  assert style != null;
  attr.setValue(style);
  fontElement.setAttributeNode(attr);
 
  attr = doc.createAttribute("pointsize");
  attr.setValue(String.valueOf(font.getSize()));
  fontElement.setAttributeNode(attr);
  textElement.appendChild(fontElement);
 
  // Create the <string> node
  org.w3c.dom.Element string = doc.createElement("string");
 
  // Create the <bounds> node and its attributes
  org.w3c.dom.Element bounds = doc.createElement("bounds");
  attr = doc.createAttribute("width");
  attr.setValue(String.valueOf(this.bounds.width));
  bounds.setAttributeNode(attr);
  attr = doc.createAttribute("height");
  attr.setValue(String.valueOf(this.bounds.height));
  bounds.setAttributeNode(attr);
  string.appendChild(bounds);      // Set <bounds> element as <string> content

  string.appendChild(doc.createTextNode(text));   // Set the text as <string> content
  textElement.appendChild(string);                // Add <string> as a child of <text>
  doc.getDocumentElement().appendChild(textElement);
}

Since the font style can be "plain", "bold", "bold-italic", or just "italic", we have a series of if statements to determine the value for the attribute. The style is stored in a Font object as an integer with different values for plain, bold, and italic. The values corresponding to bold and italic can be combined, in which case the style is "bold-italic".
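Because Font.BOLD and Font.ITALIC are distinct bit values, the same mapping can also be written by testing each bit. The following is a minimal sketch of that alternative; the class FontStyleDemo and its styleName() method are hypothetical names, not part of Sketcher.

```java
import java.awt.Font;

// Hypothetical demo class, not part of Sketcher
class FontStyleDemo {
  // Maps a Font style code to the attribute value used in the XML
  static String styleName(int styleCode) {
    boolean bold = (styleCode & Font.BOLD) != 0;       // Is the bold bit set?
    boolean italic = (styleCode & Font.ITALIC) != 0;   // Is the italic bit set?
    if(bold && italic)
      return "bold-italic";
    if(bold)
      return "bold";
    if(italic)
      return "italic";
    return "plain";                                    // Neither bit is set
  }

  public static void main(String[] args) {
    System.out.println(styleName(Font.PLAIN));
    System.out.println(styleName(Font.BOLD));
    System.out.println(styleName(Font.ITALIC));
    System.out.println(styleName(Font.BOLD+Font.ITALIC));
  }
}
```

Written this way the method always returns one of the four strings, so no assertion is needed to guard against a null style.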

All the element objects in a sketch can now add their own node to a Document object. We should now be able to make a SketchModel object use this capability to create a document that encapsulates the entire sketch.

Creating a Document Object for a Complete Sketch

We can add a createDocument() method to the SketchModel class to create a Document object and populate it with the nodes for the elements in the current sketch model. Creating the Document object will use the code fragment we saw earlier. You need to add some import statements at the beginning of the SketchModel.java source file for the new interfaces and classes we will be using:

import javax.swing.JOptionPane;
import javax.xml.parsers.*;
import org.w3c.dom.Document;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.DOMException;

Here's the method definition you can add to the class:

  // Creates a DOM Document object encapsulating the current sketch  
  public Document createDocument() {
    Document doc = null;
    try {
      DOMImplementation domImpl = DocumentBuilderFactory.newInstance()
                                                        .newDocumentBuilder()
                                                        .getDOMImplementation();
      doc = domImpl.createDocument(null, "sketch",
                   domImpl.createDocumentType("sketcher", null, "sketcher.dtd"));

    } catch(ParserConfigurationException e) {
      JOptionPane.showMessageDialog(null,
                  "Parser configuration error while creating document",
                  "DOM Parser Error",
                  JOptionPane.ERROR_MESSAGE); 
      System.err.println(e.getMessage());
      e.printStackTrace(System.err);
      return null;

    } catch(DOMException e) {
      JOptionPane.showMessageDialog(null,
                  "DOM exception thrown while creating document",
                  "DOM Error",
                  JOptionPane.ERROR_MESSAGE); 
      System.err.println(e.getMessage());
      e.printStackTrace(System.err);
      return null;
    }

    // Each element in the sketch can create its own node in the document
    Iterator iter = getIterator();                 // Iterator for sketch elements
    while(iter.hasNext())                          // For each element...
      ((Element)iter.next()).addElementNode(doc);  // ...add its node.
    return doc;
  }

Notice that the document type declaration refers to the DTD by the relative system ID sketcher.dtd, so the DTD file for Sketcher must be in the same folder as a saved sketch. In our case, we've made the default folder c:\sketches, so a copy will need to be present there.

We now pop up a dialog and return null if something goes wrong when we are creating the Document object. In case of an exception of type DOMException being thrown, you could add a switch statement to analyze the value in the code member of the exception and provide a more specific message in the dialog.
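As a rough sketch of what such a switch might look like, the following helper maps the codes that the DOM specification lists for createDocument() to messages. The class name DomErrorMessages, the describe() method, and the message wording are all our own invention; only the code field and the DOMException constants come from the org.w3c.dom package.

```java
import org.w3c.dom.DOMException;

// Hypothetical helper, not part of Sketcher
class DomErrorMessages {
  // Returns a more specific message for the code in a DOMException
  static String describe(DOMException e) {
    switch(e.code) {
      case DOMException.INVALID_CHARACTER_ERR:
        return "Invalid character in the root element name";
      case DOMException.NAMESPACE_ERR:
        return "Namespace error in the root element name";
      case DOMException.WRONG_DOCUMENT_ERR:
        return "Document type already belongs to another document";
      case DOMException.NOT_SUPPORTED_ERR:
        return "Operation not supported by this DOM implementation";
      default:
        return "DOM error code " + e.code + ": " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    // Create an exception directly just to exercise the switch
    DOMException e = new DOMException(DOMException.NAMESPACE_ERR, "demo");
    System.out.println(describe(e));
  }
}
```

The string returned by describe() could then be passed as the message argument in the dialog call in the catch block.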

The SketchModel object can now create a DOM Document object encapsulating the entire sketch. All we now need is some code to use this to write an XML file.

Saving a Sketch as XML

Of course, we could modify Sketcher so that it could save sketches either as objects or as XML, but to keep things simple we will add menu items to the File menu to export or import a sketch as XML. In broad terms, here's what we have to do to the SketchFrame class to save a sketch as an XML file:

Add Import XML and Export XML menu items.
Add XMLImportAction and XMLExportAction inner classes defining the Action types for the new menu items, either to save the current sketch as an XML file or to replace the current sketch by a new sketch created from an XML file.
Implement the process of creating an XML document as text from the Document object created by the createDocument() method that we added to the SketchModel class.
Implement writing the text for the XML document to a file.

By adding new Action classes for our two new menu items, we avoid cluttering up the existing FileAction class any further. Clearly, a lot of the work will be in the implementation of the new Action classes, so let's start with the easy bit – adding the new menu items to the File menu. First, we can add two new fields for the menu items by changing the existing definition in the SketchFrame class:

private FileAction newAction,  openAction,   closeAction,
                   saveAction, saveAsAction, printAction;
private XMLExportAction exportAction;    // Stores action for XML export menu item
private XMLImportAction importAction;    // Stores action for XML import menu item

These store the references to the Action objects for the new menu items.

We can add the menu items in the SketchFrame constructor, immediately following the menu separator definition that comes after the saveAsAction menu item:

addMenuItem(fileMenu, saveAction);
addMenuItem(fileMenu, saveAsAction);
fileMenu.addSeparator();                                      // Add separator
addMenuItem(fileMenu, 
            exportAction = new XMLExportAction("Export XML",
                                               "Export sketch as an XML file"));
addMenuItem(fileMenu, 
            importAction = new XMLImportAction("Import XML",
                                               "Import sketch from an XML file"));
fileMenu.addSeparator();                                    // Add separator

Now we can add code in the SketchFrame class for the two inner classes. We can define the XMLExportAction class within the SketchFrame class like this:

  class XMLExportAction extends AbstractAction {
    public XMLExportAction(String name, String tooltip) {
      super(name);
      if(tooltip != null)                             // If there is tooltip text
        putValue(SHORT_DESCRIPTION, tooltip);         // ...squirrel it away
    }

    public void actionPerformed(ActionEvent e) {
        JFileChooser chooser = new JFileChooser(DEFAULT_DIRECTORY);
        chooser.setDialogTitle("Export Sketch as XML");
        chooser.setApproveButtonText("Export");
        ExtensionFilter xmlFiles = new ExtensionFilter(".xml",
                                                      "XML Sketch files (*.xml)");
        chooser.addChoosableFileFilter(xmlFiles);             // Add the filter
        chooser.setFileFilter(xmlFiles);                      // and select it
        int result = chooser.showDialog(SketchFrame.this, null); // Show dialog
        File file = null;
        if(result == JFileChooser.APPROVE_OPTION) {        // Show the dialog once only
          file = chooser.getSelectedFile();
          if(file.exists()) {                           // Check file exists
            if(JOptionPane.NO_OPTION ==                 // Overwrite warning
               JOptionPane.showConfirmDialog(SketchFrame.this,
                                  file.getName()+" exists. Overwrite?",
                                  "Confirm Save As",
                                  JOptionPane.YES_NO_OPTION,
                                  JOptionPane.WARNING_MESSAGE))
               return;                                   // No overwrite
          }
          saveXMLSketch(file);
      }
    }
  }

This is very similar to code that appears in the FileAction class. The constructor only provides for what we use: a menu item name plus a tooltip. If you want to have the option for an icon for use on a toolbar button, you can add that in the same way as for the FileAction constructors. The actionPerformed() method pops up a JFileChooser dialog to enable the destination file for the XML to be selected. The chosen file is passed to a new method that we will put together, saveXMLSketch(), which will handle writing the XML document to the file.

We can define the XMLImportAction inner class like this:

  class XMLImportAction extends AbstractAction {
    public XMLImportAction(String name, String tooltip) {
      super(name);
      if(tooltip != null)                             // If there is tooltip text
        putValue(SHORT_DESCRIPTION, tooltip);         // ...squirrel it away
    }

    public void actionPerformed(ActionEvent e) {
      JFileChooser chooser = new JFileChooser(DEFAULT_DIRECTORY);
      chooser.setDialogTitle("Import Sketch from XML");
      chooser.setApproveButtonText("Import");
      ExtensionFilter xmlFiles = new ExtensionFilter(".xml",
                                                    "XML Sketch files (*.xml)");
      chooser.addChoosableFileFilter(xmlFiles);             // Add the filter
      chooser.setFileFilter(xmlFiles);                      // and select it
      int result = chooser.showDialog(SketchFrame.this, null);  // Show dialog
      if(result == JFileChooser.APPROVE_OPTION)             // Show the dialog once only
        openXMLSketch(chooser.getSelectedFile());     
    }
  }

This is more of the same but in the opposite direction as Stanley might have said. Once the name of the file to be imported has been identified in the JFileChooser dialog, we call openXMLSketch() to read the XML from the file and create the corresponding sketch.

Now we can go on to the slightly more difficult bits. We will start by looking at how we can write an XML document to a file, since we can't test the process for reading a sketch as XML until we have written some.

Writing the XML File

Before we start, let's add a few constants to our Constants interface in the Sketcher code:

  String QUOTE_ENTITY = "&quot;";
  char QUOTE = '\"';
  char NEWLINE = '\n';
  char TAG_START = '<';
  char TAG_END = '>';
  String EMPTY_TAG_END = "/>";
  String END_TAG_START = "</";

We will standardize on using a double quote as a string delimiter in the XML that we will generate. We will therefore substitute the QUOTE_ENTITY constant for any double quotes that appear in the text for a Sketcher Text element. The other constants will come in useful when we are assembling XML markup.

We will make the saveXMLSketch() method a member of the SketchFrame class. This method will obtain a FileChannel object for the File object that is passed as an argument. The FileChannel object can then be used to write the XML to the file. Here's how we can define this method:

  private void saveXMLSketch(File outFile) {
    FileOutputStream outputFile = null;      // Stores an output stream reference 
    try {
      outputFile = new FileOutputStream(outFile);    // Output stream for the file 
      FileChannel outChannel = outputFile.getChannel(); // Channel for file stream 
      writeXMLFile(theApp.getModel().createDocument(), outChannel);

    } catch(FileNotFoundException e) {
      e.printStackTrace(System.err);
      JOptionPane.showMessageDialog(SketchFrame.this,
                       "Sketch file " + outFile.getAbsolutePath() + " not found.",
                       "File Output Error",
                       JOptionPane.ERROR_MESSAGE);
      return;                           // Serious error – return
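    } finally {
      if(outputFile != null) {
        try {
          outputFile.close();           // Closes the stream and its channel
        } catch(IOException e) {
          e.printStackTrace(System.err);
        }
      }
    }
  }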

This calls another method that we have yet to write. The writeXMLFile() method will assemble the XML from the Document object passed as the first argument, and write that to the FileChannel referenced by the second argument.

We don't really expect to end up in the catch block. If we do, something is seriously wrong somewhere. Don't forget to import the FileChannel class name. The import statement you must add to SketchFrame is:

import java.nio.channels.FileChannel;

The DOM Document object provides no convenient way to get a complete XML document as a string or series of strings. The toString() method for a Node object looked hopeful in this respect, at least for the Crimson parser, but we saw that what the toString() method produces depends on the parser, so we can't rely on that. We will therefore slog it out for ourselves. Our writeXMLFile() method will have to navigate the Document object and its nodes in order to create all the well-formed and valid XML that has to be written to the file to form a complete XML document.
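As an aside, the limitation applies to the Document interface itself. The JAXP transformation API in the javax.xml.transform package can serialize a DOM tree, and the following minimal sketch shows the idea. It is not the approach we follow in this chapter, since writing the text ourselves shows how the tree is navigated; the class TransformerDemo and its methods are hypothetical names, and the exact layout of the output depends on the transformer implementation.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of Sketcher
class TransformerDemo {
  // Serializes a DOM Document to a string using an identity transformation
  static String toXML(Document doc) {
    try {
      Transformer transformer = TransformerFactory.newInstance().newTransformer();
      StringWriter writer = new StringWriter();
      transformer.transform(new DOMSource(doc), new StreamResult(writer));
      return writer.toString();
    } catch(TransformerException e) {
      throw new IllegalStateException(e);
    }
  }

  // Builds a tiny document: <sketch> containing one <line> with an angle attribute
  static Document smallSketch() {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
                                           .newDocumentBuilder()
                                           .newDocument();
      Element root = doc.createElement("sketch");
      doc.appendChild(root);
      Element line = doc.createElement("line");
      line.setAttribute("angle", "0.0");
      root.appendChild(line);
      return doc;
    } catch(ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toXML(smallSketch()));
  }
}
```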

Creating an XML document won't be difficult. We already know how to navigate a Document object and write the nodes to the command line. We did that in an example a few pages back. We will need to make sure the code we use here writes everything we need to produce well-formed XML but it will be essentially the same as what we have seen. The only difference here is that we are writing to a file channel rather than the command line but that should not be any trouble since we know how to do that, too. If we take a little care in the appearance of the XML, we should be able to end up with an XML file defining a sketch that is reasonably readable.

Since we want to be able to look at the XML file for a sketch in an editor, we will write it in UTF-8, an 8-bit encoding of Unicode. With all that knowledge and experience, we can implement writeXMLFile() in the SketchFrame class like this:

  private void writeXMLFile(org.w3c.dom.Document doc, FileChannel channel) {
    StringBuffer xmlDoc = new StringBuffer(
                                 "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"); 
    xmlDoc.append(NEWLINE).append(getDoctypeString(doc.getDoctype()));
    xmlDoc.append(getDocumentNode(doc.getDocumentElement(), ""));
 
    try {
      channel.write(ByteBuffer.wrap(xmlDoc.toString().getBytes("UTF-8")));

    } catch(UnsupportedEncodingException e) {
      System.out.println(e.getMessage());

    } catch(IOException e) {
      JOptionPane.showMessageDialog(SketchFrame.this,
                                          "Error writing XML to channel.",
                                              "File Output Error",
                                              JOptionPane.ERROR_MESSAGE);
      e.printStackTrace(System.err);
      return;
    }
  }

Initially we create a StringBuffer object that will eventually contain the entire XML document. It starts out initialized with the XML declaration and we append the text corresponding to the DOCTYPE declaration. We use the getDoctypeString() method to generate this and this method will be virtually identical to the method of the same name from the example earlier in this chapter, as we shall see in a moment. This method accepts an argument of type DocumentType, assembles a complete DOCTYPE declaration from that, and returns it as type String. This is appended to xmlDoc following a newline character that will start the declaration on a new line.

We introduce another new method in the code for the writeXMLFile() method in the statement:

xmlDoc.append(getDocumentNode(doc.getDocumentElement(), ""));

This is for good reason. We will need a recursive method to navigate the nodes in the Document object that represent the body of the document and create the XML for that. The string that is returned will be the entire document body, so once we have appended this to xmlDoc we have the complete document. We will implement the getDocumentNode() method shortly.

The other statement deserving some explanation is the one in the try block that writes the complete document to the file:

channel.write(ByteBuffer.wrap(xmlDoc.toString().getBytes("UTF-8")));

Starting from the inside, and working outwards: calling the toString() method for xmlDoc returns the contents as type String. We then call the getBytes() method for the String object to obtain an array of type byte[] containing the contents of the String object encoded as UTF-8. We then call the static wrap() method in the ByteBuffer class (that will need importing) to create a ByteBuffer object that wraps the array. The buffer that is returned has its limit and position set ready for the buffer contents to be written to a file. We can therefore pass this ByteBuffer object directly to the write() method for the FileChannel object to write the contents of the buffer, which will be the entire XML document, to the file. How's that for a powerful statement?
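The same steps can be tried on their own with the following minimal sketch. The class WrapAndWriteDemo and its writeUTF8() method are hypothetical names for illustration. The sketch also loops until the buffer is empty, since the general contract of write() for a channel allows fewer bytes to be written than remain in the buffer.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class, not part of Sketcher
class WrapAndWriteDemo {
  // Writes text to the file as UTF-8 and returns the number of bytes in the buffer
  static int writeUTF8(File file, String text) throws IOException {
    ByteBuffer buf = ByteBuffer.wrap(text.getBytes("UTF-8"));
    int total = buf.remaining();       // After wrap(): position is 0, limit is array length
    FileOutputStream out = new FileOutputStream(file);
    try {
      FileChannel channel = out.getChannel();
      while(buf.hasRemaining())        // Repeat in case write() transfers only part
        channel.write(buf);
    } finally {
      out.close();                     // Closing the stream closes the channel too
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    File file = File.createTempFile("sketch", ".xml");
    int count = writeUTF8(file, "<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
    System.out.println(count + " bytes written to " + file.getPath());
    file.delete();
  }
}
```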

The code for the getDoctypeString() method will be:

   private String getDoctypeString(org.w3c.dom.DocumentType doctype) {
      // Create the opening string for the DOCTYPE declaration with its name
      String str = doctype.getName();
      StringBuffer doctypeStr = new StringBuffer("<!DOCTYPE ").append(str);        

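      String publicId = doctype.getPublicId();
      String systemId = doctype.getSystemId();

      if(publicId != null) {
        // A public ID comes first and is followed by the system ID
        doctypeStr.append(" PUBLIC ").append(QUOTE).append(publicId).append(QUOTE);
        if(systemId != null)
          doctypeStr.append(' ').append(QUOTE).append(systemId).append(QUOTE);

      } else if(systemId != null) {
        // A system ID on its own
        doctypeStr.append(" SYSTEM ").append(QUOTE).append(systemId).append(QUOTE);
      }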

      // Check for an internal subset
      if((str = doctype.getInternalSubset()) != null)
        doctypeStr.append('[').append(str).append(']');           

      return doctypeStr.append(TAG_END).toString();  // Append '>' & return string
  }

This is almost identical to the method as implemented in our previous example.

Creating XML for the Document Body

The recursive getDocumentNode() method to assemble the XML for the document body is a little more work than the others but it will work much like the method we wrote earlier to list nodes in a document. The method will find out the specific type of the current node then append the appropriate XML string to a StringBuffer object. If the current node has child nodes, the method will call itself to deal with each of these nodes. We can implement the getDocumentNode() method like this:

  private String getDocumentNode(Node node, String indent) {
    StringBuffer nodeStr = new StringBuffer().append(NEWLINE).append(indent); 
    String nodeName = node.getNodeName();       // Get name of this node

    switch(node.getNodeType()) {
      case Node.ELEMENT_NODE:
      nodeStr.append(TAG_START);
      nodeStr.append(nodeName);
      if(node.hasAttributes()) {             // If the element has attributes...
        org.w3c.dom.NamedNodeMap attrs = node.getAttributes();   // ...get them 
        for(int i = 0 ; i<attrs.getLength() ; i++) {
          org.w3c.dom.Attr attribute = (org.w3c.dom.Attr)attrs.item(i);      
          // Append " name="value" to the element string
          nodeStr.append(' ').append(attribute.getName()).append('=')
                        .append(QUOTE).append(attribute.getValue()).append(QUOTE);
        }
      }
      if(!node.hasChildNodes()) {        // Check for no children for this element
        nodeStr.append(EMPTY_TAG_END);   // There are none-close as empty element
        return nodeStr.toString();       // and return the completed element

      } else {                             // It has children
        nodeStr.append(TAG_END);         // so close start-tag
        NodeList list = node.getChildNodes();       // Get the list of child nodes
        assert list.getLength()>0;                  // There must be at least one
 
       // Append child nodes and their children...
        for(int i = 0 ; i<list.getLength() ; i++)     
          nodeStr.append(getDocumentNode(list.item(i), indent+" "));     
      }
      nodeStr.append(NEWLINE).append(indent).append(END_TAG_START)
                                               .append(nodeName).append(TAG_END);
      break;    

      case Node.TEXT_NODE:
      nodeStr.append(replaceQuotes(((org.w3c.dom.Text)node).getData()));
      break;

      default:
      assert false;
    }
    return nodeStr.toString();
  }

We start out by creating the StringBuffer object and appending a newline and the indent to it to make each element start on a new line. This should result in a document we can read comfortably in a text editor.

After saving the name of the current node in nodeName, we determine what kind of node we are dealing with in the switch statement. We could have used the instanceof operator and if statements to do this but here there's a chance to try out the alternative approach that we discussed earlier. We only identify two cases in the switch, corresponding to the constants Node.ELEMENT_NODE and Node.TEXT_NODE. This is because our DTD for Sketcher doesn't provide for any others so we don't expect to find them.

For a node that is an element we begin appending the start tag for the element, including the element name. We then check for the presence of attributes for this element. If there are some, we get them as a NamedNodeMap object in the same manner as our earlier example. We then just iterate through the collection of attributes and build the text that corresponds to each, appending the text to the StringBuffer object nodeStr.

Once we have finished with the attributes for the current node, we determine whether it has child nodes. If it has no child nodes, it has no content, so we can complete the tag for the current node making it an empty element. Since the element is now complete we can return it as a String. If the current element has child nodes we obtain those in a NodeList object. We then iterate through the nodes in the NodeList and call getDocumentNode() for each with an extra space appended to indent. The String that is returned for each call is appended to nodeStr. When all the child nodes have been processed, we append the end tag for the element on a new line at the current indent. We are then done, so we can exit the switch and return the contents of nodeStr as a String.

The other possibility is that the current node is text. This will arise from an Element.Text object in the sketch. It is also possible that this text may contain double quotes, the delimiter that we are using for strings in our XML. We therefore call replaceQuotes() to replace all occurrences of QUOTE in the text with the QUOTE_ENTITY constant that we defined in our Constants interface, before appending the string to nodeStr.

We can implement the replaceQuotes() method in SketchFrame as:

  public String replaceQuotes(String str) {
    StringBuffer buf = new StringBuffer();
    for(int i = 0 ; i<str.length() ; i++)
      if(str.charAt(i)==QUOTE)
        buf.append(QUOTE_ENTITY);
      else
        buf.append(str.charAt(i));

    return buf.toString(); 
  }

This just tests each character in the original string. If it's a double quote, the delimiter we use for attribute values, it's replaced by the entity reference &quot; in the output string buf.
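Double quotes are not the only characters that can cause trouble. An ampersand or a left angle bracket in the text of a sketch would also produce XML that is not well-formed. The following is a minimal sketch of a more general method along the same lines; the class XmlEscapeDemo and the escape() method are hypothetical names and are not used elsewhere in Sketcher.

```java
// Hypothetical demo class, not part of Sketcher
class XmlEscapeDemo {
  // Replaces the characters that have special meaning in XML by entity references
  static String escape(String str) {
    StringBuffer buf = new StringBuffer();
    for(int i = 0 ; i<str.length() ; i++) {
      char ch = str.charAt(i);
      switch(ch) {
        case '&':
          buf.append("&amp;");
          break;
        case '<':
          buf.append("&lt;");
          break;
        case '>':
          buf.append("&gt;");
          break;
        case '"':
          buf.append("&quot;");
          break;
        default:
          buf.append(ch);            // Any other character is copied unchanged
      }
    }
    return buf.toString();
  }

  public static void main(String[] args) {
    System.out.println(escape("a<b & \"c\""));
  }
}
```

A method like this could be applied to attribute values as well as to text, since attribute values are subject to the same restrictions.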

Well, we are done – almost. We must not forget the extra import statements we need in the SketchFrame.java file:

import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

These cover the new classes that we used in our new code for I/O and for DOM without using fully qualified names.

Try It Out – Writing a Sketch as XML

Recompile Sketcher with the new code. Once you have fixed any errors you can run Sketcher and export your sketch as XML. You can then inspect the file in your editor. You can also process the file with our TryDOM and TrySAX programs to check that the XML is valid.

How It Works

We have been through the detailed mechanics of this. The SketchModel object creates a Document object that is populated by nodes encapsulating the sketch elements. Each node is created by the corresponding sketch element. We then navigate through the nodes in the document to create the XML for each node.

We can now have a go at importing an XML sketch.

Reading an XML Representation of a Sketch

The Import XML operation will also be implemented in the SketchFrame class. We have already added the menu item and the XMLImportAction class that is used to create it. We just need to implement the openXMLSketch() method that is called by the actionPerformed() method in the XMLImportAction class.

Assuming our XML representation of a sketch is well-formed and valid, creating a Document object encapsulating a sketch will be a piece of cake. We will just get a DOM parser to do it – and it will verify that the document is well-formed and valid along the way. We will need an ErrorHandler object to deal with parsing errors, so let's add an inner class to our SketchFrame class for that:

  class DOMErrorHandler implements org.xml.sax.ErrorHandler {
    public void fatalError(org.xml.sax.SAXParseException spe) 
       throws org.xml.sax.SAXException {
      JOptionPane.showMessageDialog(SketchFrame.this,
                                    "Fatal error at line "+spe.getLineNumber()
                                  + "\n"+spe.getMessage(),
                                    "DOM Parser Error",
                                    JOptionPane.ERROR_MESSAGE);
      throw spe;
    }

    public void warning(org.xml.sax.SAXParseException spe) {
      JOptionPane.showMessageDialog(SketchFrame.this,
                                    "Warning at line "+spe.getLineNumber()
                                  + "\n"+spe.getMessage(),
                                    "DOM Parser Warning",
                                    JOptionPane.WARNING_MESSAGE);
    }

    public void error(org.xml.sax.SAXParseException spe) {
      JOptionPane.showMessageDialog(SketchFrame.this,
                                    "Error at line "+spe.getLineNumber()
                                  + "\n"+spe.getMessage(),
                                    "DOM Parser Error",
                                    JOptionPane.ERROR_MESSAGE);
    }
  }

This implements the three methods declared in the ErrorHandler interface. In contrast to our previous example using a DOM error handler, rather than writing error information to the command line, here we display it in a suitable dialog.

Here's how we can implement the openXMLSketch() method in the SketchFrame class:

private void openXMLSketch(File xmlFile) {
  DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
  builderFactory.setValidating(true);           // Add validating parser feature
  builderFactory.setIgnoringElementContentWhitespace(true); 
     
  try {
    DocumentBuilder builder = builderFactory.newDocumentBuilder();
    builder.setErrorHandler(new DOMErrorHandler());
    checkForSave();
    theApp.insertModel(createSketchModel(builder.parse(xmlFile)));
    filename = xmlFile.getName();            // Update the file name
    setTitle(frameTitle+xmlFile.getPath());  // Change the window title
    sketchChanged = false;                     // Status is unchanged

  } catch(ParserConfigurationException e) {
    JOptionPane.showMessageDialog(SketchFrame.this,
                                  e.getMessage(),
                                  "DOM Parser Factory Error",
                                  JOptionPane.ERROR_MESSAGE);
    e.printStackTrace(System.err);

  } catch(org.xml.sax.SAXException e) {
     JOptionPane.showMessageDialog(SketchFrame.this,
                                   e.getMessage(),
                                   "DOM Parser Error",
                                   JOptionPane.ERROR_MESSAGE);
    e.printStackTrace(System.err);

  } catch(IOException e) {
    JOptionPane.showMessageDialog(SketchFrame.this,
                                  e.getMessage(),
                                  "I/O Error",
                                  JOptionPane.ERROR_MESSAGE);
    e.printStackTrace();
  }    
}

Most of the code here is devoted to catching exceptions that we hope will not get thrown. We set up the parser factory object to produce a validating parser that will ignore surplus whitespace. The latter feature will avoid extraneous nodes in the Document object that will be created by the parser from the XML file.
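The effect of ignoring element content whitespace can be seen with a small experiment. The sketch below parses a tiny document with an internal DTD subset and counts the children of the root element with the feature on and off. The class WhitespaceDemo, its method, and the cut-down DTD are made up for illustration and are not the Sketcher DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Sketcher
class WhitespaceDemo {
  // A document whose DTD says <sketch> contains only <line> elements
  static final String XML =
      "<?xml version=\"1.0\"?>\n" +
      "<!DOCTYPE sketch [\n" +
      "  <!ELEMENT sketch (line*)>\n" +
      "  <!ELEMENT line EMPTY>\n" +
      "]>\n" +
      "<sketch>\n" +
      "  <line/>\n" +
      "  <line/>\n" +
      "</sketch>";

  // Parses XML with validation and returns the number of child nodes of the root
  static int countRootChildren(boolean ignoreWhitespace)
      throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(true);             // The DTD is needed to identify the whitespace
    factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
    DocumentBuilder builder = factory.newDocumentBuilder();
    builder.setErrorHandler(new DefaultHandler());   // Minimal error handler
    Document doc = builder.parse(new InputSource(new StringReader(XML)));
    return doc.getDocumentElement().getChildNodes().getLength();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("Ignoring whitespace: " + countRootChildren(true));
    System.out.println("Keeping whitespace:  " + countRootChildren(false));
  }
}
```

With the feature switched on, only the two <line> elements should be counted. With it switched off, the text nodes for the newlines and indentation between the elements are counted as well.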

After storing a reference to the DOM parser that is created in builder, we create a DOMErrorHandler object and set that as the handler for any parsing errors that arise. If the parser finds any errors, we will see a dialog displayed indicating what the error is. We use the builder object to parse the XML file that is identified by the File object, xmlFile and pass the Document object that is returned by the parse() method to the createSketchModel() method that we will be adding to the SketchFrame class next. This method has the job of creating a new SketchModel object from the Document object.

Let's see how we can create a new SketchModel object encapsulating a new sketch by analyzing the Document object.

Creating the Model

We know that a sketch in XML is a two-level structure. There is a root element, <sketch>, that contains one XML element for each of the elements in the original sketch. Therefore to recreate the sketch, we just need to extract the children of the root node in the Document object and then figure out what kind of sketch element each child represents. Whatever it is, we want to create a sketch element object of that type and add it to a model. The simplest way to create sketch element objects from a given document node is to add a constructor to each of the classes that define sketch elements. We will add these constructors after we have defined the createSketchModel() method in the SketchFrame class. Here's the code for that:

private SketchModel createSketchModel(org.w3c.dom.Document doc) {
   SketchModel model = new SketchModel();                 // The new model object

   // Get the first child of the root node
   org.w3c.dom.Node node = doc.getDocumentElement().getFirstChild();

   // Starting with the first child, check out each child in turn
   while (node != null) {
    assert node instanceof org.w3c.dom.Element;          // Should all be Elements
 
    String name = ((org.w3c.dom.Element)node).getTagName();  // Get the name

    if(name.equals("line"))                               // Check for a line
      model.add(new Element.Line((org.w3c.dom.Element)node)); 

    else if(name.equals("rectangle"))                     // Check for a rectangle
      model.add(new Element.Rectangle((org.w3c.dom.Element)node));  

    else if(name.equals("circle"))                        // Check for a circle
      model.add(new Element.Circle((org.w3c.dom.Element)node));     

    else if(name.equals("curve"))                         // Check for a curve
      model.add(new Element.Curve((org.w3c.dom.Element)node));      

    else if(name.equals("text"))                          // Check for a text
      model.add(new Element.Text((org.w3c.dom.Element)node));   

    node = node.getNextSibling();                          // Next child node
   }
   return model;
}

This works in a straightforward fashion. We get the first child node of the root node by calling getDocumentElement() for the document object to obtain a reference to the org.w3c.dom.Element object that encapsulates the root node, then call its getFirstChild() method to obtain a reference of type Node to its first child. All the children of the root element should be Element nodes, and the assertion verifies this.
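The traversal with getFirstChild() and getNextSibling() can be tried independently of Sketcher. The following minimal sketch builds a small document in memory and collects the tag names of the children of the root element. The class ChildWalkDemo and its methods are hypothetical names. Here the name Element refers to org.w3c.dom.Element, since there is no sketch Element class to clash with it.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class, not part of Sketcher
class ChildWalkDemo {
  // Returns the tag names of the element children of root, in document order
  static List<String> childTagNames(Element root) {
    List<String> names = new ArrayList<String>();
    for(Node node = root.getFirstChild() ; node != null ; node = node.getNextSibling())
      if(node instanceof Element)                   // Skip anything that is not an element
        names.add(((Element)node).getTagName());
    return names;
  }

  // Builds <sketch> with a <line> child and a <circle> child
  static Document sampleSketch() throws ParserConfigurationException {
    Document doc = DocumentBuilderFactory.newInstance()
                                         .newDocumentBuilder()
                                         .newDocument();
    Element root = doc.createElement("sketch");
    doc.appendChild(root);
    root.appendChild(doc.createElement("line"));
    root.appendChild(doc.createElement("circle"));
    return doc;
  }

  public static void main(String[] args) throws ParserConfigurationException {
    System.out.println(childTagNames(sampleSketch().getDocumentElement()));
  }
}
```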

We determine what kind of element each child node is by checking its name. We call a sketch Element constructor corresponding to the node name to create the sketch element to be added to the model. Each of these constructors creates an object from the org.w3c.dom.Element object reference that is passed as the argument. We just have to implement these constructors in the subclasses of Element and we are done.

Creating Sketch Elements from XML Elements

Every element has to have the color field in the base class set to a color determined from a <color> element in the document. We can therefore usefully add a base class method to take care of this. Add the following to the Element class definition:

  protected void setElementColor(org.w3c.dom.Element colorElement) {
      color = new Color(Integer.parseInt(colorElement.getAttribute("R")),
                        Integer.parseInt(colorElement.getAttribute("G")),
                        Integer.parseInt(colorElement.getAttribute("B")));
  }

The method expects to receive a reference to an org.w3c.dom.Element object as an argument that contains the RGB values for the color. We extract the value of each of the attributes in the colorElement object by calling its getAttribute() method with the attribute name as the argument. We pass each of the values obtained to the Color constructor and we store the reference to this object in color. Because the attribute values are strings, we have to convert them to numerical values using the static parseInt() method that is defined in the Integer class.
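To see the attribute conversion on its own, here is a small self-contained sketch. The class ColorAttributeSketch and its helper parseRoot() are hypothetical names used only for illustration; the conversion itself mirrors setElementColor(). Note that getAttribute() returns an empty string for an attribute that is absent, in which case parseInt() throws a NumberFormatException.

```java
import java.awt.Color;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class ColorAttributeSketch {
  // Hypothetical helper: parse a string and return the root element
  static org.w3c.dom.Element parseRoot(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse XML", e);
    }
  }

  // Same conversion as setElementColor(), but returning the Color object
  public static Color colorFrom(org.w3c.dom.Element colorElement) {
    return new Color(Integer.parseInt(colorElement.getAttribute("R")),
                     Integer.parseInt(colorElement.getAttribute("G")),
                     Integer.parseInt(colorElement.getAttribute("B")));
  }

  public static void main(String[] args) {
    Color c = colorFrom(parseRoot("<color B=\"255\" G=\"0\" R=\"0\"/>"));
    System.out.println(c.getRed() + "," + c.getGreen() + "," + c.getBlue());  // 0,0,255
  }
}
```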

The same applies to the position field in the Element class, so we will define a method in the Element class to initialize it from an org.w3c.dom.Element object:

  protected void setElementPosition(org.w3c.dom.Element posElement) {
      position = new Point(); 
      position.setLocation(Double.parseDouble(posElement.getAttribute("x")),
                           Double.parseDouble(posElement.getAttribute("y")));
  }

This uses essentially the same mechanism as the previous method. Here the attribute strings represent double values so we use the static parseDouble() method from the Double class to convert them to the numeric equivalent.
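The choice of parseDouble() matters because the coordinates are written with a decimal point, as in x="153.0", and Integer.parseInt() rejects such strings. The following sketch illustrates this; the class PositionParseSketch and its methods are hypothetical names, not part of Sketcher.

```java
import java.awt.Point;

public class PositionParseSketch {
  // Same conversion as setElementPosition(), taking the attribute strings directly
  public static Point toPoint(String x, String y) {
    Point p = new Point();
    // setLocation(double, double) rounds the values to integers
    p.setLocation(Double.parseDouble(x), Double.parseDouble(y));
    return p;
  }

  // Reports whether Integer.parseInt() accepts the string
  public static boolean parsesAsInt(String s) {
    try {
      Integer.parseInt(s);
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(toPoint("153.0", "109.0"));   // java.awt.Point[x=153,y=109]
    System.out.println(parsesAsInt("153.0"));        // false
  }
}
```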

Every sketcher element has a color, a position, and an angle that are stored in base class fields so we can create a base class constructor to initialize these from the document node for the element:

  protected Element(org.w3c.dom.Element xmlElement) {
    // Get the <color> element
    org.w3c.dom.NodeList list = xmlElement.getElementsByTagName("color");
    setElementColor((org.w3c.dom.Element)list.item(0));        // Set the color

    list = xmlElement.getElementsByTagName("position");        // Get <position>
    setElementPosition((org.w3c.dom.Element)list.item(0));     // Set the position

    angle = Double.parseDouble(xmlElement.getAttribute("angle")); // Set the angle
  }

We have declared this constructor as protected to prevent the possibility of it being called externally.

Every one of our new constructors in the inner classes to Element will call this constructor first. An important point to remember is that if a constructor for a derived class object does not call a base class constructor as the first statement in the body of the constructor, the compiler will insert a call to the no-arg constructor for the base class. This means that a base class always has to have a no-arg constructor if the derived class constructors do not call a base class constructor with arguments.
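The constructor chaining rule is easy to verify with a few lines of code. This is an illustrative sketch with hypothetical class names (Base, Implicit, Explicit), unrelated to the Sketcher classes. It records the order in which the constructors execute.

```java
public class ConstructorChainSketch {
  static final StringBuilder log = new StringBuilder();

  static class Base {
    Base()         { log.append("Base() "); }        // No-arg constructor
    Base(String s) { log.append("Base(String) "); }
  }

  static class Implicit extends Base {
    // No explicit call, so the compiler inserts super() here
    Implicit(String s) { log.append("Implicit "); }
  }

  static class Explicit extends Base {
    // Explicit call to a base class constructor with an argument
    Explicit(String s) { super(s); log.append("Explicit "); }
  }

  // Creates one object of each derived class and returns the call order
  public static String trace() {
    log.setLength(0);
    new Implicit("a");
    new Explicit("b");
    return log.toString().trim();
  }

  public static void main(String[] args) {
    System.out.println(trace());   // Base() Implicit Base(String) Explicit
  }
}
```

If you remove the no-arg constructor from Base, the Implicit class no longer compiles, which is exactly the situation the rule describes.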

We first extract the child element for the current element with the name "color" by calling the getElementsByTagName() method for xmlElement. This method, declared in the org.w3c.dom.Element interface, returns a NodeList object containing all the descendant elements with the given name, not just the immediate children, in the order in which they appear in the document. In our documents a <color> element only ever appears as a direct child of a sketch element, so the first item in the list is the one we want. If you pass the string "*" as the argument to this method, it will return all descendant org.w3c.dom.Element objects in the node list. There's another method, getElementsByTagNameNS(), also declared in the org.w3c.dom.Element interface, that does the same for documents using namespaces. The first argument in this case is the namespace URI and the second argument is the local name of the element. Either or both arguments can be "*", in which case all namespaces and/or names will be matched.
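It is worth seeing what getElementsByTagName() actually collects, because it searches the whole subtree below the element and not only its immediate children. The sketch below uses hypothetical names (DescendantSearchSketch, parseRoot()) and a cut-down <text> element written without white space between the tags.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DescendantSearchSketch {
  static final String XML =
      "<text angle=\"0.0\"><color R=\"0\" G=\"255\" B=\"0\"/>"
    + "<string><bounds width=\"271\" height=\"21\"/>Hi</string></text>";

  // Hypothetical helper: parse a string and return the root element
  static org.w3c.dom.Element parseRoot(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse XML", e);
    }
  }

  // Tag names of everything matched by "*", in the order returned
  public static List<String> allDescendantNames(org.w3c.dom.Element root) {
    NodeList list = root.getElementsByTagName("*");
    List<String> names = new ArrayList<>();
    for (int i = 0; i < list.getLength(); i++) {
      names.add(((org.w3c.dom.Element) list.item(i)).getTagName());
    }
    return names;
  }

  public static void main(String[] args) {
    org.w3c.dom.Element root = parseRoot(XML);
    System.out.println(root.getChildNodes().getLength());                // 2
    System.out.println(root.getElementsByTagName("bounds").getLength()); // 1
    System.out.println(allDescendantNames(root));   // [color, string, bounds]
  }
}
```

The <bounds> element is found even though it is a child of <string> rather than of <text>, and the element on which the method is called is not itself included in the result.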

We pass the reference to the element with the name "color" to the setElementColor() method that we added to the base class. This sets the value of the color field in the base class.

Next we initialize the position field in the Element class by calling the setElementPosition() method. The process is much the same as for the color field. Lastly we set the angle field by converting the string that is the value for the angle attribute for the current node to type double.

Now we are ready to add the new constructors to the subclasses to create sketch elements from XML elements.

Creating a Line Element

We can construct an Element.Line object by first calling the base class constructor we have just defined to set the color, position, and angle fields, and then setting the line field in the derived class. Here's the code for the constructor:

  // Content is <color>, <position>, <endpoint> elements. Attribute is angle.
  public Line(org.w3c.dom.Element xmlElement) {
    super(xmlElement);

    org.w3c.dom.NodeList list = xmlElement.getElementsByTagName("endpoint");
    org.w3c.dom.Element endpoint = (org.w3c.dom.Element)list.item(0);
    line = new Line2D.Double(origin.x, origin.y,
                Double.parseDouble(endpoint.getAttribute("x"))-position.getX(), 
                Double.parseDouble(endpoint.getAttribute("y"))-position.getY());
  }

To save having to refer back to the DTD, the comment preceding the constructor outlines the XML corresponding to the element. We first call the base class constructor and then extract the child element that is the <endpoint> element. You will doubtless recall that all our sketch elements are defined at the origin. This makes moving an element very easy and allows all elements to be moved in the same way – by modifying the position field. We therefore create the Line2D.Double object as a line starting at the origin. The coordinates of its end point are the values stored in the <endpoint> child element minus the corresponding coordinates of position that were set in the base class constructor.
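The arithmetic is simple enough to check by hand. As a worked sketch, with hypothetical names (RelativeLineSketch, relativeLine()) and plain double arguments in place of the XML elements, a line with its position at (191, 158) and its end point at (162, 194) becomes a line from the origin to (-29, 36):

```java
import java.awt.geom.Line2D;

public class RelativeLineSketch {
  // Builds a line starting at the origin from an absolute position and end point
  public static Line2D.Double relativeLine(double posX, double posY,
                                           double endX, double endY) {
    return new Line2D.Double(0.0, 0.0, endX - posX, endY - posY);
  }

  public static void main(String[] args) {
    Line2D.Double line = relativeLine(191.0, 158.0, 162.0, 194.0);
    System.out.println(line.getX2() + ", " + line.getY2());   // -29.0, 36.0
  }
}
```

Adding the position back to both ends of the line recovers the absolute coordinates that were written to the file.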

Creating a Rectangle Element

This constructor will be almost identical to the previous constructor for a line:

    // Rectangle has angle attribute. Content is <color>,<position>,<bottomright>
    public Rectangle(org.w3c.dom.Element xmlElement) {
      super(xmlElement);

      org.w3c.dom.NodeList list = xmlElement.getElementsByTagName("bottomright");
      org.w3c.dom.Element bottomright = (org.w3c.dom.Element)list.item(0);
      rectangle = new Rectangle2D.Double(origin.x, origin.y, 
             Double.parseDouble(bottomright.getAttribute("x"))-position.getX(), 
             Double.parseDouble(bottomright.getAttribute("y"))-position.getY());
    }

Spot the differences! This code is so similar to that of the Line constructor that I don't think it requires further explanation.

Creating a Circle Element

Here's the code for the Circle constructor:

    // Circle has radius, angle attributes. Content is <color>, <position>
    public Circle(org.w3c.dom.Element xmlElement) {
      super(xmlElement);
      
      double radius = Double.parseDouble(xmlElement.getAttribute("radius")); 
      circle = new Ellipse2D.Double(origin.x, origin.y,     // Position - top-left
                                    2.*radius, 2.*radius ); // Width & height
    }

Compared to the previous two constructors the only change is the last bit where we use the radius attribute value to define the Ellipse2D.Double object representing the circle.

Creating a Curve Element

Before you nod off, this one's a little more challenging as there may be an arbitrary number of child nodes:

    // Curve has angle attribute. Content is <color>, <position>, <point>+
    public Curve(org.w3c.dom.Element xmlElement) {
      super(xmlElement);

      curve = new GeneralPath();
      curve.moveTo(origin.x, origin.y);
      org.w3c.dom.NodeList nodes = xmlElement.getElementsByTagName("point");
      for(int i = 0 ; i<nodes.getLength() ; i++)
        curve.lineTo(
                     (float)(Double.parseDouble(
           ((org.w3c.dom.Element)nodes.item(i)).getAttribute("x")) - position.x),
                     (float)(Double.parseDouble(
           ((org.w3c.dom.Element)nodes.item(i)).getAttribute("y")) - position.y));
    }

Having said that, the first part calls the base class constructor the same as ever. It gets more interesting when we get the list of org.w3c.dom.Element nodes with the name "point" by calling getElementsByTagName() for the xmlElement object. These are the nodes holding the coordinates of the points that define the curve. It is important to us here that the method returns the nodes in the NodeList object in document order, that is, in the sequence in which they appear in the XML document. If it didn't, we would have no way to reconstruct the curve. With the data encapsulated in the nodes from the NodeList object that is returned, we can reconstruct the GeneralPath object that describes the curve. The first point on the curve is always the origin, so we start the path by calling its moveTo() method to move to the origin.

Each of the <point> nodes contains a point on the path in absolute coordinates. Since we want the curve to be defined relative to the origin, we subtract the coordinates of the start point, position, from the corresponding coordinates stored in each node. We use the resulting coordinates to define the end of each line segment by passing them to the lineTo() method for the path object.
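The path reconstruction can be sketched separately from the DOM code. In the following illustration the class CurvePathSketch and its methods are hypothetical, and the points are supplied as an array of absolute coordinates rather than read from <point> nodes.

```java
import java.awt.geom.GeneralPath;
import java.awt.geom.PathIterator;

public class CurvePathSketch {
  // Builds a path relative to the origin from absolute {x, y} points
  public static GeneralPath relativePath(double posX, double posY, double[][] points) {
    GeneralPath path = new GeneralPath();
    path.moveTo(0.0f, 0.0f);                       // The curve always starts at the origin
    for (double[] p : points) {
      path.lineTo((float) (p[0] - posX), (float) (p[1] - posY));
    }
    return path;
  }

  // Counts the segments in the path, including the initial moveTo
  public static int segmentCount(GeneralPath path) {
    int count = 0;
    for (PathIterator it = path.getPathIterator(null); !it.isDone(); it.next()) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    double[][] points = { {132.0, 223.0}, {133.0, 222.0} };
    GeneralPath path = relativePath(132.0, 224.0, points);
    System.out.println(segmentCount(path));                 // 3
    System.out.println(path.getCurrentPoint().getX());      // 1.0
    System.out.println(path.getCurrentPoint().getY());      // -2.0
  }
}
```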

Creating a Text Element

Recreating an Element.Text object from a <text> element is the messiest of all. It certainly involves the most code. It's not difficult though. There are just a lot of bits and pieces to take care of.

    // Text has angle attribute. Content is <color>, <position>, <font>, <string>
    // <font> has attributes fontname, fontstyle, pointsize
    // fontstyle is "plain", "bold", "italic", or "bold-italic"
    // <string> content is text plus <bounds>
    public Text(org.w3c.dom.Element xmlElement) {
      super(xmlElement);

      // Get the font details
      org.w3c.dom.NodeList list = xmlElement.getElementsByTagName("font");
      org.w3c.dom.Element fontElement = (org.w3c.dom.Element)list.item(0);
      String styleStr = fontElement.getAttribute("fontstyle");
      int style = 0;
      if(styleStr.equals("plain"))
        style = Font.PLAIN;
      else if(styleStr.equals("bold"))
        style = Font.BOLD;
      else if(styleStr.equals("italic"))
        style = Font.ITALIC;
      else if(styleStr.equals("bold-italic"))
        style = Font.BOLD + Font.ITALIC;
      else
        assert false;
 
      font = new Font(fontElement.getAttribute("fontname"), style,
                      Integer.parseInt(fontElement.getAttribute("pointsize")));    

      // Get string bounds
      list = xmlElement.getElementsByTagName("bounds");
      org.w3c.dom.Element boundsElement = (org.w3c.dom.Element)list.item(0);
      
      this.bounds = new java.awt.Rectangle(origin.x, origin.y,
                          Integer.parseInt(boundsElement.getAttribute("width")),
                          Integer.parseInt(boundsElement.getAttribute("height")));
      
      // Get the string
      list = xmlElement.getElementsByTagName("string");
      org.w3c.dom.Element string = (org.w3c.dom.Element)list.item(0);
      list = string.getChildNodes();

      StringBuffer textStr = new StringBuffer();
      for(int i = 0 ; i<list.getLength() ; i++)
        if(list.item(i).getNodeType()==org.w3c.dom.Node.TEXT_NODE)
          textStr.append(((org.w3c.dom.Text)list.item(i)).getData());

      text = textStr.toString().trim();
    }

The attributes of the <font> element define the font. Only the fontstyle attribute needs some analysis since we have to represent the style by an integer constant. This means testing for the possible values for the string and setting the appropriate integer value in style. We could have stored the style as a numeric value, but that would have been meaningless to a human reader. Making the attribute value a descriptor string makes it completely clear. If none of the expected style strings matches, the final else clause asserts. Since the XML was created by Sketcher, the only reasons why this would assert are that there is an error in the code somewhere, that the XML was written by a different version of Sketcher, or that the XML was generated by hand. Errors in the XML due to inconsistencies with the DTD would be caught by the parser and signaled by one or other of our ErrorHandler methods.
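The style mapping can be isolated in a small method. This is a sketch of one alternative and not the Sketcher code: the class FontStyleSketch and the method styleFor() are hypothetical, and it throws an exception for an unrecognized string instead of asserting, because assertions have no effect unless they are enabled at run time with the -ea option.

```java
import java.awt.Font;

public class FontStyleSketch {
  // Maps a fontstyle attribute value to the corresponding Font style constant
  public static int styleFor(String styleStr) {
    if (styleStr.equals("plain"))
      return Font.PLAIN;
    else if (styleStr.equals("bold"))
      return Font.BOLD;
    else if (styleStr.equals("italic"))
      return Font.ITALIC;
    else if (styleStr.equals("bold-italic"))
      return Font.BOLD + Font.ITALIC;
    else  // Assumption: fail loudly rather than rely on assertions being enabled
      throw new IllegalArgumentException("Unknown fontstyle: " + styleStr);
  }

  public static void main(String[] args) {
    System.out.println(styleFor("plain"));         // 0
    System.out.println(styleFor("bold-italic"));   // 3
  }
}
```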

Text content for an element can appear distributed among several child Text nodes. We accommodate this possibility by concatenating the data from all the child Text nodes that we find, and then trimming any leading and trailing white space from the string before storing it in text.
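Here is a minimal sketch of the concatenation applied to a <string> element like the ones Sketcher writes. The names TextContentSketch, parseRoot(), and textOf() are hypothetical. The white space before the <bounds> child is one Text node and the text that follows it is at least one more, so the loop has several nodes to join.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TextContentSketch {
  // Hypothetical helper: parse a string and return the root element
  static org.w3c.dom.Element parseRoot(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse XML", e);
    }
  }

  // Joins the data from all child Text nodes and trims the result
  public static String textOf(org.w3c.dom.Element element) {
    NodeList children = element.getChildNodes();
    StringBuilder textStr = new StringBuilder();
    for (int i = 0; i < children.getLength(); i++) {
      if (children.item(i).getNodeType() == Node.TEXT_NODE) {
        textStr.append(((org.w3c.dom.Text) children.item(i)).getData());
      }
    }
    return textStr.toString().trim();
  }

  public static void main(String[] args) {
    String xml = "<string>\n <bounds width=\"271\" height=\"21\"/>\n"
               + " The Complete Set! &quot;Try it out&quot;\n</string>";
    System.out.println(textOf(parseRoot(xml)));   // The Complete Set! "Try it out"
  }
}
```

The parser has already replaced each &quot; entity reference by a quote character by the time we read the data.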

That's all the code we need. You now need to make sure all the imports are in place for the source files we have modified. In addition to what was already there, the Element.java file needs:

import org.w3c.dom.Document;
import org.w3c.dom.Attr;

The SketchFrame.java file needs the following import statements added to the original set:

import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;

You should also have added the following imports to the SketchModel.java file:

import javax.swing.JOptionPane;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.DOMException;

If everything compiles, you are ready to try exporting and importing sketches.

Try It Out – Sketches in XML

You can try various combinations of elements to see how they look in XML. Be sure to copy the sketcher.dtd file to the directory in which you are storing exported sketches. If you don't, you won't be able to import them since the DTD will not be found. Don't forget you can look at the XML using any text editor and in most browsers. I created the sketch below.

When I exported this I got an XML file with the contents:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sketch SYSTEM "sketcher.dtd">
<sketch>
 <circle radius="15.0" angle="0.0">
  <color B="255" G="0" R="0"/>
  <position x="153.0" y="109.0"/>
 </circle>
 <circle radius="18.027756377319946" angle="0.0">
  <color B="255" G="0" R="0"/>
  <position x="217.0" y="123.0"/>
 </circle>
 <circle radius="134.61797799699713" angle="0.0">
  <color B="255" G="0" R="0"/>
  <position x="78.0" y="48.0"/>
 </circle>
 <line angle="0.0">
  <color B="0" G="0" R="255"/>
  <position x="191.0" y="158.0"/>
  <endpoint x="162.0" y="194.0"/>
 </line>
 <line angle="0.0">
  <color B="0" G="0" R="255"/>
  <position x="162.0" y="193.0"/>
  <endpoint x="197.0" y="198.0"/>
 </line>
 <line angle="0.0">
  <color B="0" G="255" R="0"/>
  <position x="185.0" y="94.0"/>
  <endpoint x="137.0" y="104.0"/>
 </line>
 <line angle="0.0">
  <color B="0" G="255" R="0"/>
  <position x="246.0" y="110.0"/>
  <endpoint x="285.0" y="166.0"/>
 </line>
 <curve angle="0.0">
  <color B="0" G="0" R="255"/>
  <position x="132.0" y="224.0"/>
  <point x="132.0" y="223.0"/>
  <point x="133.0" y="222.0"/>
  <point x="134.0" y="222.0"/>
  <point x="137.0" y="222.0"/>
  <!-- points cut here for the sake of brevity -->
  <point x="211.0" y="245.0"/>
  <point x="212.0" y="245.0"/>
  <point x="213.0" y="245.0"/>
  <point x="214.0" y="245.0"/>
 </curve>
 <text angle="0.3183694064160789">
  <color B="0" G="255" R="0"/>
  <position x="42.0" y="283.0"/>
  <font fontname="Comic Sans MS" fontstyle="bold-italic" pointsize="18"/>
  <string>
   <bounds width="271" height="21"/>
   The Complete Set! &quot;Try it out&quot;
  </string>
 </text>
</sketch>

This file is also available as sketchexample.xml in the code download for this book from the Wrox Press web site, http://www.wrox.com. You could try importing it into Sketcher and see if you get the same sketch.