In this chapter we briefly visit the intersection between JavaScript and the eXtensible Markup Language (XML). XML has quickly risen to be a favored method of structured data interchange on the Web. Today many sites exchange XML data feeds or store site content in XML files for later transformation into the appropriate presentation medium for site visitors from XHTML to WML (Wireless Markup Language) and beyond. So far, client-side use of XML has been relatively rare except in the form of specialized languages built with XML such as XHTML, SVG, RSS, and others. Using JavaScript to manipulate XML client-side is rarer still, at least on public Web sites. Much of this chapter presents examples of XML and JavaScript that are often proprietary, probably bound to change, and almost always buggy. In other words, proceed with extreme caution.
Given the lack of knowledge about XML among many developers, we start with a very brief overview of XML and its use. If you are already well versed in XML you can skip to the section “The DOM and XML” and dive into using JavaScript with XML; otherwise, read on and find out what the hype is really about.
Writing simple XML documents is fairly easy. For example, suppose that you have a compelling need to define a document with markup elements to represent a fast-food restaurant’s combination meals, which contain a burger, drink, and fries. You might do this because this information will be sent to your suppliers, you might expect to receive electronic orders from customers via e-mail this way, or it might just be a convenient way to store your restaurant’s data. Regardless of the reason, the question is how you can do this in XML. You would simply create a file such as burger.xml that contains the following markup:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<combomeal>
  <burger>
    <name>Tasty Burger</name>
    <bun bread="white">
      <meat />
      <cheese />
      <meat />
    </bun>
  </burger>
  <fries size="large" />
  <drink size="large"> Cola </drink>
</combomeal>
A rendering of this example under Internet Explorer is shown in Figure 20-1.
Notice that the browser shows a structural representation of the markup, not a screen representation. You’ll see how to make this file actually look like something later in the chapter. First, take a look at the document syntax. In many ways, this example “Combo Meal Markup Language” (or CMML, if you like) looks similar to HTML, but how do you know to name the element <combomeal> instead of <mealdeal> or <lunchspecial>? You don’t need to know, because the decision is completely up to you. Simply choose any element and attribute names that meaningfully represent the domain that you want to model. Does this mean that XML has no rules? It has rules, but they are few, simple, and relate only to syntax:
The document should start with the appropriate XML declaration (it is optional in XML 1.0 but strongly recommended), like so:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
or more simply just:
<?xml version="1.0" ?>
A root element must enclose the entire document. For example, in the previous example notice how the <combomeal> element encloses all other elements. In fact, not only must a root element enclose all other elements, the internal elements should close properly.
All elements must be closed. The following
<burger>Tasty
is not allowed under XML, but
<burger>Tasty</burger>
would be allowed. Even when elements do not contain content they must be closed properly, as discussed in the next rule, for the document to be well formed.
All elements with empty content must be self-identifying, by ending in /> just like XHTML. An empty element is one such as the HTML <br>, <hr>, or <img src="test.gif"> tags. In XML and XHTML, these would be represented, respectively, as <br />, <hr />, and <img src="test.gif" />.
Just like well-written HTML and XHTML, all elements must be properly nested.
For example,
<outer><inner>ground zero</inner></outer>
is correct, whereas this isn’t:
<outer><inner>ground zero</outer></inner>
All attribute values must be quoted. In traditional HTML, quoting is good authoring practice, but it is required only for values that contain characters other than letters (A–Z, a–z), numbers (0–9), hyphens (-), or periods (.). Under XHTML, quoting is required as it is in XML. For example,
<blastoff count="10" ></blastoff>
is correct, whereas this isn’t:
<blastoff count=10></blastoff>
All elements must be cased consistently. If you start a new element such as <BURGER>, you must close it as </BURGER>, not </burger>. Later in the document, if the element is in lowercase, you actually are referring to a new element known as <burger>. Attribute names also are case-sensitive.
An XML file may not contain certain characters that have reserved meanings as literal text. These include characters such as &, which indicates the beginning of a character entity such as &amp;, or <, which indicates the start of an element name such as <sunny>.
These characters must be coded as &amp; and &lt;, respectively, or can occur in a section marked off as character data. In fact, under a basic stand-alone XML document, this rule is quite restrictive as only the entities &amp;, &lt;, &gt;, &apos;, and &quot; would be allowed.
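The reserved-character rule can be sketched in a few lines of JavaScript. This is a minimal illustration, not a library routine: the helper name escapeXml and the lookup table are assumptions made for the example, and the function simply maps each of the five reserved characters to its predefined entity.

```javascript
// Map each reserved character to its predefined XML entity.
const XML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  "'": "&apos;",
  '"': "&quot;"
};

// escapeXml is a hypothetical helper name, not part of any standard API.
// A single pass over the text avoids escaping an ampersand twice.
function escapeXml(text) {
  return String(text).replace(/[&<>'"]/g, (ch) => XML_ENTITIES[ch]);
}

console.log(escapeXml("Burger & Fries <large>"));
```

Text passed through a function like this can be placed inside an element or a quoted attribute value without being mistaken for markup.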
A document constructed according to the previous simple rules is known as a well-formed document. Take a look in Figure 20-2 at what happens to a document that doesn’t follow the well-formed rules presented here.
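To make a few of these rules concrete, here is a rough JavaScript sketch that checks only element closing, nesting, case consistency, and the single-root rule. The function name checkNesting is hypothetical, and the approach (a regular expression plus a stack) is a deliberate simplification: it ignores attribute quoting, CDATA sections, and markup that appears inside comments, so it is no substitute for a real XML parser.

```javascript
// checkNesting is a hypothetical helper for illustration only.
// It returns "ok" or a short description of the first problem found.
function checkNesting(xml) {
  // Matches start, end, and empty-element tags; declarations such as
  // <?xml ... ?> and <!-- ... --> do not start with a name and are skipped.
  const tagPattern = /<(\/?)([A-Za-z_][\w:.-]*)([^>]*)>/g;
  const stack = [];   // names of elements that are still open
  let rootCount = 0;  // number of top-level elements seen
  let match;

  while ((match = tagPattern.exec(xml)) !== null) {
    const isClosing = match[1] === "/";
    const name = match[2];
    const isEmpty = match[3].trim().endsWith("/");

    if (isClosing) {
      const open = stack.pop();
      if (open === undefined) {
        return "Unexpected closing tag </" + name + ">";
      }
      // Names are compared exactly, so <BURGER> and </burger> do not match.
      if (open !== name) {
        return "Expected </" + open + "> but found </" + name + ">";
      }
    } else {
      if (stack.length === 0) rootCount++;
      if (!isEmpty) stack.push(name);
    }
  }

  if (stack.length > 0) {
    return "Unclosed element <" + stack[stack.length - 1] + ">";
  }
  if (rootCount !== 1) return "Expected exactly one root element";
  return "ok";
}

console.log(checkNesting("<outer><inner>ground zero</inner></outer>"));
console.log(checkNesting("<outer><inner>ground zero</outer></inner>"));
```

The stack mirrors the nesting rule directly: every closing tag must match the most recently opened element that is still waiting to be closed.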
Markup purists might find the notion of well-formed-ness somewhat troubling. Traditional SGML has no notion of well-formed documents; instead it uses the notion of valid documents—documents that adhere to a formally defined document type definition (DTD). For anything beyond casual applications, defining a DTD and validating documents against that definition are real benefits. XML supports both well-formed and valid documents. The well-formed model that just enforces the basic syntax should encourage those not schooled in the intricacies of language design and syntax to begin authoring XML documents, thus making XML as accessible as traditional HTML has been. However, the valid model is available for applications in which a document’s logical structure needs to be verified. This can be very important when we want to bring meaning to a document.
A document that conforms to its specified grammar is said to be valid. Unlike many HTML document authors, SGML and XML document authors normally concern themselves with producing valid documents. With the rise of XML, Web developers can look forward to mastering a new skill: writing language grammars. In the case of XML, we can write our language grammar either in the form of a document type definition (DTD) or as a schema.
For simplicity, we define a DTD for the previously used combo meal example. The definition for the example language can be inserted directly into the document, although this definition can be kept outside the file as well. The burger2.xml file shown here includes both the DTD and an occurrence of a document that conforms to the language in the same document:
<?xml version="1.0"?>
<!DOCTYPE combomeal [
  <!ENTITY cola "Pepsi">
  <!ELEMENT combomeal (burger+, fries+, drink+)>
  <!ELEMENT burger (name, bun)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT bun (meat+, cheese+, meat+)>
  <!ATTLIST bun bread (white | wheat) #REQUIRED >
  <!ELEMENT meat EMPTY>
  <!ELEMENT cheese EMPTY>
  <!ELEMENT fries EMPTY>
  <!ATTLIST fries size (small | medium | large) #REQUIRED >
  <!ELEMENT drink (#PCDATA)>
  <!ATTLIST drink size (small | medium | large) #REQUIRED >
]>
<!-- the document instance -->
<combomeal>
  <burger>
    <name>Tasty Burger</name>
    <bun bread="white">
      <meat />
      <cheese />
      <meat />
    </bun>
  </burger>
  <fries size="large" />
  <drink size="large"> &cola; </drink>
</combomeal>
We could easily have just written the document itself and put the DTD in an external file, referencing it using a statement such as
<!DOCTYPE combomeal SYSTEM "combomeal.dtd">
at the top of the document and the various element, attribute, and entity definitions in the external file combomeal.dtd. Regardless of how it is defined and included, the meaning of the defined language is relatively straightforward. A document is enclosed by the <combomeal> tag, which in turn contains one or more <burger>, <fries>, and <drink> tags. Each <burger> tag contains a <name> and a <bun>; the <bun> in turn contains <meat /> and <cheese /> tags. Attributes are defined to indicate the bread type of the bun as well as the size of the fries and drink in the meal. We even define our own custom entity &cola; to make it easy to specify and change the type of cola, in this case Pepsi, used in the document.
One interesting aspect of using a DTD with an XML file is that the correctness of the document can be checked. For example, adding non-defined elements or messing up the nesting orders of elements should cause a validating XML parser to reject the document, as shown in Figure 20-3.
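The idea behind validation can be sketched in JavaScript without a real validating parser. The example below is only an illustration under stated assumptions: the element tree is built from plain objects with a hypothetical el() helper, and each content model from the combo meal DTD is rewritten by hand as a regular expression over the names of an element's children. Attribute declarations are not checked at all.

```javascript
// Content models from the combo meal DTD, rewritten as regular expressions
// over a string of child element names, each followed by one space.
// This encoding is an illustrative assumption, not how real parsers work.
const CONTENT_MODELS = {
  combomeal: /^(burger )+(fries )+(drink )+$/,
  burger: /^name bun $/,
  bun: /^(meat )+(cheese )+(meat )+$/,
  name: /^$/,    // #PCDATA: no child elements
  drink: /^$/,   // #PCDATA: no child elements
  meat: /^$/,    // EMPTY
  cheese: /^$/,  // EMPTY
  fries: /^$/    // EMPTY
};

// el is a hypothetical helper for building a plain-object element tree.
function el(name, ...children) {
  return { name, children };
}

// Returns a list of error messages; an empty list means the tree conforms.
function validate(node, errors = []) {
  const model = CONTENT_MODELS[node.name];
  if (!model) {
    errors.push("Undeclared element <" + node.name + ">");
    return errors;
  }
  const childNames = node.children.map((c) => c.name + " ").join("");
  if (!model.test(childNames)) {
    errors.push("Invalid content in <" + node.name + ">: " + childNames.trim());
  }
  node.children.forEach((child) => validate(child, errors));
  return errors;
}

const meal = el("combomeal",
  el("burger", el("name"), el("bun", el("meat"), el("cheese"), el("meat"))),
  el("fries"),
  el("drink"));

// The fries come before the burger here, which the content model forbids.
const badMeal = el("combomeal", el("fries"), el("drink"));

console.log(validate(meal));
console.log(validate(badMeal));
```

A validating parser does conceptually the same thing: it compares the children of every element against the declared content model and reports the document as invalid when they do not match.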
Note: At the time of this writing, most browser-based XML parsers, particularly Internet Explorer’s, don’t necessarily validate the document, but just check to make sure the document is well formed. The Internet Explorer browser snapshot was taken using an extension that validates XML documents.
Writing a grammar in either a DTD or schema form might seem like an awful lot of trouble, but without one, the value of XML is limited. If you can guarantee conformance to the specification, you can start to allow automated parsing and exchange of documents. Writing a grammar is going to be a new experience for most Web developers, and not everybody will want to write one. Fortunately, although not apparent from the DTD rules in this brief example, XML significantly reduces the complexity of full SGML. However, regardless of how easy or hard it is to write a language definition, readers might wonder how to present an XML document once it is written.
Notice that inherently, XML documents have no predefined presentation; thus we must define one. While this may seem like a hassle, it actually is a blessing as it forces the separation of content structure from presentation. Already, many Web developers have embraced the idea of storing Web content in XML format and then transforming it into an appropriate output format such as HTML or XHTML and CSS, using either XSL Transformations (XSLT), which is part of the eXtensible Stylesheet Language (XSL) family of specifications, or some form of server-side programming. It is also possible to render XML natively in most browsers by binding CSS directly to user-defined elements.
Note: In many cases, developers simply refer to XSL rather than XSLT when discussing the features provided by the latter.
With XSLT, you can easily transform and then format an XML document. Elements and attributes can be matched using XSL, and markup in another language such as HTML or XHTML can then be output. Let’s demonstrate this idea using the client-side XSL processing found in most modern browsers. Consider the following simple well-formed XML document called demo.xml:
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<example>
  <demo>Look </demo>
  <demo>formatting </demo>
  <demo> XML </demo>
  <demo>as HTML</demo>
</example>
Notice that the second line applies an XSL file called test.xsl to the document. That file will create a simple HTML document and convert each occurrence of the <demo> tag to an <h1> tag. The XSL template called test.xsl is shown here:
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <head>
    <title>XSL Test</title>
  </head>
  <body>
    <xsl:for-each select="example/demo">
      <h1><xsl:value-of select="."/></h1>
    </xsl:for-each>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
Note: In order to make the examples in this section work under Internet Explorer 5 or 5.5, use the statement <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl"> to define the XSL version in place of the second line of each XSL document.
Given the previous example, you could load the main XML document through an XML- and XSL-aware browser such as Internet Explorer. You would then end up with the following markup once the XSL transformation was applied:
<html>
<head>
  <title>XSL Test</title>
</head>
<body>
  <h1>Look</h1>
  <h1>formatting</h1>
  <h1>XML</h1>
  <h1>as HTML</h1>
</body>
</html>
The example transformation under Internet Explorer is shown in Figure 20-4.
Whereas the preceding example is rather contrived, it is possible to create a much more sophisticated example. For example, given the following XML document representing an employee directory, you might wish to convert it into a traditional HTML table-based layout:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<directory>
  <employee>
    <name>Fred Brown</name>
    <title>Widget Washer</title>
    <phone>(543) 555-1212</phone>
    <email>fbrown@democompany.com</email>
  </employee>
  <employee>
    <name>Cory Richards</name>
    <title>Toxic Waste Manager</title>
    <phone>(543) 555-1213</phone>
    <email>mrichards@democompany.com</email>
  </employee>
  <employee>
    <name>Tim Powell</name>
    <title>Big Boss</title>
    <phone>(543) 555-2222</phone>
    <email>tpowell@democompany.com</email>
  </employee>
  <employee>
    <name>Samantha Jones</name>
    <title>Sales Executive</title>
    <phone>(543) 555-5672</phone>
    <email>jones@democompany.com</email>
  </employee>
  <employee>
    <name>Eric Roberts</name>
    <title>Director of Technology</title>
    <phone>(543) 567-3456</phone>
    <email>eric@democompany.com</email>
  </employee>
  <employee>
    <name>Frank Li</name>
    <title>Marketing Manager</title>
    <phone>(123) 456-2222</phone>
    <email>fli@democompany.com</email>
  </employee>
</directory>
You might consider creating an XHTML table containing each of the individual employee records. For example, an employee represented by
<employee>
  <name>Employee's name</name>
  <title>Employee's title</title>
  <phone>Phone number</phone>
  <email>Email address</email>
</employee>
might be converted into a table row (<tr>) as in the following:
<tr>
  <td>Employee's name</td>
  <td>Employee's title</td>
  <td>Phone number</td>
  <td>Email address</td>
</tr>
You can use an XSL style sheet to perform such a transformation. The following is an example XSL style sheet (staff.xsl):
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <head>
    <title>Employee Directory</title>
  </head>
  <body>
    <h1 align="center">DemoCompany Directory</h1>
    <hr/>
    <table width="100%">
      <tr>
        <th>Name</th>
        <th>Title</th>
        <th>Phone</th>
        <th>Email</th>
      </tr>
      <xsl:for-each select="directory/employee">
        <tr>
          <td><xsl:value-of select="name"/></td>
          <td><xsl:value-of select="title"/></td>
          <td><xsl:value-of select="phone"/></td>
          <td><xsl:value-of select="email"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
You can reference the style sheet from the original XML document, adding this line in the original staff.xml file,
<?xml-stylesheet href="staff.xsl" type="text/xsl"?>
just below the initial <?xml?> declaration. The output of the preceding example, together with the markup generated client-side by the browser, is shown in Figure 20-5. If you are worried about browser compatibility, given that not all browsers are aware of XSL, you can just as easily transform this into HTML or, even better, XHTML on the server side. This is probably a safer way to go for any publicly accessible Web page.
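The mapping that staff.xsl performs, one employee record to one table row, can also be expressed in ordinary script, which is roughly what a server-side transformation would do. The following JavaScript sketch is not the XSLT processor's algorithm; it assumes the records have already been read into plain objects, and the names toRow, toTable, and escapeMarkup are hypothetical helpers introduced for the example.

```javascript
// Two records from the employee directory, written as plain objects.
const employees = [
  { name: "Fred Brown", title: "Widget Washer",
    phone: "(543) 555-1212", email: "fbrown@democompany.com" },
  { name: "Tim Powell", title: "Big Boss",
    phone: "(543) 555-2222", email: "tpowell@democompany.com" }
];

const FIELDS = ["name", "title", "phone", "email"];

// Escape the characters that would otherwise be read as markup.
function escapeMarkup(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Plays the role of the body of <xsl:for-each>: one record becomes one row.
function toRow(employee) {
  const cells = FIELDS.map((f) => "<td>" + escapeMarkup(employee[f]) + "</td>");
  return "<tr>" + cells.join("") + "</tr>";
}

// Plays the role of the whole template: a header row plus one row per record.
function toTable(records) {
  const header = "<tr><th>Name</th><th>Title</th><th>Phone</th><th>Email</th></tr>";
  return '<table width="100%">' + header + records.map(toRow).join("") + "</table>";
}

console.log(toTable(employees));
```

The style sheet has the advantage of being declarative and reusable across documents, while a script like this one runs anywhere JavaScript does, regardless of XSL support.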
Note: XSL transformation can create all sorts of more complex documents complete with embedded JavaScript or style sheets.
The previous discussion only begins to touch on the richness of XSL, which provides complex pattern matching and basic programming facilities. Readers interested in the latest developments in XSL are directed to the W3C Web site (http://www.w3.org/Style/XSL/) as well as Microsoft’s XML site (http://msdn.microsoft.com/xml).
The conversion from XML to (X)HTML seems awkward; it would be preferable to deliver a native XML file and display it. As it turns out, it is also possible in most modern browsers to directly render XML by applying CSS rules immediately to tags. For example, given the following simple XML file, you might apply a set of CSS rules by relating the style sheet using <?xml-stylesheet href="URL to style sheet" type="text/css"?>, as shown here:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<?xml-stylesheet href="staff.css" type="text/css"?>
<directory>
  <employee>
    <name>Fred Brown</name>
    <title>Widget Washer</title>
    <phone>(543) 555-1212</phone>
    <email>fbrown@democompany.com</email>
  </employee>
  ...
</directory>
The CSS rules for XML elements are effectively the same as for HTML or XHTML documents, although they do require knowledge of less commonly used properties such as display to create meaningful renderings. The CSS rules for the previously presented XML document are shown here, and their output under Internet Explorer is shown in Figure 20-6.
directory {display: block;}
employee {display: block; border: solid; }
name {display: inline; font-weight: bold; width: 200px;}
title {display: inline; font-style: italic; width: 200px;}
phone {display: inline; color: red; width: 150px;}
email {display: inline; color: blue; width: 100px;}
The lack of flow objects in CSS makes properly displaying this XML document very difficult. To format anything meaningful you may have to go and invent your own line breaks, headings, or other structures. In some sense, CSS relies heavily on XHTML for basic document structure. However, it may be possible instead to simply include such structures from XHTML into your document. The next section explores how you can put XHTML into your XML, and vice versa.
In the previous example, which tried to render an XML document using CSS, it might have been useful to add a heading and use line breaks more liberally. You could go about inventing your own <h1> and <br /> tags, but why do so when you have XHTML to serve you? You can use existing XHTML tags easily if you declare a namespace with the xmlns attribute. Consider the following:
<directory xmlns:html="http://www.w3.org/1999/xhtml">
  ... elements and text ...
</directory>
Now within the directory element you can use XHTML tags freely as long as you prefix them with the namespace moniker html we assigned. For example:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<?xml-stylesheet href="staff.css" type="text/css"?>
<directory xmlns:html="http://www.w3.org/1999/xhtml">
  <html:h1>Employee Directory</html:h1>
  <html:hr />
  <employee>
    <name>Fred Brown</name>
    <title>Widget Washer</title>
    <phone>(543) 555-1212</phone>
    <email>fbrown@democompany.com</email>
  </employee>
  <html:br /><html:br />
  ...
</directory>
In this case, you could even attach CSS rules to our newly used XHTML elements and come up with a much nicer layout.
It should be obvious that namespaces are not just for including XHTML markup in an XML file. This facility allows you to include any type of markup within any XML document you like. Furthermore, prefixing each tag with a namespace moniker is highly important, especially when you consider how many people just might define their own <employee> tag!
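What a namespace prefix buys you can be shown with a short JavaScript sketch. It is a simplified model of name resolution, not the DOM's own API: the function expandName and the bindings table are assumptions made for illustration, with the table standing in for the xmlns:prefix declarations in scope.

```javascript
// Prefix-to-URI bindings, as an xmlns:html="..." declaration would set up.
const bindings = {
  html: "http://www.w3.org/1999/xhtml"
};

// expandName is a hypothetical helper, not a DOM method. It splits a
// qualified name such as "html:h1" into a namespace URI and a local name.
function expandName(qualifiedName, prefixBindings, defaultNamespace = null) {
  const colon = qualifiedName.indexOf(":");
  if (colon === -1) {
    // No prefix: the element belongs to the default namespace, if any.
    return { namespace: defaultNamespace, localName: qualifiedName };
  }
  const prefix = qualifiedName.slice(0, colon);
  if (!Object.prototype.hasOwnProperty.call(prefixBindings, prefix)) {
    throw new Error("Undeclared namespace prefix: " + prefix);
  }
  return {
    namespace: prefixBindings[prefix],
    localName: qualifiedName.slice(colon + 1)
  };
}

console.log(expandName("html:h1", bindings));
console.log(expandName("employee", bindings));
```

Two elements are the same only when both the namespace URI and the local name agree, which is why someone else's <employee> tag, declared under a different URI, cannot collide with yours.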
To demonstrate namespaces, let’s include XML in the form of MathML into an XHTML file. A rendering of the markup in a MathML-aware Mozilla variant browser is shown in Figure 20-7.
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
  <title>MathML Demo</title>
</head>
<body>
  <h1 style="text-align:center;">MathML Below</h1>
  <hr />
  <math mode="display" xmlns="http://www.w3.org/1998/Math/MathML">
    <mrow>
      <mfrac>
        <mrow>
          <mi>x</mi>
          <mo>+</mo>
          <msup>
            <mi>y</mi>
            <mn>2</mn>
          </msup>
        </mrow>
        <mrow>
          <mi>k</mi>
          <mo>+</mo>
          <mn>1</mn>
        </mrow>
      </mfrac>
    </mrow>
  </math>
  <hr />
</body>
</html>
Note: This example requires the file to be named as .xml or .xhtml to invoke the strict XML parser on XHTML.
The preceding example should suggest that XHTML may become host to a variety of languages in the future or that it will be hosted in a variety of other XML-based languages. The question then is this: should the XML be within the XHTML/HTML or should the XHTML be inside the XML? While the W3C may lean toward XML hosting XHTML markup given the deployed base of HTML documents, markup authors may be more comfortable with just the opposite.
Because of the common desire, or in many cases need, to embed XML data content into an HTML document, Microsoft introduced a special <xml> tag in Internet Explorer 4. The <xml> tag is used to create a so-called XML data island that can hold XML to be used within the document. Imagine running a query to a database and fetching more data than needed for the page and putting it in an XML data island. You may then allow the user to retrieve new information from the data island without going back to the server. To include XML in an (X)HTML document, you can use the <xml> tag and either enclose the content directly within it, like so,
...HTML content...
<xml id="myIsland">
  <directory>
    <employee>
      <name>Fred Brown</name>
      <title>Widget Washer</title>
      <phone>(543) 555-1212</phone>
      <email>fbrown@democompany.com</email>
    </employee>
  </directory>
</xml>
...HTML content...
or you can reference an external file by specifying its URL:
<xml id="myIsland" src="staff.xml"></xml>
Once the XML is included in the document, you can then bind the XML to HTML elements. In the example here, we bind XML data to a table. Notice that you must use fully standard table markup to avoid repeating the headings over and over:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Employee Directory</title>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
</head>
<body>
  <xml id="myIsland" src="staff.xml"></xml>
  <h1 align="center">DemoCompany Directory</h1>
  <hr/>
  <table width="100%" datasrc="#myIsland">
    <thead>
      <tr>
        <th>Name</th>
        <th>Title</th>
        <th>Phone</th>
        <th>Email</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td><span datafld="name"></span></td>
        <td><span datafld="title"></span></td>
        <td><span datafld="phone"></span></td>
        <td><span datafld="email"></span></td>
      </tr>
    </tbody>
  </table>
</body>
</html>
Note: This example will neither validate nor work in browsers other than Internet Explorer, as the <xml> tag is a proprietary tag.
The output of the example is as expected and is shown in Figure 20-8.
Once you bind data into a document, you can display it as we did in the previous example or even use JavaScript to manipulate the contents. Imagine sending the full result of a query to a browser and then allowing the user to sort and page through the data without having to go back to the server. Tying XML together with JavaScript can make this happen, and we’ll explore that next.
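As a small taste of the sort-and-page idea, the following JavaScript sketch works on records that are assumed to have already been extracted from the XML into plain objects. The function names sortBy and page, and the one-based page numbering, are choices made for this illustration rather than part of any browser API.

```javascript
// Records assumed to have been read from the XML data into plain objects.
const staff = [
  { name: "Fred Brown", title: "Widget Washer" },
  { name: "Cory Richards", title: "Toxic Waste Manager" },
  { name: "Tim Powell", title: "Big Boss" },
  { name: "Samantha Jones", title: "Sales Executive" }
];

// Return a sorted copy so the original order is left untouched.
function sortBy(records, field) {
  return [...records].sort((a, b) => {
    if (a[field] < b[field]) return -1;
    if (a[field] > b[field]) return 1;
    return 0;
  });
}

// Return one page of records; pageNumber starts at 1.
function page(records, pageNumber, pageSize) {
  const start = (pageNumber - 1) * pageSize;
  return records.slice(start, start + pageSize);
}

const byName = sortBy(staff, "name");
console.log(page(byName, 1, 2).map((e) => e.name));
console.log(page(byName, 2, 2).map((e) => e.name));
```

Because all of the data is already in the browser, changing the sort field or moving to another page requires no further request to the server.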
Note: The preceding discussion is by no means a complete discussion of XML and related technologies; it provides just enough background for readers unfamiliar with the basics of XML to follow the uses of XML and JavaScript together presented here. Readers looking for more detailed information on XML might consider sites like www.xml101.com and, of course, the W3C XML section (www.w3.org/XML).