In any distributed computing protocol, object or application state will frequently be exchanged across communicating nodes. This requires state to be sent across the wire enclosed in the agreed-upon communication protocol. In the case of Web services, it means that within a SOAP message, the XML representing the object state must be represented in a standardized format, so that both parties can understand and interpret that state information. Encoding refers to the rules for serializing and deserializing specific elements in the SOAP message. It is sometimes called "Section 5" encoding, because it is defined in Section 5 of the SOAP specifications.
Because Web services can be implemented in any number of programming languages, the application-defined data structures must be represented as XML on the wire. This is derived from the language bindings for that platform (for example, Java to XML and XML to Java, or C++ to XML and XML to C++). However, once data can be represented as XML, the SOAP message can carry that data in the body in two ways:
Based on an agreed-upon schema that defines the contents of the XML. This is often referred to as literal encoding. For example, in Listing 4.5, both parties know that the billingaddress contains a city, state, and zip. The city and state are of type string, and the zip is an integer. The types are not specific to any language but refer to the primitive types defined in the XML Schema specification.
Based on a predetermined set of rules defined by some standard schema. This is the part played by SOAP encoding, which is, however, completely optional. Section 5 encoding rules are available essentially as a convenience that allows nodes to exchange information without any prior knowledge about the type of information.
In both cases, the sender and receiver have to use the same serialization format on the wire to correctly process the message. Both parties also have to agree on schemas for:
The overall SOAP message
The encoding mechanism used
Headers in use
Application-specific XML documents in the body or attachment. Though not required, this is good practice if the content is XML. For example, to process a purchase order sent as an attachment, the receiver must know what that purchase order looks like.
SOAP defines an encoding scheme. The value of the encodingStyle attribute, a URI, provides the receiving SOAP node a pointer to the rules used for encoding and decoding the data. In Listing 4.4, we saw that the Envelope element had an attribute encodingStyle="http://schemas.xmlsoap.org/soap/encoding/". This encodingStyle is used to encode messages within the SOAP body and SOAP header elements, unless individual elements override this with an encodingStyle attribute of their own.
Although the SOAP specification defines, through a schema, a set of encoding rules that map well to programming constructs, it does not specify any default encoding. This means that if the encodingStyle attribute does not appear in the message, or appears with an empty value (encodingStyle=""), the receiver cannot make any assumptions about how data is represented in the message and must work out how to deserialize that information on its own. Let us now look at these encoding rules defined by SOAP.
Loosely speaking, a simple type is any XML element that represents a single data unit and is represented as a single element in the body. From the SOAP messages shown earlier, the identifier and date elements in the purchase order are simple data types:
<identifier>87 6784365876JHITRYUE</identifier>
<date>29 October 2002</date>
SOAP encoding exposes all the simple types built into the XML Schema specifications (see Appendix A) and provides two alternate syntaxes for expressing instances of these data types, as shown for the <identifier> element in this SOAP message:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <env:Body>
    <identifier xsi:type="xsd:string">87 6784365876JHITRYUE</identifier>
    <identifier xsi:type="enc:string">87 6784365876JHITRYUE</identifier>
  </env:Body>
</env:Envelope>
Of the more than three dozen types in the specification, those commonly used are string, integer, byte, short, int, long, decimal, float, double, boolean, date, and base64Binary (to represent binary content).
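To make the role of the type annotation concrete, the following is a minimal sketch of how a receiver might turn annotated simple values into native values. It is written in Python purely for illustration and is not how any particular toolkit works. The converter table, the helper names, the payment wrapper element, and the recurring element are all assumptions introduced for this example, and only a handful of the built-in types are covered.

```python
import io
import xml.etree.ElementTree as ET
from decimal import Decimal

XSD_NS = "http://www.w3.org/2001/XMLSchema"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical converter table covering only a few XML Schema built-in types.
CONVERTERS = {
    "string": str,
    "integer": int,
    "int": int,
    "long": int,
    "short": int,
    "byte": int,
    "decimal": lambda text: Decimal(text.strip()),
    "float": lambda text: float(text.strip()),
    "double": lambda text: float(text.strip()),
    "boolean": lambda text: text.strip() in ("true", "1"),
}


def parse_with_prefixes(xml_text):
    """Parse XML and collect its prefix-to-namespace declarations.

    Assumption: each prefix is declared only once, so a flat dict is enough.
    """
    prefixes = {}
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, value in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = value
            prefixes[prefix] = uri
        elif root is None:
            root = value
    return root, prefixes


def decode_simple(element, prefixes):
    """Convert the element text according to its xsi:type, if recognized."""
    type_attr = element.get(f"{{{XSI_NS}}}type")
    if type_attr is None:
        return element.text
    prefix, _, local = type_attr.rpartition(":")
    if prefixes.get(prefix) != XSD_NS or local not in CONVERTERS:
        return element.text  # unknown type: leave the raw text untouched
    return CONVERTERS[local](element.text or "")


# The payment wrapper and the recurring element are made up for this sketch.
SAMPLE = """<payment xmlns:xsd="http://www.w3.org/2001/XMLSchema"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <identifier xsi:type="xsd:string">87 6784365876JHITRYUE</identifier>
  <amt xsi:type="xsd:double">2000.0</amt>
  <recurring xsi:type="xsd:boolean">true</recurring>
</payment>"""

root, prefixes = parse_with_prefixes(SAMPLE)
decoded = {child.tag: decode_simple(child, prefixes) for child in root}
print(decoded)
```

The prefix in the xsi:type value is resolved against the declared namespaces rather than compared as a literal string, so the sketch does not depend on the sender choosing the prefix xsd.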
It may seem confusing that Java WSDP uses the enc prefix, although other documentation and toolkits use the prefix SOAP-ENC or ENC, and so on. Keep in mind that these are just namespace prefixes and can be anything; the XML parsers will resolve them. What matters is the namespace they point to, which in this case is the SOAP encoding schema at http://schemas.xmlsoap.org/soap/encoding/, which will be the same for any and all toolkits that use this encoding.
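A quick way to convince yourself that the prefix is irrelevant is to parse the same element written with two different prefixes and compare the names the parser reports. The sketch below uses Python's standard xml.etree.ElementTree module as a stand-in for any namespace-aware parser; the two one-element documents are made up for the illustration.

```python
import xml.etree.ElementTree as ET

ENC_NS = "http://schemas.xmlsoap.org/soap/encoding/"

# The same element, once with the enc prefix and once with SOAP-ENC.
doc_a = f'<enc:string xmlns:enc="{ENC_NS}">hello</enc:string>'
doc_b = f'<SOAP-ENC:string xmlns:SOAP-ENC="{ENC_NS}">hello</SOAP-ENC:string>'

# ElementTree reports names as {namespace-uri}local-name, dropping the prefix.
tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

print(tag_a)
print(tag_a == tag_b)
```

Both documents yield the same qualified name, because the parser replaces each prefix with the namespace URI it is bound to.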
Type information can be associated with an element in two different ways:
Using the type information directly with the element, as we just saw:
<identifier xsi:type="xsd:string">87 6784365876JHITRYUE</identifier>
Referencing a schema directly, as in the purchase order in the attachment from Listing 4.5:
<purchaseorder xmlns="http://www.flutebank.com/schema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.flutebank.com/schema purchaseorder.xsd">
  <identifier>87 6784365876JHITRYUE</identifier>
</purchaseorder>
This difference caused early interoperability problems between toolkit implementations. Apache and other Java toolkits expected the former, whereas Microsoft adopted the latter. This has since been resolved through community efforts such as the SoapBuilders community. We will talk about this and other interoperability issues in Chapter 10.
A compound type represents two or more simple types grouped under a single element. For example, the billing address in Listing 4.5 is a compound type:
<billingaddress>
  <name>John Malkovich</name>
  <street>256 Eight Bit Lane</street>
  <city>Burlington</city>
  <state>MA</state>
  <zip>01803</zip>
</billingaddress>
A compound data type can be either a struct or an array. A struct is an element that contains disparate child elements. The billing address above is an example of a struct. Compound structs use the same xsi:type attribute to specify type information about individual elements.
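As a rough illustration of what a struct means to a receiving application, the sketch below maps the billing address onto a dictionary keyed by child element name. It is written in Python for brevity; the function name decode_struct is hypothetical, and the sketch assumes that every child name is unique and holds a simple value.

```python
import xml.etree.ElementTree as ET

BILLING = """<billingaddress>
  <name>John Malkovich</name>
  <street>256 Eight Bit Lane</street>
  <city>Burlington</city>
  <state>MA</state>
  <zip>01803</zip>
</billingaddress>"""


def decode_struct(element):
    """Map each child element name to its text.

    Assumption: child names are unique and each child is a simple value.
    """
    return {child.tag: (child.text or "").strip() for child in element}


address = decode_struct(ET.fromstring(BILLING))
print(address["city"], address["zip"])
```

Because the members of a struct are distinguished by name, their order in the message does not matter to this kind of lookup.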
An array, on the other hand, is a compound type that contains elements of the same name (for example, a group of email addresses):
<emailaddresses>
  <email>mailto:John.Malkovich@flutebank.com</email>
  <email>mailto:J.Malkovich@home.com</email>
</emailaddresses>
Arrays are encoded as elements of type enc:Array and take an additional attribute, enc:arrayType, to describe their content. This declaration takes the form type[size], which is similar to the way arrays are declared in Java. Let us look at an example of a SOAP message that encodes the above array:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:ns0="http://www.flutebank.com/xml"
    env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <env:Body>
    <ns0:emailaddresses xsi:type="enc:Array" enc:arrayType="xsd:string[2]">
      <email xsi:type="xsd:string">John.Malkovich@flutebank.com</email>
      <email xsi:type="xsd:string">J.Malkovich@home.com</email>
    </ns0:emailaddresses>
  </env:Body>
</env:Envelope>
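The type[size] declaration can be checked mechanically. The following Python sketch reads the enc:arrayType attribute, splits it into the item type and the declared size, and verifies that the number of child elements matches. The function name decode_array is hypothetical, and the sketch handles only the one-dimensional type[size] form described above; the sample is the array element from the message, trimmed to the attributes the sketch needs.

```python
import re
import xml.etree.ElementTree as ET

ENC_NS = "http://schemas.xmlsoap.org/soap/encoding/"

# Matches only the one-dimensional form type[size], e.g. xsd:string[2].
ARRAY_TYPE = re.compile(r"^(?P<type>[^\[\]]+)\[(?P<size>\d+)\]$")


def decode_array(element):
    """Return (item type, list of item texts) for a SOAP-encoded array."""
    declared = element.get(f"{{{ENC_NS}}}arrayType") or ""
    match = ARRAY_TYPE.match(declared)
    if match is None:
        raise ValueError(f"unsupported arrayType: {declared!r}")
    items = [(child.text or "").strip() for child in element]
    if len(items) != int(match["size"]):
        raise ValueError(
            f"arrayType declares {match['size']} items, found {len(items)}"
        )
    return match["type"], items


SAMPLE = """<ns0:emailaddresses
    xmlns:ns0="http://www.flutebank.com/xml"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    enc:arrayType="xsd:string[2]">
  <email>John.Malkovich@flutebank.com</email>
  <email>J.Malkovich@home.com</email>
</ns0:emailaddresses>"""

item_type, emails = decode_array(ET.fromstring(SAMPLE))
print(item_type, emails)
```

Validating the declared size against the actual content is a design choice of this sketch; it catches truncated or padded arrays early instead of passing them on to application code.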
An array is not limited to simple types and can contain other compound types. The SOAP-encoded message below shows an array of two addresses. Note that the compound type must be specified in the schema associated with its namespace prefix (ns0 in the example below):
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:ns0="http://www.flutebank.com/xml"
    env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <env:Body>
    <ns0:addresses xsi:type="enc:Array" enc:arrayType="ns0:BillingAddresses[2]">
      <billingaddress>
        <name xsi:type="xsd:string">John Malkovich</name>
        <street xsi:type="xsd:string">256 Eight Bit Lane</street>
        <city xsi:type="xsd:string">Burlington</city>
        <state xsi:type="xsd:string">MA</state>
        <zip xsi:type="xsd:string">01803</zip>
      </billingaddress>
      <billingaddress>
        <name xsi:type="xsd:string">John Malkovich</name>
        <street xsi:type="xsd:string">256 64 Bit Street</street>
        <city xsi:type="xsd:string">Unix Town</city>
        <state xsi:type="xsd:string">MA</state>
        <zip xsi:type="xsd:string">01803</zip>
      </billingaddress>
    </ns0:addresses>
  </env:Body>
</env:Envelope>
To support multidimensional arrays, SOAP uses references, which are analogous to local anchors in an HTML page. A reference is specified with the href attribute, which points to an element identified with an id attribute.
Let us look at this further with an example. The SOAP message below shows how a service returns an array of compound types. The message is in response to an invocation of a Java method with the signature public PaymentDetail[] listScheduledPayments();
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:ns0="http://www.flutebank.com/xml"
    env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <env:Body>
    <ns0:listScheduledPaymentsResponse>
      <result href="#ID1"/>
    </ns0:listScheduledPaymentsResponse>
    <ns0:ArrayOfPaymentDetail id="ID1" xsi:type="enc:Array" enc:arrayType="ns0:PaymentDetail[2]">
      <item href="#ID2"/>
      <item href="#ID3"/>
    </ns0:ArrayOfPaymentDetail>
    <ns0:PaymentDetail id="ID2" xsi:type="ns0:PaymentDetail">
      <date xsi:type="xsd:dateTime">2002-10-05T00:12:18.269Z</date>
      <account xsi:type="xsd:string">Credit</account>
      <payeeName xsi:type="xsd:string">Digital Credit Union</payeeName>
      <amt xsi:type="xsd:double">2000.0</amt>
    </ns0:PaymentDetail>
    <ns0:PaymentDetail id="ID3" xsi:type="ns0:PaymentDetail">
      <date xsi:type="xsd:dateTime">2002-10-05T00:12:18.269Z</date>
      <account xsi:type="xsd:string">Credit</account>
      <payeeName xsi:type="xsd:string">AAA Club</payeeName>
      <amt xsi:type="xsd:double">180.0</amt>
    </ns0:PaymentDetail>
  </env:Body>
</env:Envelope>
The body contains a compound type listScheduledPaymentsResponse, which refers to an array. The array contains two elements of compound type PaymentDetail, which are referred to as individual items.
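Following the references by hand is tedious, so here is a minimal Python sketch of how a receiver might resolve them: it indexes every element that carries an id attribute and then follows each href until it reaches an element with real content. The helper name resolve is hypothetical, only local references of the form #id are handled, and the body is an abbreviated copy of the message above that keeps just the payeeName of each payment.

```python
import xml.etree.ElementTree as ET

NS = {"ns0": "http://www.flutebank.com/xml"}

# Abbreviated copy of the response body; only payeeName is kept per payment.
BODY = """<env:Body xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
          xmlns:ns0="http://www.flutebank.com/xml">
  <ns0:listScheduledPaymentsResponse>
    <result href="#ID1"/>
  </ns0:listScheduledPaymentsResponse>
  <ns0:ArrayOfPaymentDetail id="ID1">
    <item href="#ID2"/>
    <item href="#ID3"/>
  </ns0:ArrayOfPaymentDetail>
  <ns0:PaymentDetail id="ID2">
    <payeeName>Digital Credit Union</payeeName>
  </ns0:PaymentDetail>
  <ns0:PaymentDetail id="ID3">
    <payeeName>AAA Club</payeeName>
  </ns0:PaymentDetail>
</env:Body>"""


def resolve(element, by_id):
    """Follow href="#id" links until an element without href is reached."""
    seen = set()
    href = element.get("href")
    while href is not None:
        if not href.startswith("#"):
            raise ValueError(f"only local references are handled: {href}")
        if href in seen:
            raise ValueError(f"circular reference: {href}")
        seen.add(href)
        element = by_id[href[1:]]
        href = element.get("href")
    return element


body = ET.fromstring(BODY)
# Index every element that can be the target of a reference.
by_id = {el.get("id"): el for el in body.iter() if el.get("id") is not None}

result = body.find("ns0:listScheduledPaymentsResponse/result", NS)
array = resolve(result, by_id)
payees = [resolve(item, by_id).findtext("payeeName") for item in array.findall("item")]
print(payees)
```

The result element leads to the array, and each item in the array leads to one PaymentDetail, which mirrors the chain of references in the message.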
In all the above examples, we have specified the encodingStyle attribute for the entire envelope. Sometimes it may be desirable to have different encoding schemes defined within a single message. SOAP accommodates this requirement by providing the ability to specify an encodingStyle at the element level. The general rule is that child elements inherit the encoding style of the parent elements, unless they override the parent's encoding style with one of their own.
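The inheritance rule can be sketched as a simple tree walk that carries the parent's encoding style down to each child and replaces it wherever a child declares its own. The Python sketch below is illustrative only: the element names inherits, overrides, nested, and optsOut and the custom encoding URI are made up for the example, an empty encodingStyle is treated as "no encoding claimed", and each attribute is assumed to hold a single URI.

```python
import xml.etree.ElementTree as ET

ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
CUSTOM_ENC = "http://www.flutebank.com/encoding/custom"  # hypothetical URI
ENCODING_STYLE = f"{{{ENV_NS}}}encodingStyle"

# Element names inside the body are invented for this illustration.
MESSAGE = f"""<env:Envelope xmlns:env="{ENV_NS}"
    env:encodingStyle="{SOAP_ENC}">
  <env:Body>
    <inherits/>
    <overrides env:encodingStyle="{CUSTOM_ENC}">
      <nested/>
    </overrides>
    <optsOut env:encodingStyle=""/>
  </env:Body>
</env:Envelope>"""


def effective_styles(element, inherited=None):
    """Yield (tag, encoding style in effect) for element and its descendants."""
    own = element.get(ENCODING_STYLE)
    if own is None:
        style = inherited      # nothing declared here: inherit from the parent
    else:
        style = own or None    # an empty value means no encoding is claimed
    yield element.tag, style
    for child in element:
        yield from effective_styles(child, style)


styles = dict(effective_styles(ET.fromstring(MESSAGE)))
for tag in ("inherits", "overrides", "nested", "optsOut"):
    print(tag, styles[tag])
```

In this sample, inherits picks up the envelope-level SOAP encoding, nested picks up the style declared on overrides, and optsOut ends up with no encoding style at all.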