Java-XML Binders Compared


Recently I worked on a project that called for Java-XML binding. A J2EE web application had to be built around user access data. This web application serves as the single point of entry of an intranet portal to 20-25 different Oracle webforms applications, served from three different application servers and connecting to five different databases, depending on the database access privileges of the user. Such a portal may seem a rather awkward solution, but from the user’s point of view there is only one web page from which to choose the applications. Connecting to the right application server or database is kept transparent to the user.

The access privileges data could not be stored inside a database, because some users may connect to only some of the databases and there is no single database that all users have in common. Further, if the single database holding the user access tables were down while the other databases were up and running, users would be unable to work on the running databases because the access privileges would not be available. What about a backup database holding the access tables, or a small (open source, in-memory) database deployed on the application server? Too awkward, we had no experience with it, and who would administer it? We didn’t go there.

A viable solution is to store the access privileges in a file (flat text file, properties file or XML file) on the application server that serves the web application. This simply circumvents the problem of backing up databases. Further, because the access privileges data is nested (the access file holds more than one user, and one user can access more than one database), an XML file is favoured over a flat file or a properties file. The application systems that a user may access are stored inside each database. So it is perfectly possible that user ‘X’ can access applications A, B and C on the production database, and only applications C and D on the test database.

To summarise the process flow: a user connects to the portal web application, where a previously set cookie reveals his or her identity, together with the last connected database, to the web application. This identity is checked against the access privileges stored in the XML file. After a positive authorisation a connection is made to the database instance, and a list of applications accessible to the user is returned in a web page. Finally the user selects the application system to work in. If the user wants to switch to another database, he or she can do so using a login page on the portal site, after which the process of retrieving the authorised applications for that user on the new database repeats.

 

Java-centric or XML-centric?

This brings me to the main part of this blog, the Java-XML binding. The building of the web application is beyond the scope of this blog. If you are interested only in the results and conclusion, jump to Table 1.

Before you pick the same good old binder you used in your previous project off the shelf, you should ask yourself the following question: is the application I am about to build Java-centric or XML-centric? Having read a fair amount of documentation on several Java-XML binders (the JiBX docs are the clearest on this subject), this seems to be the key to making a clear decision on which binders not to use. The application being built here is an example of an XML-centric application: it is built to handle the contents of the (persistent) user access data in the XML document, not the other way around. An example of a Java-centric application would be a Java application that already generates output to documents, reports or logging output, serialized as flat files, HTML pages, paper, etc. Extending such an application to also generate XML output would require a Java-centric approach, because you do not want to rebuild or change your classes to generate the XML; rather, the XML output is steered by the existing classes.

I will not elaborate on the theory behind XML-Java binding. Sun has provided excellent documentation explaining all the ins and outs of XML to Java binding. In short, the XML-centric approach lets you use an XML Schema Definition (XSD) to map elements and attributes inside an XML instance document to Java objects. If you like, you could also create these Java classes from scratch yourself, but I think that would generally take longer than creating an XSD and running a source code generator on it. Having done that, you can read as many XML documents as you want and translate (unmarshal) them to Java instances. You can also serialize (i.e. write to file) Java instances to XML documents (marshalling). Most Java-centric approaches I have come across do not support XSD mapping, or support it only partially. Instead, the mapping between Java and XML is established via a proprietary mapping XML file. Examples of this approach are Castor and JiBX.
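To make the terms unmarshal and marshal concrete, here is a minimal hand-written sketch of what a binder does for you. It is not code from any of the binders discussed here: the class name HandBinding and its methods are made up for illustration, it uses only the DOM parser that ships with the JDK, it assumes a current JDK (the generics would not compile on the JVM 1.4 used in the tests), and the element names follow the schema in Figure 2.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical hand-written binding; a binder would generate code like this.
class HandBinding {
    static final String NS = "http://admin.eurmenu.evm.nl";

    /** Plain Java counterpart of a gebruiker element. */
    static class Gebruiker {
        final String naam;
        final List<String> databases = new ArrayList<>();
        Gebruiker(String naam) { this.naam = naam; }
    }

    /** Unmarshal: turn XML text into a list of Gebruiker objects. */
    static List<Gebruiker> unmarshal(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<Gebruiker> result = new ArrayList<>();
        NodeList users = doc.getElementsByTagNameNS(NS, "gebruiker");
        for (int i = 0; i < users.getLength(); i++) {
            Element user = (Element) users.item(i);
            Gebruiker g = new Gebruiker(
                    user.getElementsByTagNameNS(NS, "naam").item(0).getTextContent());
            NodeList dbs = user.getElementsByTagNameNS(NS, "dbNaam");
            for (int j = 0; j < dbs.getLength(); j++) {
                g.databases.add(dbs.item(j).getTextContent());
            }
            result.add(g);
        }
        return result;
    }

    /** Marshal: write the objects back as XML text (no escaping, sketch only). */
    static String marshal(List<Gebruiker> users) {
        StringBuilder sb = new StringBuilder("<toegang xmlns=\"" + NS + "\">");
        for (Gebruiker g : users) {
            sb.append("<gebruiker><naam>").append(g.naam).append("</naam><database>");
            for (String db : g.databases) {
                sb.append("<dbNaam>").append(db).append("</dbNaam>");
            }
            sb.append("</database></gebruiker>");
        }
        return sb.append("</toegang>").toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<toegang xmlns=\"" + NS + "\"><gebruiker><naam>scott</naam>"
                + "<database><dbNaam>ontw1</dbNaam><dbNaam>proda</dbNaam></database>"
                + "</gebruiker></toegang>";
        List<Gebruiker> users = unmarshal(xml);
        users.get(0).databases.add("test1"); // change the Java object ...
        System.out.println(marshal(users));  // ... and serialize it again
    }
}
```

Even for this tiny document the plumbing is considerable, which is the argument for letting a source code generator derive it from the XSD.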

 

Requirements

Having established that my application is XML-centric, all Java-centric binders out there can be ignored. First, we will be binding the elements inside the XML document to Java objects. Second, the application must be able to query the content of the XML document (select the privileged database(s) where user = ‘X’). Finally, the application must be able to administer the XML document, that is, changes must be written to the XML file. Simultaneous write access to the file is not expected (administration activity is low) because there is only one administrator and the list of users and databases is fairly static. Thus, file locking and concurrency problems are not likely to occur.
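The administration requirement (add a user, write the document back) can be sketched without any binder, using only the DOM and Transformer classes of the JDK. The class name AccessAdmin and its methods are made up for this illustration, and the element names are taken from the schema in Figure 2; a binder would hide most of this behind generated add and set methods.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, JDK classes only; not part of any binder.
class AccessAdmin {
    static final String NS = "http://admin.eurmenu.evm.nl";

    /** Parse the access document into a namespace-aware DOM tree. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Append a gebruiker element with the given name and databases. */
    static void addUser(Document doc, String naam, String... dbNamen) {
        Element gebruiker = doc.createElementNS(NS, "gebruiker");
        Element naamElement = doc.createElementNS(NS, "naam");
        naamElement.setTextContent(naam);
        gebruiker.appendChild(naamElement);
        Element database = doc.createElementNS(NS, "database");
        for (String db : dbNamen) {
            Element dbElement = doc.createElementNS(NS, "dbNaam");
            dbElement.setTextContent(db);
            database.appendChild(dbElement);
        }
        gebruiker.appendChild(database);
        doc.getDocumentElement().appendChild(gebruiker);
    }

    /** Serialize the DOM tree back to XML text. */
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<toegang xmlns=\"" + NS + "\"><gebruiker><naam>scott</naam>"
                + "<database><dbNaam>ontw1</dbNaam></database></gebruiker></toegang>");
        addUser(doc, "smith", "test1", "accep");
        System.out.println(serialize(doc));
    }
}
```

To write to the access file instead of a string, the StreamResult can wrap a java.io.File. Because only one administrator writes the file, the sketch does no locking.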

 

Comparing the binders

There are several implementations and open source projects that bind the contents of an XML file to Java objects. I had no real experience with any specific implementation. Because it would be too much to try them all, I picked the three best-known binders, i.e. those with the largest user base: Sun’s JAXB, Apache’s XMLBeans and Castor by Exolab. The user base is measured by performing a Google search on the binder implementation and counting the total number of hits returned (divided by 1000: hence, kilo-hits). Towards the end of the test I discovered that Oracle’s XDK 10g also implements binding: it uses JAXB as binder in combination with Oracle’s xmlparserv2. This binder is not included in the test. An article describing binding using the XDK is found here.

Tests were performed using JDeveloper 9.0.5 (build 1618) as IDE with JVM 1.4.2_05 (Sun) on SuSE 9.1 Linux. During the test I found out that JDeveloper 9.0.4 (shipped with Oracle Developer Suite 10g) with JVM 1.4.2_03 on Windows XP gave errors (NullPointerException) while starting the JVM. I tested the ease of generating source code from an XML Schema Definition file (Figure 2 or 3, for testing support of namespaces), and I tested the ease of calling the generated classes/interfaces from my own code, that is, reading (querying) an XML document (Figure 1), appending new users (gebruiker elements) and databases to the XML document, and writing the result to an XML file. Use of namespaces in the schema definition file is convenient because it allows the Java classes to carry the same names as the elements. For instance, element gebruiker will map to Gebruiker. If namespaces cannot be used, an alternative class name must be chosen: element gebruiker vs. GebruikerType.

Table 1 summarises the binders I have compared, with some good and less good features and my impression (1 = low, 5 = high). Table 2 summarises my remarks. In my opinion, XMLBeans is the favourite because of its rich and easy to understand API and the fact that I got it working immediately, without having to edit configuration files. JAXB is my least favourite because of its vast size (5.3 MB of jar files) and the number of generated classes, which has a bad impact on performance. I think I would prefer Castor whenever I have to serialize existing Java instances into XML.

 

Table 1: Binder implementations compared

Name: JAXB, part of Java Web Services Developer Pack 1.4
Manufacturer: Sun
User base [kHits]: 158
Pros:
  • no 3rd party tools
  • ability to jar the generated classes during schema generation
Cons:
  • package is very large: 8 jar files (5.3 MB)
  • generates an awful lot of class files, which affects performance
  • requires WebServices and JAXP packages
  • partial support for the XSD specification
  • namespaces in XSD do not seem to be supported
  • low-level programming, steep learning curve
  • requires JVM 1.4 or higher
Screenshot of generated sources: snapshot-jaxb
My impression (1 – 5): 1

Name: XMLBeans 1.0.3
Manufacturer: Apache.org
User base [kHits]: 116
Pros:
  • small package: 1 jar (1.8 MB)
  • very short learning curve
  • well documented
  • full support of the XSD specification and namespaces
  • got it working immediately
  • ability to jar the generated classes during schema generation
  • ability to generate binding classes from WSDL
  • supports XQuery and XPath to query the XML document
Cons:
  • mapping only via XML Schema Definition (XSD) file
  • requires JVM 1.4 or higher
  • uses deprecated APIs
Screenshot of generated sources: snapshot XMLBeans
My impression (1 – 5): 5

Name: Castor 0.9.5.4-xml
Manufacturer: exolab.org
User base [kHits]: 1920
Pros:
  • XSD for binding is optional
  • direct mapping without binding file
  • well documented
  • short learning curve
  • small package: 2 jars (1.4 MB)
  • requires JVM 1.1 or higher
Cons:
  • uses other 3rd party packages for parsing (XercesImpl.jar)
  • mapping requires the Java object to implement Serializable
  • namespace support in XSD requires editing of the castorbuilder.properties file
Screenshot of generated sources: snapshot castor
My impression (1 – 5): 4

Name: XDK 10g
Manufacturer: Oracle
User base [kHits]: 21
Not tested.
Notes:
  • requires SDK 1.4 or higher
  • uses JAXB for binding
 

 

 

Table 2: Remarks

All binders let you choose between the SAX parser and the DOM parser for unmarshalling and marshalling. Keep in mind, though, that SAX is the preferred or default method, as there is no explicit need for a DOM tree to (un)marshal.
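Why no DOM tree is needed can be seen in a small sketch that unmarshals the access data straight from SAX events. This is an illustration with the JDK’s own SAX parser, not the internals of any of the binders; the class name SaxUnmarshal and the map-based result are assumptions made for brevity, and the element names follow the schema in Figure 2.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: builds user -> databases while the parser streams.
class SaxUnmarshal {

    static class AccessHandler extends DefaultHandler {
        final Map<String, List<String>> access = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();
        private String currentUser;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if ("naam".equals(local)) {
                currentUser = text.toString().trim();
                access.put(currentUser, new ArrayList<>());
            } else if ("dbNaam".equals(local)) {
                access.get(currentUser).add(text.toString().trim());
            }
        }
    }

    /** Unmarshal the access document without ever building a tree. */
    static Map<String, List<String>> unmarshal(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to get local names
        AccessHandler handler = new AccessHandler();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.access;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<toegang xmlns=\"http://admin.eurmenu.evm.nl\">"
                + "<gebruiker><naam>scott</naam><database>"
                + "<dbNaam>ontw1</dbNaam><dbNaam>proda</dbNaam></database></gebruiker>"
                + "</toegang>";
        System.out.println(unmarshal(xml));
    }
}
```

The handler keeps only the current user and the text of the current element in memory, so the cost does not grow with the size of a tree that is thrown away afterwards.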

Both JAXB and XMLBeans generate a layer (package) of interfaces to expose the binding objects and a package of classes to implement the interfaces.
Castor uses only classes (a database element results in Database + DatabaseDescriptor + DatabaseType + DatabaseTypeDescriptor) and puts them all in one package. This lack of a clean separation of functionality left me puzzled about which classes to use.

All binders include a source code generator that can generate class/interface files from an XML Schema definition. These generators are executable scripts for either Windows or UNIX (Linux). In addition, Castor can generate sources from an XML document. Of course, this document must conform to the document type definition (DTD) of Castor’s mapping definition.

XMLBeans offers the most convenient methods for adding, inserting, getting or removing elements of the XML document through the API, closely followed by the Castor API.

Castor requires editing of the castorbuilder.properties file to support mapping of namespaces in the XSD to Java packages.

XMLBeans requires editing of a file that ends in .xsdconfig to map the target namespace to a package name.

Castor XML mapping is a way to simplify the binding of Java classes to an XML document. It lets you transform the data contained in a Java object model into an XML document and back. Although it is possible to rely on Castor’s default behaviour to marshal and unmarshal Java objects, it might be necessary to have more control over this behaviour. For example, if a Java object model already exists, Castor XML mapping can be used as a bridge between the XML document and that Java object model. Castor allows you to specify some of its marshalling/unmarshalling behaviour using a mapping file. This file gives Castor explicit information on how a given XML document and a given set of Java objects relate to each other. A Castor mapping file is a good way to decouple changes in the structure of a Java object model from changes in the corresponding XML document format.

A nice feature of Castor is that you are not forced to build an (awkward) XSD file. Castor lets you serialize almost any class to an XML file, a ContentHandler or a Node. The only requirement is that the Java class implements the java.io.Serializable interface.

XMLBeans provides intuitive ways to handle XML that make it easier for you to access and manipulate XML data and documents in Java.
Characteristics of XMLBeans approach to XML:

  • It provides a familiar Java object-based view of XML data without losing access to the original, native XML structure.

  • The XML’s integrity as a document is not lost with XMLBeans. XML-oriented APIs commonly take the XML apart in order to bind to its parts. With XMLBeans, the entire XML instance document is handled as a whole. The XML data is stored in memory as XML. This means that the document order is preserved as well as the original element content with whitespace.

  • With types generated from schema, access to XML instances is through JavaBean-like accessors, with get and set methods.

  • It is designed with XML schema in mind from the beginning — XMLBeans supports all XML schema definitions.

  • Access to XML is fast.

The starting point for XMLBeans is XML schema. An XML schema can enforce control over how data is ordered in a document, or constraints on particular values (for example, a birth date that must be later than 1900). Unfortunately, the ability to enforce rules like this is typically not available in Java without writing custom code. XMLBeans honors schema constraints.
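What honouring schema constraints means can be tried out with the validation API of the JDK (javax.xml.validation), independent of XMLBeans. The sketch below is an assumption-laden illustration: the class name SchemaCheck is made up, and the schema string is a condensed copy of Figure 2. It uses the ordering constraint of the gebruiker type, where naam must come before database.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical example: validates access documents against the Figure 2 schema.
class SchemaCheck {
    static final String NS = "http://admin.eurmenu.evm.nl";

    // Condensed copy of the schema in Figure 2.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:eur='" + NS + "' targetNamespace='" + NS + "'"
        + " elementFormDefault='qualified'>"
        + "<xsd:element name='toegang' type='eur:toegang'/>"
        + "<xsd:complexType name='toegang'><xsd:sequence maxOccurs='unbounded'>"
        + "<xsd:element name='gebruiker' type='eur:gebruiker'/>"
        + "</xsd:sequence></xsd:complexType>"
        + "<xsd:complexType name='gebruiker'><xsd:sequence>"
        + "<xsd:element name='naam' type='xsd:string'/>"
        + "<xsd:element name='database' type='eur:database'/>"
        + "</xsd:sequence></xsd:complexType>"
        + "<xsd:complexType name='database'><xsd:sequence maxOccurs='unbounded'>"
        + "<xsd:element name='dbNaam' type='xsd:string'/>"
        + "</xsd:sequence></xsd:complexType>"
        + "</xsd:schema>";

    /** Returns true when the document is well-formed and obeys the schema. */
    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a constraint was violated or the XML is malformed
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<toegang xmlns='" + NS + "'><gebruiker><naam>scott</naam>"
                + "<database><dbNaam>ontw1</dbNaam></database></gebruiker></toegang>";
        // Same content, but database precedes naam: violates the sequence.
        String bad = "<toegang xmlns='" + NS + "'><gebruiker>"
                + "<database><dbNaam>ontw1</dbNaam></database><naam>scott</naam>"
                + "</gebruiker></toegang>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

Note that element names are case-sensitive: a document that spells the element dbnaam instead of dbNaam is rejected by the same check.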

Previous options for handling XML include using XML programming interfaces (such as DOM or SAX) or an XML marshalling/binding tool (such as JAXB). Because it lacks strong schema-oriented typing, navigation in a DOM-oriented model is more tedious and requires an understanding of the complete object model. JAXB provides support for the XML schema specification, but handles only a subset of it; XMLBeans supports all of it. Also, by storing the data in memory as XML, XMLBeans is able to reduce the overhead of marshalling and unmarshalling.

With XMLBeans you can use XPath and XQuery to query XML for specific pieces of data. XQuery is sometimes referred to as "SQL for XML" because it provides a mechanism to access data directly from XML documents, much as SQL provides a mechanism for accessing data in traditional databases.

This means that the larger the XML documents become, the more you benefit from using XQuery and XPath as query mechanisms instead of looping over the whole data set while testing for a specific expression. In order to get all XPath expressions working you must add xbean_xpath.jar and jaxen-1.1-beta-2.jar to your classpath, as xbean.jar has only limited XPath capabilities.
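The query from the requirements (select the databases where user = ‘X’) looks as follows as an XPath expression. The sketch uses the XPath engine that ships with the JDK (javax.xml.xpath), not the XMLBeans API, so the class name XPathQuery and its method are assumptions for illustration only. It matches on local-name() to avoid setting up a namespace context, and the element names follow the schema in Figure 2.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example using the JDK XPath engine, not XMLBeans.
class XPathQuery {

    /** Returns the database names the given user may connect to. */
    static List<String> databasesFor(String xml, String user) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Pass the user name as an XPath variable instead of pasting it
        // into the expression text.
        xpath.setXPathVariableResolver(
                name -> "user".equals(name.getLocalPart()) ? user : null);
        String expression =
                "//*[local-name()='gebruiker'][*[local-name()='naam']=$user]"
              + "//*[local-name()='dbNaam']";
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<toegang xmlns=\"http://admin.eurmenu.evm.nl\">"
                + "<gebruiker><naam>scott</naam><database>"
                + "<dbNaam>ontw1</dbNaam><dbNaam>proda</dbNaam></database></gebruiker>"
                + "<gebruiker><naam>jones</naam><database>"
                + "<dbNaam>ontw1</dbNaam><dbNaam>test1</dbNaam><dbNaam>accep</dbNaam>"
                + "</database></gebruiker></toegang>";
        System.out.println(databasesFor(xml, "jones"));
    }
}
```

The expression replaces the nested loops over users and databases that a hand-written lookup would need; an unknown user simply yields an empty list.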

With XMLBeans you can use XML cursors. In addition to providing a way to execute XQuery expressions, an XML cursor offers a fine-grained model for manipulating data. The XML cursor API, analogous to the DOM’s object API, is simply a way to point at a particular piece of data. So, just as an SQL cursor helps you navigate through a set of data, the XML cursor defines a location in the XML where you can perform actions on the selected XML.

 

Figure 1: user access for 3 users, file toegang.xml

<?xml version='1.0' encoding='UTF-8'?>
<toegang xmlns="http://admin.eurmenu.evm.nl">
  <gebruiker>
    <naam>scott</naam>
    <database>
      <dbNaam>ontw1</dbNaam>
      <dbNaam>proda</dbNaam>
    </database>
  </gebruiker>
  <gebruiker>
    <naam>jones</naam>
    <database>
      <dbNaam>ontw1</dbNaam>
      <dbNaam>test1</dbNaam>
      <dbNaam>accep</dbNaam>
    </database>
  </gebruiker>
  <gebruiker>
    <naam>koyak</naam>
    <database>
      <dbNaam>ontw1</dbNaam>
      <dbNaam>test1</dbNaam>
    </database>
  </gebruiker>
</toegang>

 

The namespace is added because in the XSD (figures 2 and 3) the elementFormDefault is set to qualified.

 

Figure 2: XML Schema defining binding for toegang.xml, using namespaces

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:eur="http://admin.eurmenu.evm.nl"
            targetNamespace="http://admin.eurmenu.evm.nl"
            elementFormDefault="qualified">

  <xsd:annotation>
    <xsd:documentation>XML Schema defining binding.</xsd:documentation>
  </xsd:annotation>

  <xsd:element name="toegang" type="eur:toegang"/>

  <xsd:complexType name="toegang">
    <xsd:sequence maxOccurs="unbounded">
      <xsd:element name="gebruiker" type="eur:gebruiker"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="gebruiker">
    <xsd:sequence>
      <xsd:element name="naam" type="xsd:string"/>
      <xsd:element name="database" type="eur:database"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="database">
    <xsd:sequence maxOccurs="unbounded">
      <xsd:element name="dbNaam" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

 

This Schema will generate the following classes/interfaces:

Toegang, Gebruiker, Database and a Document level class/interface.

 

Figure 3: XML Schema defining binding for toegang.xml, without namespaces

<?xml version="1.0" encoding="UTF-8"?>
<!-- The default namespace lets the unprefixed type references below
     resolve to the target namespace. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://admin.eurmenu.evm.nl"
            targetNamespace="http://admin.eurmenu.evm.nl"
            elementFormDefault="qualified">

  <xsd:annotation>
    <xsd:documentation>XML Schema defining binding.</xsd:documentation>
  </xsd:annotation>

  <xsd:element name="toegang" type="toegangType"/>

  <xsd:complexType name="toegangType">
    <xsd:sequence maxOccurs="unbounded">
      <xsd:element name="gebruiker" type="gebruikerType"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="gebruikerType">
    <xsd:sequence>
      <xsd:element name="naam" type="xsd:string"/>
      <xsd:element name="database" type="databaseType"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="databaseType">
    <xsd:sequence maxOccurs="unbounded">
      <xsd:element name="dbNaam" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

This Schema will generate the following classes/interfaces:

ToegangType, GebruikerType, DatabaseType and a Document-level class/interface.

 

 

 

 

 

 

 

 
