XSL – the basics
This document explains what XSL is and how it could be used. It only covers the basics of XSL but with these basics it will be shown that wonderful things can be done. The XML and XSL examples used in this document can be found at the end of the document.
XSL stands for eXtensible Stylesheet Language. The World Wide Web Consortium (W3C) defined the XSLT language, and version 1.0 was published as a Recommendation at the end of 1999. XSL is used to transform an XML document into a plain text document, an HTML document or a new XML document.
XSL consists of three parts:
XSLT – a language used for transforming the structure of an XML document.
XPath – a language used to extract data from an XML document.
XSL-FO – a language used for formatting XML documents.
This document only covers XSLT and XPath.
Why transform XML?
XML is a simple standard with which data can be exchanged between computer programs. Part of XML’s success is that it can be read and written by humans, using nothing more than a simple text editor, but its primary use is communication between software systems. The main point is that whether the data is ultimately consumed by humans (e.g. as HTML or PDF) or by a software application (e.g. as a comma-separated file or a new XML file), it will normally not be used in the form in which it arrives; it first needs to be transformed into something else.
Elements used in an XML document are not predefined and therefore have no real meaning.
For example, the <table> element used in an HTML document has a predefined meaning and the browser knows what to do with it. If, however, the <table> element is used in an XML document, it could mean an HTML table or it could be a piece of furniture. This is where XSL can help, by describing how the XML document should be interpreted.
The XSLT processor
The XSLT language is used to instruct the XSLT processor as to how it will create a desired output (text, HTML or XML) from a given XML input.
When you start the XSLT processor and apply a stylesheet to an XML document (the source document), the first thing it does is read and parse both documents and build an internal tree representation of each in memory. The stylesheet is turned into a tree in just the same way as the source XML document. A tree is a data structure composed of connected nodes, beginning with a top node called the root. The root is connected to its child nodes, each of which is connected to zero or more children of its own, and so forth. Note that the root node is not the outermost element: in XPath and XSLT, the root node is the parent of the outermost element and represents the document as a whole. A tree contains several types of node, of which the following are the most common:
Root node – the starting point of the tree, representing the XML document as a whole.
Element node – an XML element.
Text node – the text found in an XML element.
Attribute node – an attribute (name and value) of an XML element.
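To make the node types concrete, here is a trimmed-down fragment of the sample order from the end of this article, annotated with the kind of node each part becomes. It is only an illustrative sketch: the comments are annotations added here (in a real tree they would themselves show up as comment nodes), and the whitespace between elements is ignored for simplicity.

```xml
<?xml version="1.0"?>
<!-- Root node: not written anywhere in the file; it is the parent of ORDERS -->
<ORDERS>                        <!-- element node: the outermost element -->
  <ORDER num="1">               <!-- element node carrying the attribute node num="1" -->
    <CD>                        <!-- element node -->
      <TITLE artist="Bob Dylan">Empire Burlesque</TITLE>
      <!-- TITLE: element node; artist: attribute node; "Empire Burlesque": text node -->
    </CD>
  </ORDER>
</ORDERS>
```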
What does XPath do?
SQL and XSLT are different languages, but if you look closer you will find some similarities. In order to process specific data (database or XML), the processing language must use a declarative query syntax for selecting the data that needs to be processed. In SQL it’s the “SELECT” statement and in XSLT the equivalent is the XPath expression. XPath is used to navigate through the elements and attributes of an XML document and to retrieve nodes from it, based on a path through the document. It allows access to specific nodes, while preserving the hierarchy and structure of the document.
XPath example:-
<xsl:value-of select="sum(//book/@price)"/>
This gets the total of the price attributes on all the <book> elements.
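The same idea can be tried against the sample order at the end of this article. The following is a minimal sketch, assuming that document as input; it uses the XPath functions count() and sum() on the sample's CD and QTY elements and writes plain text.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- //CD selects every CD element, wherever it sits in the document -->
    <xsl:text>Number of CDs: </xsl:text>
    <xsl:value-of select="count(//CD)"/>
    <!-- sum() adds up the numeric values of all selected QTY elements -->
    <xsl:text>, total quantity: </xsl:text>
    <xsl:value-of select="sum(//CD/QTY)"/>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document this should give "Number of CDs: 4, total quantity: 7".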
The XSLT document
Because an XSLT style sheet is itself an XML document, it normally starts with the XML declaration: <?xml version="1.0"?>
The next important requirement is the XSL root element, which declares the document to be an XSL style sheet:- <xsl:stylesheet> OR <xsl:transform>
The two are completely synonymous and either can be used.
The correct way to declare an XSL style sheet root element, according to the W3C XSLT recommendation is:-
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
The next (optional) element, <xsl:output>, defines the output format. There are three possible methods: xml, html and text. The default is xml, unless the outermost element of the result is <html>, in which case html is used. For example: <xsl:output method="xml"/>
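As a small sketch of where the output declaration sits, the style sheet below spells out the method explicitly rather than relying on the default. The indent and encoding attributes are optional extras chosen here for illustration, and the "Hello" page is a placeholder.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:output is a top-level element: it goes directly inside xsl:stylesheet.
       Use method="xml" or method="text" for the other two formats. -->
  <xsl:output method="html" indent="yes" encoding="UTF-8"/>
  <xsl:template match="/">
    <html><body><p>Hello</p></body></html>
  </xsl:template>
</xsl:stylesheet>
```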
The next element you will most likely find is <xsl:template>, which is basically a rule. A template rule is applied when a specified node is matched. The match attribute of the xsl:template element contains a pattern (written in XPath syntax), and this pattern determines which nodes in the source tree the template matches.
Once the source document and the style sheet have been parsed into trees, as described above, the processor walks through the source tree, starting at the root node. As each node is processed, the processor compares it with the match pattern of each template rule in the style sheet. When it finds a template rule whose pattern matches the node, it instantiates that template, that is, it writes out the template's content. A template may contain both text that will appear literally in the output document and XSLT instructions that copy data from the input XML document to the result document.
By including xsl:apply-templates in the template, you tell the processor to compare each child node of the matched source element against the templates in the style sheet and, if a match is found, to output the template for the matched node. The template for the matched node may itself contain xsl:apply-templates elements to search for matches for its children.
For example, the following XSLT style sheet uses the xsl:apply-templates element to process the child nodes.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<xsl:apply-templates/>
</html>
</xsl:template>
<xsl:template match="ORDER">
<body>
<xsl:apply-templates/>
</body>
</xsl:template>
<xsl:template match="CD">
A cd
</xsl:template>
</xsl:stylesheet>
When this style sheet is applied to the example XML document at the end of this article, the following happens:
- Processing begins at the root node, which is matched by <xsl:template match="/"> and lies one level above the first element in our XML document. You could have used <xsl:template match="/ORDERS"> instead, in which case the <html> tag would have been written when the first element of the XML document was reached.
- The <html> tag is written.
- The <xsl:apply-templates/> element causes the processor to process all the child nodes of the current node (in our case the root node) of the input document.
- The only child of the root node, the <ORDERS> element, is compared with the template rules. It doesn’t match any of them, so nothing is written for it and the processor's built-in rule simply moves on to its children.
- The child of ORDERS, the ORDER element, is compared with the template rules. It matches the second template rule.
- The <body> tag is written out.
- The xsl:apply-templates element in the body element causes the processor to process the child nodes of ORDER.
- The first child element of ORDER, a CD element, is compared with the template rules. It matches the third template rule.
- The text “A cd” is output. The same happens for each of the remaining CD elements.
- The </body> tag is written out.
- The </html> tag is written out.
- Processing is complete.
The end result:
<html>
<body>
A cd
A cd
A cd
A cd
</body>
</html>
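The CD templates are reached even though no rule matches ORDERS because XSLT supplies built-in template rules for nodes that have no matching template. Written out as ordinary templates, the XSLT 1.0 built-in rules look roughly like this. You never need to add them yourself; they are shown only as a sketch of what the processor does implicitly.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Root node and elements: write nothing, just process the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>
  <!-- Text and attribute nodes: copy the value to the output -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>
  <!-- Processing instructions and comments: write nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

The second rule is also why text from unmatched parts of the source, including the whitespace between elements, can turn up in the output.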
The match attribute of the xsl:template element uses XPath syntax, which allows you to express exactly which nodes you do and do not want to select.
XSLT style sheets generally start with a match rule that applies to the root node. To specify the root node in a rule, you give its match attribute the value “/”. For example: <xsl:template match="/">
The most basic match pattern contains a single element name that matches all elements with that name. For example, <xsl:template match="CD"> matches all CD elements.
Matching children with /
You’re not limited to the children of the current node in the match attributes. You can use the / symbol to match specified hierarchies of elements. Used alone, the / symbol refers to the root node. However, you can use it between two names to indicate that the second is the child of the first. For example, CD/TITLE refers to TITLE elements that are children of CD elements. In xsl:template elements, this enables you to match only some of the elements of a given kind. For example, the following template rule matches PRICE elements that are children of a CD element. It does nothing to any PRICE elements that are not children of a CD element.
<xsl:template match="CD/PRICE">
Matching descendants with //
Sometimes, especially with an uneven hierarchy, you may find it easier to bypass intermediate nodes and simply select all the elements of a given type, whether they’re immediate children, grandchildren, great-grandchildren etc. The double slash, //, refers to a descendant element at an arbitrary level. For example, the following template rule applies to all TITLE descendants of ORDER, no matter how deep:
<xsl:template match="ORDER//TITLE">
Matching by ID
You may want to apply a particular style to a single element without changing all other elements of that type. The simplest way to do that in XSLT is to match on the element’s ID-type attribute (an attribute declared as type ID in the document's DTD). This is done with the id() function, which takes the ID value in single quotes. For example, this rule matches the element with the ID new1:
<xsl:template match="id('new1')">
Matching attributes with @
The @ sign matches against attributes and selects nodes according to attribute names. Simply prefix the name of the attribute that you want to select with the @ sign. For example, this template rule matches the ‘artist’ attributes and wraps their values in a <value> element.
<xsl:template match="@artist">
<value><xsl:value-of select="."/></value>
</xsl:template>
However, just adding this rule to the style sheet will not automatically produce value output because attributes are not children of the elements that contain them. Therefore by default when an XSLT processor is walking through the tree it does not see attribute nodes. You have to explicitly process them using xsl:apply-templates with an appropriate select attribute.
<xsl:apply-templates select="@artist"/>
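Put together, the two pieces might look like the sketch below, assuming the sample order at the end of this article as input. The output element names title, name and value are made up for this illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="TITLE">
    <title>
      <!-- Attributes are not children, so they have to be selected explicitly -->
      <xsl:apply-templates select="@artist"/>
      <name><xsl:value-of select="."/></name>
    </title>
  </xsl:template>
  <!-- Only reached because of the explicit select="@artist" above -->
  <xsl:template match="@artist">
    <value><xsl:value-of select="."/></value>
  </xsl:template>
</xsl:stylesheet>
```

For the first TITLE element this should produce a title element containing <value>Bob Dylan</value> followed by <name>Empire Burlesque</name>.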
Using the OR operator
The vertical bar (|) allows a template rule to match multiple patterns. If a node matches one pattern or the other, it will activate the template. For example, this template rule matches both CD and DVD elements:
<xsl:template match="CD | DVD">
<B><xsl:apply-templates/></B>
</xsl:template>
Testing with [ ]
You can also test for details about the nodes that match a pattern using []. You can perform many different tests. Here are some examples:
- Test whether an element contains a given child, attribute, or other node
<xsl:template match="CD [ PRICE | QTY ]">
OR
The following template rule matches CD elements with a TITLE child element that has an ‘artist’ attribute:
<xsl:template match="CD [ TITLE/@artist]">
- Test whether the value of an attribute is a certain string
<xsl:template match="TITLE [ @artist='Bob Dylan']">
- Test whether the value of an element matches a string
<xsl:template match="CD [ QTY='10']">
- Test which position a given node occupies among its siblings
<xsl:template match="ORDER / CD [ position()>2 ]">
(position() is one of many functions that can be used in XSLT. Some others are last() or count(CD).)
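A predicate also makes it possible to treat some elements of a kind differently from the rest. The sketch below, assuming the sample order at the end of this article, has one rule for CD elements ordered more than once and a general rule for all others. When both patterns match, the rule with the predicate wins because it has the higher default priority. The xsl:strip-space element is added here only to keep whitespace from the source out of the text output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>
  <!-- More specific rule: only CD elements whose QTY is greater than 1 -->
  <xsl:template match="CD[QTY > 1]">
    <xsl:value-of select="TITLE"/>
    <xsl:text> (</xsl:text>
    <xsl:value-of select="QTY"/>
    <xsl:text> copies)&#10;</xsl:text>
  </xsl:template>
  <!-- General rule: every other CD element -->
  <xsl:template match="CD">
    <xsl:value-of select="TITLE"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the sample data this should print one title per line, with "(3 copies)" after Hide your heart and "(2 copies)" after Greatest Hits.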
Once you have matched the element, it’s time to extract its value. This is done using the xsl:value-of instruction.
The xsl:value-of instruction computes the value of something and copies it into the output document. The select attribute of the xsl:value-of element specifies exactly which something’s value is being computed.
For example, suppose you want to replace the text “A cd” with the name of the element as given by the contents of its TITLE child. You can replace “A cd” with <xsl:value-of select="TITLE"/> like this:
<xsl:template match="CD">
<xsl:value-of select=" TITLE "/>
</xsl:template>
The result:
<html>
<body>
Empire Burlesque
Hide your heart
Greatest Hits
Still got the blues
</body>
</html>
The xsl:value-of instruction should only be used in contexts where it is obvious which node’s value is being taken. If there are multiple possible items that could be selected, then only the first one will be chosen. For example, this is a poor rule because a typical ORDER element contains more than one CD:
<xsl:template match="ORDER">
<xsl:value-of select="CD"/>
</xsl:template>
There are two ways of processing multiple elements in turn.
1. The first method simply uses xsl:apply-templates with a select attribute that chooses the particular elements that you want to include, e.g.
<xsl:template match="ORDER/CD">
<xsl:apply-templates select="QTY"/>
</xsl:template>
<xsl:template match="QTY">
<xsl:value-of select="."/>
</xsl:template>
The select="." in the second template tells the processor to take the value of the matched element, QTY in this example.
2. The second option is xsl:for-each. The xsl:for-each element processes each element chosen by its select attribute in turn, and no additional template is required, e.g.
<xsl:template match="ORDER">
<xsl:for-each select="CD">
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
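Inside xsl:for-each, the functions position() and last() refer to the set of nodes being looped over, which is handy for separators. The following is a minimal sketch, assuming the sample order at the end of this article, that writes the titles as a comma-separated line of text.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="ORDERS/ORDER/CD">
      <xsl:value-of select="TITLE"/>
      <!-- Add a separator after every CD except the last one -->
      <xsl:if test="position() != last()">
        <xsl:text>, </xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the sample data the result should be: Empire Burlesque, Hide your heart, Greatest Hits, Still got the blues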
In the for-each loop you can also sort the selected elements by using the xsl:sort element. For example:
<xsl:for-each select="CD">
<xsl:sort select="TITLE" />
<xsl:value-of select="TITLE" />
</xsl:for-each>
This will get the “TITLE” element values of all the “CD” elements and output them in alphabetical order.
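Sorting is textual by default, so numeric values need data-type="number", and several xsl:sort elements can be combined as primary and secondary keys. The sketch below, assuming the sample order at the end of this article, lists the CDs from most to least expensive and breaks ties by title.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="ORDERS/ORDER/CD">
      <!-- Primary key: price as a number, highest first -->
      <xsl:sort select="PRICE" data-type="number" order="descending"/>
      <!-- Secondary key: title, alphabetical -->
      <xsl:sort select="TITLE"/>
      <xsl:value-of select="TITLE"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="PRICE"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data, Empire Burlesque (10.90) should come first and the two CDs priced 9.90 last, Greatest Hits before Hide your heart.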
xsl:copy
The xsl:copy element copies the current node into the output tree, without its attributes or children. The content of the xsl:copy element is a template that determines what is placed inside the copy. The related xsl:copy-of element copies the nodes chosen by its select attribute together with all their attributes and descendants. Both are often useful when transforming a document from one XML format to another.
<xsl:template match="TITLE">
<xsl:copy-of select=".."/>
</xsl:template>
( The select=".." tells the processor to copy the parent element of the matched node, including everything it contains )
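A common use of xsl:copy is the so-called identity transform: one rule copies every node as it is, and further rules override it only for the parts that should change. As a sketch, assuming the sample order at the end of this article, the style sheet below reproduces the document but leaves out all QTY elements.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity rule: copy the current node, then process its attributes and children -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Override: an empty template drops QTY elements from the result -->
  <xsl:template match="QTY"/>
</xsl:stylesheet>
```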
Here is an example of a simple XML and XSL document which is used in the article and includes most of the topics covered in this article:
<?xml version="1.0"?>
<ORDERS>
  <ORDER num="1">
    <CD>
      <TITLE artist="Bob Dylan">Empire Burlesque</TITLE>
      <QTY>1</QTY>
      <PRICE>10.90</PRICE>
    </CD>
    <CD>
      <TITLE artist="Bonnie Tyler">Hide your heart</TITLE>
      <QTY>3</QTY>
      <PRICE>9.90</PRICE>
    </CD>
    <CD>
      <TITLE artist="Dolly Parton">Greatest Hits</TITLE>
      <QTY>2</QTY>
      <PRICE>9.90</PRICE>
    </CD>
    <CD>
      <TITLE artist="Gary Moore">Still got the blues</TITLE>
      <QTY>1</QTY>
      <PRICE>10.20</PRICE>
    </CD>
  </ORDER>
</ORDERS>
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/ORDERS/ORDER">
<xsl:variable name="TOT" select="format-number(sum(CD/PRICE),'$ ###.00')" />
<html>
<body>
<table border="2" cellspacing="0" cellpadding="5">
<tr bgcolor="yellow">
<th>Title</th>
<th>Artist</th>
<th>Price</th>
</tr>
<xsl:apply-templates select="CD" />
<tr>
<td colspan="2" align="right">Total</td>
<td bgcolor="lightgreen"><xsl:value-of select="$TOT" /></td>
</tr>
</table>
</body>
</html>
</xsl:template>
<xsl:template match="CD">
<tr>
<td><xsl:value-of select="TITLE"/></td>
<td><xsl:value-of select="TITLE/ @artist"/></td>
<td align="right"><xsl:value-of select="format-number(PRICE,'$ ###.00')"/></td>
</tr>
</xsl:template>
</xsl:stylesheet>
To test your XSL document you could add the URL of the XSL document to the XML document, e.g.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<ORDER>………
where “test.xsl” is the saved XSL document, in the above example found in the same directory as the XML document.
OR
You could use the following Oracle procedure to create an output document using an input and XSL document:
CREATE OR REPLACE PROCEDURE XslParser(p_XmlUrl  varchar2 := 'test.xml',
                                      p_XslUrl  varchar2 := 'test.xsl',
                                      p_FileUrl varchar2 := 'test.htm')
IS
  v_XmlParser xmlparser.parser;
  v_XslParser xmlparser.parser;
  v_XslDoc    xmldom.DOMDocument;
  v_XmlDoc    xmldom.DOMDocument;
  v_XslSheet  xslprocessor.Stylesheet;
  v_XslProc   xslprocessor.Processor;
BEGIN
  -- Parse the source XML document into a DOM tree
  v_XmlParser := xmlparser.newParser;
  xmlparser.setBaseDir(v_XmlParser, 'DIR_INPUT');
  xmlparser.setValidationMode(v_XmlParser, FALSE);
  xmlparser.parse(v_XmlParser, p_XmlUrl);
  v_XmlDoc := xmlparser.getDocument(v_XmlParser);
  xmlparser.freeparser(v_XmlParser);

  -- Parse the XSL document and build a stylesheet from it
  v_XslParser := xmlparser.newParser;
  xmlparser.setBaseDir(v_XslParser, 'DIR_INPUT');
  xmlparser.parse(v_XslParser, p_XslUrl);
  v_XslDoc := xmlparser.getDocument(v_XslParser);
  v_XslSheet := xslprocessor.newStylesheet(v_XslDoc, p_XslUrl);

  -- Apply the stylesheet to the XML document and write the result to p_FileUrl
  v_XslProc := xslprocessor.newProcessor;
  xslprocessor.processXSL(v_XslProc, v_XslSheet, v_XmlDoc, 'DIR_INPUT', p_FileUrl);
  xmlparser.freeparser(v_XslParser);
EXCEPTION
  WHEN OTHERS
  THEN
    -- Re-raise unchanged so the caller sees the original error
    RAISE;
END XslParser;
Where ‘DIR_INPUT’ is an Oracle directory object.