XML documents are everywhere these days, and so is XML Schema (XSD) for validating them. Validation is typically needed to check structure, content and relations between elements. XSD covers only part of that: it can describe the basic structure (which elements are valid, and in what order) and do basic content validation of a single XML node. Schematron can be used to cover the rest:

  • Advanced structure validation
    e.g. element A should have attribute X or attribute Y, but not both and always one of them
  • Structure depending on content
    e.g. when attribute A of element B has value ‘x’, then B should have a child element C
  • Content validation on multiple nodes
    e.g. sum of all percentage elements should be 100
  • Relations between elements
    e.g. for each employee element with a manager attribute there should be another employee element whose id attribute has the same value as that manager attribute (in other words, the manager of an employee should exist)
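
As a preview, the first two kinds of rule could look roughly like this in Schematron. This is only a minimal sketch: the names A, B, C, X and Y are the placeholders from the list above, the namespace is the one used in the examples further down, and the rule and assert elements are explained in the next paragraphs.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="Exactly one of X and Y">
    <rule context="A">
      <!-- the message appears unless exactly one of the two attributes is present -->
      <assert test="(@X or @Y) and not(@X and @Y)">A needs either X or Y, but not both</assert>
    </rule>
  </pattern>
  <pattern name="Child C required when A is x">
    <!-- only B elements whose attribute A equals 'x' are checked -->
    <rule context="B[@A='x']">
      <assert test="C">B with A='x' needs a child element C</assert>
    </rule>
  </pattern>
</schema>
```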


A Schematron schema is itself an XML document, just like an XSD. Each validation rule is defined by a rule element. Its context attribute defines the node (or nodes) of your target XML to which the rule applies, written as an XPath expression. Say we have an XML document with Department elements and we want to define a rule on each Department element:

<rule context="Department">

A rule element has one or more report and/or assert elements. Both have a test attribute containing the actual check. The only difference between the two is that a report element produces its message when the test evaluates to (boolean) true, whereas an assert element produces its message when the test evaluates to (boolean) false.
The test attribute is also written as an XPath expression.
Let’s say our Department element has two attributes, “name” and “abbr” (abbreviation), and we define two rules:

  1. abbr should contain at least two characters
  2. abbr should contain fewer characters than name

Defining these rules in Schematron results in the following (inside an XML attribute value the < operator has to be written as &lt;):

<rule context="Department">
  <report test="string-length(@abbr) &lt; 2">Abbreviation too short</report>
  <assert test="string-length(@abbr) &lt; string-length(@name)">Abbreviation too long</assert>
</rule>
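
Since report and assert only differ in when the message fires, any check can be written either way by negating the test. A small sketch of the first rule in both forms (either line on its own is enough; having both would simply emit the message twice):

```xml
<rule context="Department">
  <!-- report: the message is emitted when the test is true -->
  <report test="string-length(@abbr) &lt; 2">Abbreviation too short</report>
  <!-- equivalent assert: the message is emitted when the test is false -->
  <assert test="string-length(@abbr) &gt;= 2">Abbreviation too short</assert>
</rule>
```

Which form to pick is mostly a matter of readability: assert states what should hold, report states what should not happen.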

To complete the Schematron document, each rule element is a child of a pattern element. The pattern element groups rules and gives the group a name, which mainly helps readability. One thing to be aware of: within a single pattern a node is only checked by the first rule whose context matches it, so rules that target the same nodes belong in separate patterns.
The schema element is the root element of a Schematron document and contains the pattern elements.
Our example as a complete Schematron document can be seen in the source below.

<?xml version="1.0" encoding="windows-1252" ?>
<schema xmlns="http://www.ascc.net/xml/schematron" >
  <pattern name="Number of characters in abbr attribute">
    <rule context="Department">
      <report test="string-length(@abbr) &lt; 2">Abbreviation too short</report>
      <assert test="string-length(@abbr) &lt; string-length(@name)">Abbreviation too long</assert>
    </rule>
  </pattern>
</schema>
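
To see the two rules in action, here is a small hypothetical input document. The Company root element and the department values are made up for illustration; the comments note which message each Department would be expected to trigger.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Company>
  <!-- abbr has 1 character: the report test is true, so "Abbreviation too short" -->
  <Department name="Sales" abbr="S"/>
  <!-- abbr (3) is not shorter than name (2): the assert test is false, so "Abbreviation too long" -->
  <Department name="HR" abbr="HRM"/>
  <!-- abbr (3) has at least 2 characters and is shorter than name (5): no message -->
  <Department name="Sales" abbr="sal"/>
</Company>
```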

Below is another example, with the rule that the Percent elements within each Total element should add up to 100.

<?xml version="1.0" encoding="windows-1252" ?>
<schema xmlns="http://www.ascc.net/xml/schematron" >
     <pattern name="Sum equals 100%.">
          <rule context="Total">
               <assert test="sum(.//Percent) = 100">Sum is not 100%.</assert>
          </rule>
     </pattern>
</schema>
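
A hypothetical input for this rule might look as follows; the Totals wrapper and the numbers are assumptions made up for illustration. It also shows why the path in the test matters: a test that starts with // selects every Percent element in the whole document (here 60 + 40 + 50 + 30 = 180, for each Total), whereas a relative path such as .//Percent only selects the ones inside the Total being checked.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Totals>
  <!-- relative sum: 60 + 40 = 100, no message -->
  <Total>
    <Percent>60</Percent>
    <Percent>40</Percent>
  </Total>
  <!-- relative sum: 50 + 30 = 80, so "Sum is not 100%." -->
  <Total>
    <Percent>50</Percent>
    <Percent>30</Percent>
  </Total>
</Totals>
```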

Before continuing with a more complex example involving relations between elements: how do we get this to work?
In fact, that’s quite easy. You only need to be able to run XSLT transformations!
The beauty of Schematron is that it is not a new technology, just a clever use of XSLT. No new language is needed and you don’t even need to learn XSLT; basic XPath and XML are sufficient.

The trick is to transform your Schematron XML, which contains your validation rules, into an XSLT stylesheet that contains those same rules. You then validate XML documents by running that generated stylesheet against them. And how do you generate the stylesheet? Yep, also with an XSLT transformation: you transform your Schematron XML with a stylesheet provided by Schematron (iso_schematron_skeleton_for_xslt1.xsl or iso_schematron_skeleton_for_xslt2.xsl, downloadable from schematron.com).
So it’s a two-step approach. First you transform your Schematron rules with the provided Schematron stylesheet, which gives you a new stylesheet containing your rules. Then you run that generated stylesheet against the XML documents you want to validate. This final transformation outputs your error messages, or nothing at all when validation succeeds.
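
To make the idea of a generated stylesheet a bit more concrete, here is a hand-written simplification of what step one could produce for the Department rules. This is an assumption for illustration only, not the actual output of the skeleton, which is considerably more elaborate; it just shows how report and assert can be mapped onto plain XSLT tests.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- the rule context becomes a template match -->
  <xsl:template match="Department">
    <!-- report: output the message when the test is true -->
    <xsl:if test="string-length(@abbr) &lt; 2">Abbreviation too short&#10;</xsl:if>
    <!-- assert: output the message when the test is false -->
    <xsl:if test="not(string-length(@abbr) &lt; string-length(@name))">Abbreviation too long&#10;</xsl:if>
    <!-- keep walking the tree so nested elements are visited too -->
    <xsl:apply-templates/>
  </xsl:template>

  <!-- suppress the text content of the input document -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Running a stylesheet like this against a valid document produces no output at all, which matches the behaviour described above.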

Figure: Schematron two-step validation process

In a production environment the rules are usually predefined and change rarely, if at all, so the generated XSLT can be stored (or cached) and reused.

To show what Schematron validation can do, I’ll finish this blog with the promised complex example with rules on relations between elements.
Let’s start with the target XML, i.e. the XML to be validated. With example data, the Schematron rules are easier to understand.

<?xml version="1.0" encoding="windows-1252" ?>
<Company>
  <Department name="The Floor" abbr="fl">
    <Employees>
      <Employee id="10" manager="15">
        <Name>J. Jansen</Name>
        <Salary>1000</Salary>
      </Employee>
      <Employee id="11" manager="20">
        <Name>P. Klaasen</Name>
        <Salary>1100</Salary>
      </Employee>
    </Employees>
  </Department>
  <Department name="Managers" abbr="man">
    <Employees>
      <Employee id="15" manager="25">
        <Name>M. A. Neger</Name>
        <Salary>1700</Salary>
      </Employee>
      <Employee id="20" manager="25">
        <Name>L.E. Ader</Name>
        <Salary>1500</Salary>
      </Employee>
      <Employee id="25">
        <Name>P.R. Esident</Name>
        <Salary>2500</Salary>
      </Employee>
    </Employees>
  </Department>
</Company>

We want to implement the following business rules:

  • Every employee of department “The Floor” should earn less than every manager (= employee in department “Managers”).
  • An employee may not be his or her own manager.
  • There is exactly one manager without a manager (only one president).
  • The manager relation must be valid, so the manager of an employee should exist: for each employee with a manager attribute there must be a manager whose id attribute has the same value.

In Schematron XML these rules result in:

<?xml version="1.0" encoding="windows-1252" ?>
<schema xmlns="http://www.ascc.net/xml/schematron" >
  <pattern name="All floor emp earn less than managers">
    <rule context="Department[@name='The Floor']/Employees/Employee">
      <report test="Salary &gt;= //Department[@name='Managers']/Employees/Employee/Salary">Too much</report>
    </rule>
  </pattern>
  <pattern name="Emp not own manager">
    <rule context="Employee[@manager]">
      <assert test="@manager != @id">Own manager</assert>
    </rule>
  </pattern>
  <pattern name="Only one manager without manager">
    <rule context="Department[@name='Managers']/Employees">
      <assert test="count(Employee[not(@manager)]) = 1">More than one president</assert>
    </rule>
  </pattern>
  <pattern name="Manager relation exists">
    <rule context="Employee[@manager]">
      <assert test="/Company/Department[@name='Managers']/Employees/Employee[@id=current()/@manager]">Not a valid manager</assert>
    </rule>
  </pattern>
</schema>
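
To get a feel for what these rules catch, here is a hypothetical variant of the company document with a few deliberate mistakes. All employees and values are made up, and the comments note which message I would expect for each, assuming the Department elements carry name and abbr attributes as the rules require.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Company>
  <Department name="The Floor" abbr="fl">
    <Employees>
      <!-- "Own manager", and also "Not a valid manager" since id 12 is not in Managers -->
      <Employee id="12" manager="12">
        <Name>A. Example</Name>
        <Salary>900</Salary>
      </Employee>
      <!-- "Not a valid manager": there is no employee with id 99 -->
      <Employee id="13" manager="99">
        <Name>B. Example</Name>
        <Salary>950</Salary>
      </Employee>
      <!-- "Too much": 1600 is more than the 1500 earned by manager 20 -->
      <Employee id="14" manager="20">
        <Name>C. Example</Name>
        <Salary>1600</Salary>
      </Employee>
    </Employees>
  </Department>
  <Department name="Managers" abbr="man">
    <!-- "More than one president": employees 25 and 26 both lack a manager attribute -->
    <Employees>
      <Employee id="20" manager="25">
        <Name>D. Example</Name>
        <Salary>1500</Salary>
      </Employee>
      <Employee id="25">
        <Name>E. Example</Name>
        <Salary>2500</Salary>
      </Employee>
      <Employee id="26">
        <Name>F. Example</Name>
        <Salary>2600</Salary>
      </Employee>
    </Employees>
  </Department>
</Company>
```

Note that employee 12 triggers two messages. That only works because the "Emp not own manager" and "Manager relation exists" rules live in separate patterns, even though they share the same context.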

More information can be found at schematron.com.
An easy step by step tutorial can be found here.