Monday, December 03, 2012

Ruminating on Schematron and XML Validation

Schematron is a rule-based XML validation language. It is used to make assertions about patterns found in the nodes of an XML document. For example: if a person (element) has "Mr" as its prefix (attribute), then its gender (child element) should be "male". If the assertion fails, an error message supplied by the author of the schema can be displayed.
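
To make the example concrete, here is a minimal sketch in Java of what such a rule boils down to: an XPath context that selects the nodes the rule applies to, an XPath test that must hold for each of them, and an author-supplied message. It does not use a Schematron processor; it only evaluates the two XPath expressions with the JDK's javax.xml.xpath API. The element and attribute names (people, person, prefix, gender) and the class name are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PrefixGenderRule {

    // Context: the nodes the rule applies to (what a Schematron rule calls its context).
    static final String CONTEXT = "//person[@prefix='Mr']";
    // Test: what must be true for every context node (the assertion).
    static final String TEST = "gender = 'male'";
    // Message supplied by the schema author, shown when the assertion fails.
    static final String MESSAGE = "A person with prefix 'Mr' must have gender 'male'.";

    /** Returns the number of context nodes for which the assertion fails. */
    static int countFailures(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList people = (NodeList) xpath.evaluate(
                CONTEXT, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        int failures = 0;
        for (int i = 0; i < people.getLength(); i++) {
            // Evaluate the test relative to the current person element.
            boolean ok = (Boolean) xpath.evaluate(TEST, people.item(i), XPathConstants.BOOLEAN);
            if (!ok) {
                failures++;
                System.out.println(MESSAGE);
            }
        }
        return failures;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical instance document: the second person violates the rule.
        String xml = "<people>"
                + "<person prefix='Mr'><gender>male</gender></person>"
                + "<person prefix='Mr'><gender>female</gender></person>"
                + "<person prefix='Ms'><gender>female</gender></person>"
                + "</people>";
        System.out.println("Failed assertions: " + countFailures(xml));
    }
}
```

In a real Schematron schema the context, test and message would be declared in the schema file rather than hard-coded, but the evaluation model is the same.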

Thus Schematron can express constraints that relate one part of a document to another, which cannot be expressed using a DTD or XML Schema 1.0. With those schema languages you can only constrain the document structure and perform basic datatype validation. The following sites give a good introduction to the language.

 http://www.schematron.com/
 http://www.ldodds.com/papers/schematron_xsltuk.html

So how do we apply a Schematron schema to validate XML instance documents? Given below are the simple steps one should follow:
  1. Create a Schematron schema file.
  2. Use a meta-stylesheet to convert this schema file into an XSLT stylesheet. (A meta-stylesheet is a stylesheet that generates other stylesheets.)
  3. Run the generated XSLT stylesheet against the XML instance document using an XSLT transformation engine. The stylesheet acts as the validator.
  4. The output of the transformation is an XML document listing the validation errors.
Since the fundamental technologies used are XML, XSLT and XPath, most of the XML processing APIs of Java/.NET can be used for this.
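
The steps above can be sketched with the JDK's built-in XSLT support (javax.xml.transform). This is only an illustration under stated assumptions: the stylesheet below is a hand-written stand-in for what a meta-stylesheet might generate in step 2, not real generated output, and the report element names (report, failed-assert), the class name and the method names are hypothetical. In a real setup, step 2 would itself be one more call to the same transform helper, with the meta-stylesheet as the stylesheet and the Schematron schema as the input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SchematronPipelineSketch {

    // Assumption: hand-written stand-in for the validator stylesheet of step 2.
    // It checks that every person with prefix "Mr" has gender "male".
    static final String VALIDATOR_XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<report><xsl:apply-templates select='//person[@prefix=\"Mr\"]'/></report>"
        + "</xsl:template>"
        + "<xsl:template match='person'>"
        + "<xsl:if test='not(gender = \"male\")'>"
        + "<failed-assert>A person with prefix Mr must have gender male.</failed-assert>"
        + "</xsl:if>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies an XSLT stylesheet to an XML document and returns the result as text. */
    static String transform(String xslt, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    /** Steps 3 and 4: run the validator stylesheet and return the XML report. */
    static String validate(String instanceXml) throws Exception {
        return transform(VALIDATOR_XSLT, instanceXml);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical instance document that violates the rule.
        String instance = "<people>"
                + "<person prefix='Mr'><gender>female</gender></person>"
                + "</people>";
        System.out.println(validate(instance));
    }
}
```

Because both step 2 and step 3 are ordinary XSLT transformations, the same few lines of transformation code cover the whole pipeline; only the stylesheet and the input document change.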