XSLT Variable Scoping Differences Across MSXML Versions

Subtle differences in XSLT variable scoping between MSXML 3.0 and 4.0 can break your stylesheets when you upgrade your version of MSXML. Consider the following XSLT:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"
                version="1.0"
                encoding="UTF-8"
                indent="yes" />

    <xsl:template match="/">
        <root>
            <elem>
                <xsl:variable name="foo">Value</xsl:variable>
                <xsl:value-of select="$foo" />
            </elem>
            <elem>
                <!-- This refers to the variable defined in
                     the previous sibling elem node -->
                <xsl:value-of select="$foo" />
            </elem>
        </root>
    </xsl:template>
</xsl:stylesheet>

This stylesheet (which does not depend on the input XML) works on MSXML 3.0 but fails on MSXML 4.0 with the error message

A reference to variable or parameter ‘foo’ cannot be resolved. The variable or parameter may not be defined, or it may not be in scope.

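Clearly, MSXML 4.0 limits the scope of the foo variable to the first elem node, whereas MSXML 3.0 does not. The MSXML 4.0 behavior matches the XSLT 1.0 Recommendation, under which a variable binding is visible only to the following siblings of the xsl:variable element and their descendants. I suspect MSXML 3.0 scopes a variable to its enclosing template.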

These scoping differences cut both ways. Consider this attempt to fix the XSLT:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"
                version="1.0"
                encoding="UTF-8"
                indent="yes" />

    <xsl:template match="/">
        <root>
            <elem>
                <xsl:variable name="foo">Value</xsl:variable>
                <xsl:value-of select="$foo" />
            </elem>
            <elem>
                <xsl:variable name="foo">Value</xsl:variable>
                <xsl:value-of select="$foo" />
            </elem>
        </root>
    </xsl:template>
</xsl:stylesheet>

This stylesheet works on MSXML 4.0 but fails on MSXML 3.0 with the error message

Variable or parameter ‘foo’ cannot be defined twice within the same template.

If you want the stylesheet to work on both processors, you must move the variable declaration up to a common ancestor of both elem nodes, as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"
                version="1.0"
                encoding="UTF-8"
                indent="yes" />

    <xsl:template match="/">
        <root>
            <xsl:variable name="foo">Value</xsl:variable>
            <elem>
                <xsl:value-of select="$foo" />
            </elem>
            <elem>
                <xsl:value-of select="$foo" />
            </elem>
        </root>
    </xsl:template>
</xsl:stylesheet>

Be careful. Even the smallest of changes can break your software.

Selecting A Maximum Value Using XPath

Let’s say you have an XML file which contains daily stock prices, such as the following:

<Prices>
  <Price>
    <Date>2006-09-01</Date>
    <Open>25.89</Open>
    <High>25.97</High>
    <Low>25.64</Low>
    <Close>25.84</Close>
    <Volume>31594600</Volume>
    <AdjClose>25.84</AdjClose>
  </Price>
  <Price>
    <Date>2006-08-31</Date>
    <Open>25.87</Open>
    <High>25.98</High>
    <Low>25.68</Low>
    <Close>25.70</Close>
    <Volume>26380500</Volume>
    <AdjClose>25.70</AdjClose>
  </Price>
  ...
</Prices>

Excerpt from MSFT.xml generated on 2006-09-05 from Yahoo Finance’s MSFT Historical Prices and YahooCsvToXml.py

Now let’s write an XSLT fragment which displays the Price element with the latest Date:

<xsl:for-each select="/Prices/Price">
  <xsl:sort select="Date" order="descending" />

  <xsl:if test="position() = 1">
    <xsl:copy-of select="." />
  </xsl:if>
</xsl:for-each>

What if you wanted to do this in pure XPath 1.0? Well, normally one would use something akin to Jeni Tennison’s XPath maximum ‘trick’ and write the following XPath expression:

/Prices/Price[not(preceding-sibling::Price/Date > Date or
                  following-sibling::Price/Date > Date)]

This expression reads “Select the Price element that doesn’t have a sibling Price element with a Date greater than this one.” (By the way, you should be careful with this XPath expression and large node sets: it is highly likely it runs in O(n²) time.)

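Unfortunately the above expression doesn’t work for dates because XPath 1.0’s relational operators (<, <=, > and >=) convert their operands to numbers rather than comparing them as strings. A value such as 2006-09-01 converts to NaN, every comparison against NaN is false, and so the predicate is true for every Price element. I tried writing the equivalent expression using Microsoft’s ms:string-compare XPath extension function but it didn’t work. I believe this is because it only compares two strings, whereas the expression requires a function that compares a node-set to a string and returns a node-set.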

As far as I can tell, the only way to perform this selection in pure XPath 1.0 is to change the original XML by converting the Date values to numbers (by removing the dashes). Hopefully XPath 2.0 will have a more palatable solution.
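
As a rough sketch of that workaround, the stylesheet below assumes the Date values have already been rewritten without dashes (for example 20060901 instead of 2006-09-01); that rewritten form is an assumption, not something the original MSFT.xml contains. With numeric dates, the maximum can be selected with a single predicate that compares each Price against all Price elements under the same parent:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" />

    <!-- Assumption: every Date has been rewritten without dashes,
         e.g. <Date>20060901</Date>, so that it converts to a number. -->
    <xsl:template match="/">
        <!-- Copy each Price for which no Price under the same parent
             has a numerically greater Date. If several Price elements
             share the latest Date, all of them are copied. -->
        <xsl:copy-of select="/Prices/Price[not(../Price/Date &gt; Date)]" />
    </xsl:template>
</xsl:stylesheet>
```

The comparison includes the context Price itself, which is harmless because a value is never greater than itself. The quadratic cost noted earlier most likely still applies.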

Disabling Default XSLT Templates

As XSLT developers quickly learn, the W3C XSLT Recommendation requires all XSLT processors to implement a number of built-in rules. Per the spec, these are the built-in rules:

<xsl:template match="* | /">
  <xsl:apply-templates />
</xsl:template>

All XML elements apply child templates recursively

<xsl:template match="* | /" mode="m">
  <xsl:apply-templates mode="m" />
</xsl:template>

All XML elements apply child templates recursively for every processing mode m

<xsl:template match="text() | @*">
  <xsl:value-of select="." />
</xsl:template>

All text nodes and attributes return the value of their contents

<xsl:template match="processing-instruction() | comment()" />

All comments and processing instructions are ignored

The net effect of these implicit rules is that an XSLT stylesheet without any templates defined will simply output the text of every text node in the XML document, concatenated in document order. Attribute values are ignored because <xsl:apply-templates /> without a select attribute processes only the children of the current node, and attributes are not children of their element.
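
To see the built-in rules on their own, here is a minimal sketch of a stylesheet that defines no templates at all. The input document shown in the comment is hypothetical and chosen only to keep the result short:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- No templates are defined, so only the built-in rules run.

     Hypothetical input:   <a id="1"><b>x</b><c>y</c></a>
     Resulting text:       xy

     The id attribute does not appear in the output because
     attributes are not children of the a element. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" />
</xsl:stylesheet>
```

With an indented input document, the whitespace between elements consists of text nodes too, so it would be copied to the output as well.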

There are many instances where these default templates are useful, but I often find they mask bugs in my stylesheet (e.g. when I mistype a template match expression). Instead, I usually prefer that the stylesheet fail, preferably loudly, when it comes across an unanticipated XML element. I use the following XSLT fragment to achieve this behavior:

<xsl:template match="*">
  <xsl:message terminate="yes">
    <xsl:text>ERROR: Unhandled XML element: </xsl:text>
    <xsl:value-of select="name(.)" />
  </xsl:message>
</xsl:template>

When I desire the default apply-templates behavior, I add an explicit handler:

<!-- Enable default apply-templates behavior for these elements -->
<xsl:template match="/a/b/c | /a/d/e | ...">
  <xsl:apply-templates />
</xsl:template>
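
Putting the pieces together, a complete stylesheet might look like the sketch below. It borrows the Prices and Price element names from the stock price example purely for illustration. The explicit templates win over the catch-all because a match pattern consisting of a bare element name has a default priority of 0, while * has a default priority of -0.5:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" />

    <!-- Illustrative element names: a known container... -->
    <xsl:template match="Prices">
        <xsl:apply-templates select="Price" />
    </xsl:template>

    <!-- ...and a known child, printed as one Date per line. -->
    <xsl:template match="Price">
        <xsl:value-of select="Date" />
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

    <!-- Catch-all: default priority -0.5, so the templates above
         are chosen whenever they match. -->
    <xsl:template match="*">
        <xsl:message terminate="yes">
            <xsl:text>ERROR: Unhandled XML element: </xsl:text>
            <xsl:value-of select="name(.)" />
        </xsl:message>
    </xsl:template>
</xsl:stylesheet>
```

The catch-all matches elements only, so the built-in rule for the root node still runs and hands the document element to these templates. If the document element were named anything other than Prices, the transformation should stop with the error message.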

XSLT Number Formatting Notes

When using XSLT’s format-number() function to format a decimal, consider using a zero in the least significant place of the integer part of your format string. This ensures that a number whose integer part is 0 still displays that 0.

For example:

Number   format-number using #,###   format-number using #,##0
12345    12,345                      12,345
5        5                           5
0        No output!                  0

This also applies to decimals:

Number   format-number using #.00   format-number using 0.00
5        5.00                       5.00
0        .00 (Note no leading 0)    0.00
0.1234   .12 (Note no leading 0)    0.12
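
If you want to check how your own processor behaves, a small test stylesheet along these lines may help. It is only a sketch: the named template show and its pattern parameter are names invented for this example, and the output may differ between processors.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" />

    <!-- Formats the number 0 with each of the four format strings
         from the tables above. Does not depend on the input XML. -->
    <xsl:template match="/">
        <xsl:call-template name="show">
            <xsl:with-param name="pattern" select="'#,###'" />
        </xsl:call-template>
        <xsl:call-template name="show">
            <xsl:with-param name="pattern" select="'#,##0'" />
        </xsl:call-template>
        <xsl:call-template name="show">
            <xsl:with-param name="pattern" select="'#.00'" />
        </xsl:call-template>
        <xsl:call-template name="show">
            <xsl:with-param name="pattern" select="'0.00'" />
        </xsl:call-template>
    </xsl:template>

    <!-- Hypothetical helper: prints the format string followed by the
         formatted value in brackets, so an empty result is visible. -->
    <xsl:template name="show">
        <xsl:param name="pattern" />
        <xsl:value-of select="$pattern" />
        <xsl:text>: [</xsl:text>
        <xsl:value-of select="format-number(0, $pattern)" />
        <xsl:text>]&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

On a processor that behaves as described in the tables, the first line would show empty brackets and the third would show .00 without a leading zero.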