Let’s say you have an XML file which contains daily stock prices, such as the following:
<Prices>
  <Price>
    <Date>2006-09-01</Date>
    <Open>25.89</Open>
    <High>25.97</High>
    <Low>25.64</Low>
    <Close>25.84</Close>
    <Volume>31594600</Volume>
    <AdjClose>25.84</AdjClose>
  </Price>
  <Price>
    <Date>2006-08-31</Date>
    <Open>25.87</Open>
    <High>25.98</High>
    <Low>25.68</Low>
    <Close>25.70</Close>
    <Volume>26380500</Volume>
    <AdjClose>25.70</AdjClose>
  </Price>
  ...
</Prices>
Excerpt from MSFT.xml generated on 2006-09-05 from Yahoo Finance’s MSFT Historical Prices and YahooCsvToXml.py
Now let’s write an XSLT fragment which displays the Price element with the latest Date:
<xsl:for-each select="/Prices/Price">
  <xsl:sort select="Date" order="descending" />
  <xsl:if test="position() = 1">
    <xsl:copy-of select="." />
  </xsl:if>
</xsl:for-each>
What if you wanted to do this in pure XPath 1.0? Well, normally one would use something akin to Jeni Tennison’s XPath maximum ‘trick’ and write the following XPath expression:
/Prices/Price[not(preceding-sibling::Price/Date > Date or
following-sibling::Price/Date > Date)]
This expression reads “Select the Price element that doesn’t have a sibling Price element with a Date greater than this one.” (By the way, you should be careful with this XPath expression and large node sets: every Price is compared against all of its siblings, so it most likely runs in O(n²) time.)
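Unfortunately the above expression doesn’t work for dates because XPath 1.0’s relational operators (<, <=, > and >=) only compare numbers, not strings. Both operands are converted to numbers first, a string such as 2006-09-01 converts to NaN, and any comparison involving NaN is false. The predicate is therefore true for every Price, so the expression selects all of them rather than just the latest one. I tried writing the equivalent expression using Microsoft’s ms:string-compare XPath extension function but it didn’t work. I believe this is because it only compares two strings, whereas the expression requires a function that compares a node-set to a string and returns a node-set.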
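To see why the date comparison goes wrong, here is a rough Python emulation of how XPath 1.0 treats the > operator on two strings: each side is converted to a number first, and anything that isn't a plain decimal becomes NaN. The helper names xpath_number and xpath_greater are made up for this sketch, and the regular expression is only my approximation of the XPath number format, not a real XPath engine.

```python
import math
import re

# Approximation of the XPath 1.0 number format: optional minus sign,
# digits with an optional fraction, optional surrounding whitespace.
_XPATH_NUMBER = re.compile(r"^\s*-?(\d+(\.\d*)?|\.\d+)\s*$")

def xpath_number(text):
    """Roughly emulate XPath 1.0 number(): NaN unless text is a plain decimal."""
    return float(text) if _XPATH_NUMBER.match(text) else math.nan

def xpath_greater(left, right):
    """Emulate `left > right` on two strings: both sides become numbers first."""
    return xpath_number(left) > xpath_number(right)

dates = ["2006-09-01", "2006-08-31"]

# Emulate the predicate: keep a date if no date in the set compares greater.
# (Comparing a date with itself is harmless, since it is never greater.)
selected = [d for d in dates
            if not any(xpath_greater(other, d) for other in dates)]
print(selected)
```

Because every comparison involves NaN and is therefore false, nothing is ever filtered out and both dates survive the predicate.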
As far as I can tell, the only way to perform this selection in pure XPath 1.0 is to change the original XML by converting the Date values to numbers (by removing the dashes). Hopefully XPath 2.0 will have a more palatable solution.
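Here is a minimal sketch of that conversion in Python, using only the standard library. The SAMPLE string is a cut-down version of the excerpt above and dates_to_numbers is a name invented for this example. Python's ElementTree doesn't support the sibling axes, so the final selection is an emulation of the XPath predicate in plain Python rather than an actual XPath evaluation.

```python
import xml.etree.ElementTree as ET

# Cut-down sample in the same shape as the MSFT.xml excerpt.
SAMPLE = """<Prices>
  <Price><Date>2006-09-01</Date><Close>25.84</Close></Price>
  <Price><Date>2006-08-31</Date><Close>25.70</Close></Price>
</Prices>"""

def dates_to_numbers(root):
    """Rewrite every Price/Date in place, e.g. 2006-09-01 becomes 20060901."""
    for date in root.iterfind("Price/Date"):
        date.text = date.text.replace("-", "")
    return root

root = dates_to_numbers(ET.fromstring(SAMPLE))

# Stand-in for the XPath predicate: keep each Price for which no other
# Price has a numerically greater Date.
prices = root.findall("Price")
latest = [p for p in prices
          if not any(float(o.findtext("Date")) > float(p.findtext("Date"))
                     for o in prices)]
print([p.findtext("Date") for p in latest])
```

Once the dashes are gone the Date values are plain numbers, so the original XPath expression should work unchanged against the converted document.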