XSLT Tips for Cleaner Code and Better Performance

XSLT is a transformation language to convert XML from one format to another (or to another text-based output).

People seem to love or hate XSLT. Some find it hard to read or strange to get used to. Yet, it can be quite elegant when coded right. So this will be the first in a series of posts to show where it can be useful (and what its pitfalls/annoyances may be), how to make best use of XSLT, etc.

This first post looks at coding style in XSLT 1.0 and XPath 1.0.

I think some frustration with this technology comes from wanting to do procedural programming with it, whereas it is really more like a functional, declarative language: you define rules (templates) that say what to output when a given node is matched, rather than spelling out step by step how to walk the document (kind of).

For example, consider the following named template, which might be used to create a link to a product:

<xsl:template name="CreateLink">
  <xsl:param name="product" />

  <xsl:element name="a">
    <xsl:attribute name="href">
      <xsl:value-of select="'/product/?id='" /><xsl:value-of select="normalize-space($product/@id)" />
    </xsl:attribute>
    <xsl:value-of select="$product/name" />
  </xsl:element>
</xsl:template>

I have found the above to be a common way people initially code their XSLTs. Yet, the following is far neater:

<xsl:template match="product">
  <a href="{concat('/product/?id=', normalize-space(./@id))}">
    <xsl:value-of select="./name" />
  </a>
</xsl:template>

Not only does such neater coding become easier to read and maintain, but it can even improve performance.

(Update: As Azat rightly notes in a comment below, the use of ‘./’ is redundant. That is definitely true. I should have added originally that I tend to use it to help others in the team, especially those newer to XSLT, see more clearly which element the template is running against.)

Let's look at a few tips on how this may be possible (a future post will concentrate on additional performance-related tips; the tips below are primarily on coding style):

Avoid XSLT Named Templates; Use Template Match

The first coding practice that leads to code bloat and hard to read XSLT is using named templates everywhere. Named templates give a procedural feel to coding. (You define templates with names, pass parameters as needed and do some stuff). This may feel familiar to most coders, but it really misses the elegance and flexibility of XSLT.

So, instead of this:

<xsl:template name="CreateLink">
  <xsl:param name="product" />

  <!-- create the link here based on the product parameter -->
</xsl:template>

<!-- The above would be called from elsewhere using this: -->
<xsl:call-template name="CreateLink">
  <xsl:with-param name="product" select="./product" />
</xsl:call-template>

Far neater would be this:

<xsl:template match="product">
  <!-- create the link here based on the current product element -->
</xsl:template>

<!-- The above would be invoked from elsewhere using this: -->
<xsl:apply-templates select="./product" />

The above example doesn’t look like much on its own. When you have a real stylesheet with lots of template matches (and modes, which we look at later), this gets a lot easier to read and cuts a LOT of code, especially when calling/applying these templates.

(Of course, each tip has exceptions; named templates can be useful for utility functions. Sometimes XSLT extension objects can be useful for that too, depending on your parser and runtime requirements. A subsequent post on XSLT performance tips will cover that.)
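
As a minimal sketch of such a utility (the FormatPrice name, its amount and currency parameters, the 'GBP' default and the price child element are all hypothetical, chosen only for illustration), a named template can wrap a small formatting rule that does not belong to any one element:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <!-- Hypothetical utility: formats a number as a price with a currency code -->
  <xsl:template name="FormatPrice">
    <xsl:param name="amount" />
    <xsl:param name="currency" select="'GBP'" />
    <xsl:value-of select="concat($currency, ' ', format-number($amount, '#,##0.00'))" />
  </xsl:template>

  <!-- Assumed input: <product><name>...</name><price>1234.5</price></product> -->
  <xsl:template match="product">
    <p>
      <xsl:value-of select="concat(name, ': ')" />
      <xsl:call-template name="FormatPrice">
        <xsl:with-param name="amount" select="price" />
      </xsl:call-template>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

The matched template still does the structural work here; the named template is only called for the small, reusable formatting step.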

Avoid xsl:for-each; Use Template Match

xsl:for-each is another programming construct that would appeal to many coders. But again, it is rarely needed. Let the XSLT processor do the looping for you (it has potential to be optimised further, too).

In some cases, or with some XSLT processors, xsl:for-each may perform a bit quicker because it saves the processor from having to determine which of possibly many matching templates is the one to execute. However, matched templates that use modes can largely overcome that issue, and lend themselves to highly elegant, reusable XSLT.
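
As a rough sketch of the difference, assume a hypothetical input with a products root element containing product children, each with an id attribute and a name child (this structure is an assumption for illustration only). The loop is replaced by a single xsl:apply-templates, and the per-item markup moves into its own matched template:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" />

  <!-- Assumed input: <products><product id="1"><name>...</name></product>...</products> -->
  <xsl:template match="/products">
    <ul>
      <!-- Instead of: <xsl:for-each select="product"><li>...</li></xsl:for-each> -->
      <xsl:apply-templates select="product" />
    </ul>
  </xsl:template>

  <!-- Runs once for each product selected above, in document order -->
  <xsl:template match="product">
    <li>
      <a href="/product/?id={normalize-space(@id)}">
        <xsl:value-of select="name" />
      </a>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

The product template can now be reused from anywhere else in the stylesheet that applies templates to product elements, which a for-each body cannot.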

You don’t have to use xsl:element or xsl:attribute

You can use xsl:element and xsl:attribute, but it leads to very bloated code.

Here are a few examples of what you can do instead. In each example we will just assume we are working with some XML that represents some kind of product (it is not important what this structure is for this discussion).

Use the element name itself rather than xsl:element

Instead of

<xsl:element name="p">
  <xsl:value-of select="$product/name" />
</xsl:element>

This is a lot cleaner to read:

<p>
  <xsl:value-of select="$product/name" />
</p>

Sometimes I prefer this:

<p><xsl:value-of select="$product/name" /></p>

Use the { } shorthand for writing values inside of attributes

Using xsl:value-of for many attributes can get verbose very quickly. There is more code to read, so the code just looks uglier and more bloated. For attributes on literal result elements, you can use the shorthand { as a replacement for <xsl:value-of select=" and } as a replacement for " />. (These attribute value templates are part of XSLT 1.0 itself, so a conforming processor should support them.)

In between { and } you just put in your normal select expression.

So, instead of

<h3>
  <xsl:attribute name="class">
    <xsl:value-of select="$product/@type" />
  </xsl:attribute>
  <xsl:value-of select="$product/name" />
</h3>

This is a lot cleaner to read:

<h3 class="{$product/@type}">
  <xsl:value-of select="$product/name" />
</h3>

Or, instead of

<xsl:element name="img">
  <xsl:attribute name="src"><xsl:value-of select="$product/image/@src" /></xsl:attribute>
  <xsl:attribute name="width"><xsl:value-of select="$product/image/@width" /></xsl:attribute>
  <xsl:attribute name="height"><xsl:value-of select="$product/image/@height" /></xsl:attribute>
  <xsl:attribute name="alt"><xsl:value-of select="$product/image" /></xsl:attribute>
  <xsl:attribute name="class"><xsl:value-of select="$product/@type" /></xsl:attribute>
</xsl:element>

This is a lot cleaner to read:

<img
  src="{$product/image/@src}"
  width="{$product/image/@width}"
  height="{$product/image/@height}"
  alt="{$product/image}"
  class="{$product/@type}"
/>

The above is only put onto multiple lines for this web page. In a proper editor sometimes a one-liner is even easier to read:

<img src="{$product/image/@src}" width="{$product/image/@width}" height="{$product/image/@height}" alt="{$product/image}" class="{$product/@type}" />

The above is also looking a lot like some templating languages now, and you might see why I wonder why there are so many proprietary ones people have to learn, when XSLT is an open, widely supported standard with transferable skills!

The above also doesn’t show how clean the code would really be, because someone using xsl:attribute is likely to use xsl:element as well, so really we should compare the legibility of this:

<xsl:element name="h3">
  <xsl:attribute name="class">
    <xsl:value-of select="$product/@type" />
  </xsl:attribute>
  <xsl:value-of select="$product/name" />
</xsl:element>

… versus this:

<h3 class="{$product/@type}">
  <xsl:value-of select="$product/name" />
</h3>

Use template modes

Often, you will want to use a template match for totally different purposes. Rather than pass unnecessary parameters or resort to different named templates, a mode attribute on the template can do the trick.

For example, suppose you are showing an order history for some e-commerce site. Suppose you want a summary of orders at the top that anchor to the specific entries further down the page.

You can have more than one template have the same match, and use mode to differentiate or indicate what they are used for.

Consider this example. First, here is a starting point in the XSLT. The idea is to process the Orders element twice: once for the summary, and once for the details.

<!-- starting point -->
<xsl:template match="/">
  <h1>Order summary</h1>

  <h2>Summary of orders</h2>
  <p><xsl:apply-templates select="./Orders" mode="summary-info" /></p>

  <h2>Table of orders</h2>
  <xsl:apply-templates select="./Orders" mode="order-summary-details" />
</xsl:template>

Next, we match Orders with the summary-info mode:

<xsl:template match="Orders" mode="summary-info">
  <xsl:value-of select="concat(count(./Order), ' orders, from ', ./Order[1]/@date, ' to ', ./Order[last()]/@date)" />
</xsl:template>

We can also match Orders for the order-summary-details mode. Note how the variable has also re-used the other mode to get the summary for the table’s summary attribute.

<xsl:template match="Orders" mode="order-summary-details">
  <xsl:variable name="summary">
    <xsl:apply-templates select="." mode="summary-info" />
  </xsl:variable>

  <table summary="{normalize-space($summary)}">
    <thead>
      <tr>
        <th scope="col">Order number</th>
        <th scope="col">Amount</th>
        <th scope="col">Status</th>
      </tr>
    </thead>
    <tbody>
      <xsl:apply-templates select="./Order" mode="order-summary-details" />
    </tbody>
  </table>
</xsl:template>

Note how the same mode name can be used for additional matches. This is a neat way to keep related functionality together:

<xsl:template match="Order" mode="order-summary-details">
  <tr>
    <td><a href="/order/details/?id={./@id}"><xsl:value-of select="./@id" /></a></td>
    <td><xsl:value-of select="./amount" /></td>
    <td><xsl:value-of select="./status" /></td>
  </tr>
</xsl:template>

In many real XSLTs I have written, these modes are re-used many times over. They can help with performance while maintaining this elegance and reduction of code, because the XSLT processor can use the mode to narrow down which templates to choose from when looking for the one to execute.

The use of modes (and other features such as importing other XSLTs and overriding moded templates) has allowed us to create multiple sub-sites in parallel (e.g. an ecommerce site that sells books and entertainment products such as CDs, DVDs and computer games) that all run off the same XSLTs, with some minor customisation in each sub-site. Although the actual data is different, it falls into the same XML structure (they are all products, after all!), which makes the XSLTs highly reusable. (A future post will describe arranging XSLTs in an almost object-oriented fashion.)

Use in-built functions: concat()

The concat() function allows you to remove unnecessary and excessive uses of <xsl:value-of /> statements one after the other (and with the accompanying <xsl:text> </xsl:text> type of trick to get a white space in there).

Code looks easier to read, in most cases, and typically performs better too.

Example:

Instead of this:

<xsl:value-of select="$string1" /><xsl:text> </xsl:text><xsl:value-of select="$string2" />

This is much cleaner to read:

<xsl:value-of select="concat($string1, ' ', $string2)" />

Or,

Instead of this:

<a>
	<xsl:attribute name="href">
		<xsl:value-of select="$domain" />/product/?<xsl:value-of select="$someProductId" />
	</xsl:attribute>
	<xsl:value-of select="$productDescription" />
</a>

This is much cleaner to read:

<a href="{concat($domain, '/product/?', $someProductId)}">
	<xsl:value-of select="$productDescription" />
</a>

Storing the string result of a concat() in a variable can also help performance, since the string is computed once and then reused wherever the variable is referenced. (Some DOM-based processors may treat variables holding node-sets differently, because DOM node lists are live collections. More on that in a future post.)

(Update: Azat notes in a comment below that the above href attribute can be even further simplified into this: href="{$domain}/product/?{$someProductId}".)

Use in-built functions: boolean()

How many times have we seen code like this:

<xsl:if test="$new = 'true'"> ... </xsl:if>

While it works, it is not ideal using string comparison, especially if this kind of test is going to be repeated in a template.

It would be better to create a variable using this kind of syntax:

<xsl:variable name="isNew" select="boolean($new = 'true')" />

Then, in your code, when you need to use it, you can do things like:

<xsl:if test="$isNew"> ... </xsl:if>

or

<xsl:if test="$isNew = true()"> ... </xsl:if>

or

<xsl:if test="$isNew = false()"> ... </xsl:if>

or

<xsl:if test="not($isNew)"> ... </xsl:if>

The variations above are down to style/preference, but all are better from a coding perspective than constantly testing strings. (Sometimes the calculation of what true or false means may require testing many values, such as true, True, 1, Y, etc. This can all be hidden away in that one variable declaration, and the rest of the code is unchanged.)
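
A minimal sketch of hiding several "true-like" values behind one variable might look like this. The new attribute on product and the exact list of accepted values are assumptions made for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- Assumed input: <product new="Y"><name>Example</name></product> -->
  <xsl:template match="product">
    <xsl:variable name="new" select="normalize-space(@new)" />

    <!-- All the string testing lives in this one declaration -->
    <xsl:variable name="isNew"
      select="boolean($new = 'true' or $new = 'True' or $new = '1' or $new = 'Y')" />

    <xsl:value-of select="name" />
    <!-- The rest of the template only ever sees a boolean -->
    <xsl:if test="$isNew">
      <xsl:text> (new)</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

If the set of accepted values changes later, only the isNew declaration needs editing.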

(Update: Azat rightly notes in a comment below that the variable declaration can be shortened by omitting the boolean() function, so it is just this: <xsl:variable name="isNew" select="$new = 'true'" />. I find the explicit use of boolean() can aid readability, especially for those new to XSLT, so it may be worth retaining in such situations.)

Use in-built functions: string()

Instead of this:

<xsl:variable name="mystring">my text</xsl:variable>

Consider this:

<xsl:variable name="mystring" select="'my text'" />

Or this:

<xsl:variable name="mystring" select="string('my text')" />

Or, more importantly, instead of this:

<xsl:variable name="bookTitle"><xsl:value-of select="./title" /></xsl:variable>

Consider this:

<xsl:variable name="bookTitle" select="string(./title)" />

Why?

  • Code is cleaner to read.
  • But it can also be more optimal: the first form builds a result tree fragment that has to be converted back to a string whenever it is used, whereas string() stores a plain string that is computed once. (In the W3C DOM, node lists are live collections, which means they may change, so some DOM-based processors may re-evaluate references to nodes each time they are accessed.)

Use in-built functions: number()

For similar reasons to those given above for string(), number() should be used too.
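
As a small sketch (the item element, its quantity attribute and its price child are hypothetical), the values are converted once with number() and the numeric variables are then reused:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- Assumed input: <item quantity="3"><price>9.50</price></item> -->
  <xsl:template match="item">
    <!-- Convert once, then reuse the numeric values -->
    <xsl:variable name="price" select="number(price)" />
    <xsl:variable name="quantity" select="number(@quantity)" />
    <xsl:variable name="total" select="$price * $quantity" />

    <xsl:choose>
      <!-- A missing or non-numeric value makes the total NaN -->
      <xsl:when test="string($total) = 'NaN'">
        <xsl:text>n/a</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="format-number($total, '0.00')" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The NaN check is one possible way to guard against bad data; whether it is needed depends on how much you trust the input.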

Use in-built functions: other

XPath functions such as starts-with() and string-length() are handy.

For example, it is common to see code that tests for the presence of a string by checking whether a variable equals the empty string (''). But as most programmers should know, it is often more efficient to test for the presence of a string by testing its length. In XPath expressions you can use the string-length() function for this.
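
A short sketch of both functions in use follows. The code attribute and the 'BK-' prefix are invented for illustration and do not come from any real data format:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- Assumed input: <product code="BK-1001"><name>Example</name></product> -->
  <xsl:template match="product">
    <xsl:variable name="code" select="normalize-space(@code)" />

    <xsl:choose>
      <!-- Length test instead of comparing against the empty string -->
      <xsl:when test="string-length($code) = 0">
        <xsl:text>no code</xsl:text>
      </xsl:when>
      <!-- Prefix test without any manual substring comparison -->
      <xsl:when test="starts-with($code, 'BK-')">
        <xsl:value-of select="concat('book ', substring-after($code, 'BK-'))" />
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$code" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```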

For more information and a full list of XPath functions, consider the following:

More tips

The above is about XPath 1.0 and XSLT 1.0. Even with the above tips, some XSLT can require more code than ideal, which XSLT 2.0 and XPath 2.0 help to address. The features in those are very useful for sure, but not as widely implemented as 1.0. My experiences are almost entirely in 1.0 which we use in live, production/run-time environments.

Here are some additional useful tips:

Do you have any useful tips to augment/improve the above? Let me know and I will add them above.

17 thoughts on “XSLT Tips for Cleaner Code and Better Performance”

  2. Your code is still cluttered. The following lines are completely identical:

    select="./Orders"
    select="Orders"

    href="{concat($domain, '/product/?', $someProductId)}"
    href="{$domain}/product/?{$someProductId}"

    select="boolean($new = 'true')"
    select="$new = 'true'"

  3. @Azat: Thanks for your comment.

    Re the select and boolean: yes, that is definitely shorter, though I tend to prefer the explicit ./ for some reason. I often find it helps others in the team understand the context aspect which seems to be a tricky bit to get going with, initially.

    Re the href: I used to do this. Somewhere I read that this is slightly less performant, but I can’t remember if it was just for the MSXML processor I was using at the time or as a general thing. (Though using a compiled XSLT Processor, which MSXML also supports, and caching such a processor should minimize that issue, I guess.)

    Do you know if there is such a performance difference in any of the major processors today? Perhaps it is a minimal enough difference these days that it is worth using your suggestion in a good percentage of the cases, anyway.

    You are certainly correct however, and I should have perhaps clarified those things in the post.

    Will update and mention your suggestions when I get a moment.

  4. I understand that avoiding xsl:attribute and using attrib="{A}" is cleaner and faster, but is it possible to avoid outputting empty attributes (attrib="" if A is empty)? Basically this without all the bloat:

  5. @Joshua Hewitt:

    Good point. In XSLT 1.0 I can’t think of a way. I think you would have to use xsl:if under the element and then that means using xsl:attribute in that scenario.

    I believe (though have not had any experience in it) that XSLT 2.0 has additional capabilities and syntax to allow that kind of thing.

    • Hi, I assume you mean HTML in the description element? It looks like it has been escaped with things like &. Instead of that, if you are able to, can you have the HTML be in XML format rather than escaped? If you need to, you could write the HTML as well formed XML and put it in the XHTML namespace. Your XSLT could then match those elements and copy them out.

      This approach also lets you write XSLT templates to do things like filter out code you don’t want, e.g. font tags, or more importantly script tags.

      Hope that helps!

  10. Really, really an eye opening article especially about shorthand expression for most cluttered nodes like attributes, elements,… 🙂

    Thanks a lot for your time and contribution!
