XSLT Performance tip: don’t indent output

Summary: turn off xsl output indenting

When transforming XML via XSLT, make sure the output setting for indenting is turned off and honoured by your code.

Turning it off will

  • Reduce the transform time
  • Reduce the output size

How is XSLT output indent turned off?

For the xml output method, indenting should be off by default (the html output method defaults to on). If for whatever reason it is not off, it can be turned off simply by using this:

<xsl:output indent="no" />
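
If you want to confirm what .NET actually picked up from the stylesheet, one option is to load it and look at the Indent flag on OutputSettings. This is only a minimal sketch: the IndentCheck class and the tiny inline stylesheet are made up for illustration, and your own loading code will differ.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class IndentCheck
{
    // Hypothetical minimal stylesheet, used only for this illustration.
    public const string Stylesheet = @"<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" indent=""no"" />
  <xsl:template match=""/""><out><xsl:copy-of select=""*"" /></out></xsl:template>
</xsl:stylesheet>";

    // Returns the indent flag that the loaded stylesheet asks for.
    public static bool IsIndentEnabled(string stylesheetText)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader reader = XmlReader.Create(new StringReader(stylesheetText)))
        {
            xslt.Load(reader);
        }

        // OutputSettings is derived from the stylesheet's xsl:output element.
        return xslt.OutputSettings.Indent;
    }

    public static void Main()
    {
        Console.WriteLine("Indent requested: " + IsIndentEnabled(Stylesheet));
    }
}
```

A check like this only tells you what the stylesheet requests; the writer you transform into still has to be created from those settings.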

I have seen a lot of .NET XSLT code ignore this important setting. Sometimes it is because the output is written through a writer (over a stream or something else) that doesn’t take its output settings from the XSLT.

When transforming to an XmlWriter in .NET, be sure to create the XmlWriter using the overload that takes the output settings obtained from the loaded XSLT. For example:

XslCompiledTransform xslt = GetXslt(XsltPath);

// Assumption: sb is a StringBuilder that collects the transform result.
var sb = new StringBuilder();

XPathNavigator navigator = _someXmlData.CreateNavigator();

// Fail early with a descriptive exception instead of a hand-thrown NullReferenceException.
if (navigator == null)
    throw new InvalidOperationException("Could not create a navigator for the XML data.");

// The 2nd argument honours the XSLT output settings!
using (XmlWriter w = XmlWriter.Create(sb, xslt.OutputSettings))
{
    XsltArgumentList args = GetXsltArgs();

    xslt.Transform(navigator, args, w);
}

(xsl:output has many other useful attributes worth looking into.)

What kind of savings do you get?

The savings will differ for many reasons (the XML document size, the structure of the XML, the types of transformation being done, etc.), so it is hard to give a definitive figure. But here’s an illustrative example:

A few weeks back, a colleague at work was having trouble with an XSLT that was taking a long time to transform.

We started to have a look; I was expecting to find inefficient XPath, or an XML document structure that wasn’t conducive to decent transformation speed, or something like that.

However, the first thing I noticed was that his output was set to be indented, and the XML input was HUGE.

So, the first thing we did was turn indenting off.

That alone reduced a 12-minute transform (on a 300MB document) to just 1 minute!

In another scenario, a 70K XML document was taking about 0.25 seconds to transform. Turning off indenting shaved off a few more milliseconds (I can’t remember the exact amount now) and saved about 2K in the HTML output that it generated.

Why can this make such a difference?

I think the specifics may vary depending on the XSLT processor you are using, but I believe it is basically this:

  • Each newline, space or tab created for the indent needs an extra text node to hold it, which takes extra memory (though at this point it may not add much time to the transformation).
  • When the result is then saved to a file or written to an output stream, strings are often involved. String processing can be expensive in many programming languages, and every one of these indent text nodes needs handling. For very large documents (as above), this adds up to a lot of unnecessary processing.
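
To get a rough feel for the size side of this, here is a small sketch that runs the same transform twice and compares the length of the output with indenting on and off. Everything in it is made up for illustration (the IndentSizeDemo class, the identity stylesheet and the generated rows), and it measures characters only, not time, so treat any numbers as indicative at best.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

public static class IndentSizeDemo
{
    // Hypothetical identity stylesheet; only the writer settings vary below.
    const string Stylesheet = @"<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" indent=""no"" />
  <xsl:template match=""@*|node()"">
    <xsl:copy><xsl:apply-templates select=""@*|node()"" /></xsl:copy>
  </xsl:template>
</xsl:stylesheet>";

    // Builds made-up input XML with no white space between elements.
    public static string BuildInput(int rows)
    {
        var sb = new StringBuilder("<rows>");
        for (int i = 0; i < rows; i++)
            sb.Append("<row><id>").Append(i).Append("</id><name>n</name></row>");
        return sb.Append("</rows>").ToString();
    }

    // Transforms the input and returns the number of characters written.
    public static int TransformedLength(string inputXml, bool indent)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(Stylesheet)))
            xslt.Load(stylesheet);

        // Work on a clone so the stylesheet's own settings object is left untouched.
        XmlWriterSettings settings = xslt.OutputSettings.Clone();
        settings.Indent = indent;

        var output = new StringBuilder();
        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        using (XmlWriter writer = XmlWriter.Create(output, settings))
            xslt.Transform(input, writer);

        return output.Length;
    }

    public static void Main()
    {
        string input = BuildInput(10000);
        Console.WriteLine("Indent off: " + TransformedLength(input, false) + " chars");
        Console.WriteLine("Indent on:  " + TransformedLength(input, true) + " chars");
    }
}
```

The gap between the two figures is the white space added purely for indenting, and it grows with the number of nested elements in the output.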

But doesn’t this make the output harder to read?

The consumer of a transformation result is likely to be another process, such as another XSLT in a pipeline, or even a web browser.

None of these typically cares about the extra white space, which would also take more processing to load.

If you need to view the XML, it may be worth keeping the indent off and manually opening it in a text editor that has the ability to “pretty print” it for you. (Warning: some editors and IDEs, e.g. Visual Studio, can automatically pretty print an XML document when you open it, making you think the XML itself was indented!)
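
If you would rather do that from code, for example in a small debugging helper, a sketch like the following pretty prints a compact XML string only when a human asks to see it. The PrettyPrintOnDemand class and the sample XML are hypothetical; it relies on XDocument.ToString() indenting by default.

```csharp
using System;
using System.Xml.Linq;

public static class PrettyPrintOnDemand
{
    // Returns an indented copy for human eyes; the stored output stays compact.
    public static string Pretty(string compactXml)
    {
        // XDocument.ToString() formats (indents) the XML unless told otherwise.
        return XDocument.Parse(compactXml).ToString();
    }

    public static void Main()
    {
        // Made-up compact XML standing in for a transform result.
        Console.WriteLine(Pretty("<a><b>1</b><c/></a>"));
    }
}
```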

Other savings are still possible

Turning off the indent is just one of many things you can look into. Other things include the following (though your mileage may vary):

  • Use attributes instead of elements (where possible; usually this is for simple values, such as numbers, dates, and very limited strings, and where the element is not expected to be indented)
  • Look at the XML structure to see if it can be improved to make XSLT processing easier
  • Cache the XSLT Processor
  • Cache the output
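
As a taste of the “cache the XSLT processor” item, here is one way a helper like the GetXslt call in the earlier example might keep compiled stylesheets around. It is a sketch under assumptions, not how my code actually does it: the XsltCache class is hypothetical, the cache is keyed by path, and it never notices when a stylesheet file changes on disk.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml.Xsl;

public static class XsltCache
{
    // One compiled transform per stylesheet path; Lazy ensures a single compile per key.
    static readonly ConcurrentDictionary<string, Lazy<XslCompiledTransform>> Cache =
        new ConcurrentDictionary<string, Lazy<XslCompiledTransform>>();

    public static XslCompiledTransform Get(string path)
    {
        return Cache.GetOrAdd(path, p => new Lazy<XslCompiledTransform>(() =>
        {
            var xslt = new XslCompiledTransform();
            xslt.Load(p); // Compiling is the expensive part, so do it once.
            return xslt;
        })).Value;
    }

    public static void Main()
    {
        // Made-up stylesheet written to a temp file just to exercise the cache.
        string path = Path.Combine(Path.GetTempPath(), "xslt-cache-demo.xslt");
        File.WriteAllText(path,
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
            "<xsl:output indent=\"no\" /></xsl:stylesheet>");

        // Both calls hand back the same compiled instance.
        Console.WriteLine(ReferenceEquals(Get(path), Get(path)));
    }
}
```

A loaded XslCompiledTransform is documented as thread safe, which is what makes sharing a single instance like this reasonable.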

I’ll try to expand on some of those in future posts.

Some more detailed XSLT performance tips, which also make for cleaner code, were covered in an earlier post.

2 thoughts on “XSLT Performance tip: don’t indent output”

  1. Fantastic! I am currently building out an entire CMS based on an API for an existing well known CMS system. One of my concerns, since the CMS I am building will act as a front-end interface only, is the performance issue of constantly speaking through the API.

  2. Thanks for the tip, who would have thought that indenting could improve the speed of the code. I need some help: I’m trying to switch off the output indenting feature on my XML editor but can’t seem to do it. It’s called Liquid XML Studio, version 2011. Does anyone have it and know how to solve my problem?
