Steve Workman's Blog

Improving Javascript XML Node Finding Performance by 2000%

Posted by Steve Workman. About 3 min reading time.

In my work, I'm parsing web services all of the time. Most of the time they return XML, which does not make the best use of bandwidth or CPU time compared to JSON, but if it's all that you're given then you can certainly get by. I've been looking into ways to speed up XML document traversal with jQuery after the method that had been best practice stopped working in jQuery 1.7.

The basic way to find certain nodes in an XML web service is to use the .find() method. This is used heavily by the SPServices jQuery helper (which is, in general, a great library).

$(xData.responseXML).find("[nodeName='z:row']").each(function() {
   // Do stuff
});

That's absolutely fine: it finds elements whose nodeName is z:row. However, since jQuery 1.7, this method does not work. I raised this regression in the jQuery bug tracker and was encouraged to find a solution: another selector that worked in all browsers. Unfortunately, at the time I couldn't come up with anything better than this:

$(xData.responseXML).find("z\\:row, row").each(function() {
   // Do stuff
});

The "z\\:row" selector works in IE and Firefox, and the "row" selector works in Chrome and Safari (I'm unable to test in Opera here, sorry). This was flagged as the solution to the problem, and the jQuery team said they wouldn't be making any fixes to the jQuery core.
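The doubled backslash is there for the JavaScript string literal, not for the selector engine. The literal "z\\:row" holds the six characters z\:row, and that single remaining backslash is what tells the selector engine to read the colon as part of the tag name rather than as the start of a pseudo-class. A quick check of what the selector string really contains (plain JavaScript, no jQuery needed):

```javascript
// What the selector engine actually receives from the string literal.
const selector = "z\\:row, row";

// The literal's two backslashes collapse into one in the string value.
const parts = selector.split(",").map(function (s) { return s.trim(); });

console.log(parts[0]);        // the escaped, namespaced form
console.log(parts[0].length); // 6 characters: z \ : r o w
console.log(parts[1]);        // the un-prefixed fallback
```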

After a few weeks of using this method, I noticed that the site had been slowing down, especially in IE, and I thought this new selector was the cause. So, I looked into the performance numbers using jsPerf and I raised a bug too. My first test was to see what the current solution was doing, and whether jQuery 1.7 had made things worse. Test case: https://jsperf.com/node-vs-double-select/4
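jsPerf reports each test case in operations per second. For a rough local check of the same kind, a minimal sketch of such a measurement is below. The helper name opsPerSecond and the two array-scanning workloads are made up for illustration; they stand in for the selectors and are not what jsPerf runs internally.

```javascript
// Rough ops/sec measurement: run fn repeatedly for about durationMs milliseconds.
function opsPerSecond(fn, durationMs) {
  const start = Date.now();
  let runs = 0;
  let elapsed = 0;
  do {
    fn();
    runs += 1;
    elapsed = Date.now() - start;
  } while (elapsed < durationMs);
  // Guard against a zero elapsed time when durationMs is tiny.
  return runs / (Math.max(elapsed, 1) / 1000);
}

// Hypothetical workloads standing in for the two selector strategies.
const items = [];
for (let i = 0; i < 1000; i += 1) {
  items.push({ nodeName: i % 10 === 0 ? "z:row" : "other" });
}
function onePass() {
  return items.filter(function (n) { return n.nodeName === "z:row"; });
}
function twoPasses() {
  // Scans the same data twice, as a two-part selector may have to.
  return onePass().concat(onePass()).length;
}

console.log("one pass  :", Math.round(opsPerSecond(onePass, 200)), "ops/sec");
console.log("two passes:", Math.round(opsPerSecond(twoPasses, 200)), "ops/sec");
```

The absolute numbers depend on the machine and the JavaScript engine, so only the ratio between the two lines is worth reading.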


So, performance in Chrome is identical for each of the selectors (and it's the same in Firefox and Safari) but IE drops nearly half of its operations because it has to perform that second selector.

It's still not very high performance though, and so I looked for other solutions.

Dmethvin suggested:

Did you try the custom plugin in the ticket? If you're having performance issues that should be much faster.

The plugin he's referring to is this:

jQuery.fn.filterNode = function(name){
   return this.filter(function(){
      return this.nodeName === name;
   });
};

This filters the matched elements, keeping those whose nodeName equals the name you pass in. The issue is that .filter() does not traverse down the tree; it stays at the level of the set of objects it was given. Therefore, a quick solution was this:

$(xData.responseXML).children().children().children().children().children().children().children().filterNode('z:row').each(function() {
// Do stuff
});
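To see why the chain of .children() calls is needed, here is a minimal sketch of the difference between filtering a set and searching a tree. It uses plain objects with nodeName and children properties as a stand-in for XML nodes. The tree shape and the helper names node, filterByName and descendants are made up for illustration and are not jQuery APIs.

```javascript
// Hypothetical stand-in for a parsed XML response: plain objects, not DOM nodes.
function node(nodeName, children) {
  return { nodeName: nodeName, children: children || [] };
}

const response = node("soap:Envelope", [
  node("soap:Body", [
    node("rs:data", [node("z:row"), node("z:row"), node("z:row")])
  ])
]);

// Like .filter(): tests only the nodes in the set it is given.
function filterByName(nodes, name) {
  return nodes.filter(function (n) { return n.nodeName === name; });
}

// Collects every descendant of a node, depth-first.
function descendants(root) {
  return root.children.reduce(function (found, child) {
    return found.concat(child, descendants(child));
  }, []);
}

const topLevelOnly = filterByName([response], "z:row");
const wholeTree = filterByName(descendants(response), "z:row");

console.log(topLevelOnly.length); // 0: the rows are three levels down
console.log(wholeTree.length);    // 3
```

Filtering the top-level set finds nothing because the rows are not in that set, which is why the snippet above has to walk down to the right depth first.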

jsPerf Test: http://jsperf.com/node-vs-double-select/1

Wow, that's about 50 times faster. Even IE beats Chrome when doing this operation. The simple reason is that it's got a smaller set of objects to go through and it's comparing a single attribute rather than parsing the text of the XML to try and find the namespaced element.

Still, I wasn't satisfied: to achieve that performance, I had to know in advance how deep in the tree the rows were. So, back to the bug and another suggestion by dmethvin:

If you're going that deep, use a filter function passed to .find(). How does that fare?

After a few attempts, a colleague of mine came up with this beauty:

$.fn.filterNode = function(name) {
   return this.find('*').filter(function() {
      return this.nodeName === name;
   });
};

jsPerf test: https://jsperf.com/node-vs-double-select/3

Using .find('*').filter() increased performance to 200x faster than the original .find('z:row') selector

I mean, wow, that's incredible. On the graph, those tiny little bits of colour are the original selectors, and those only 20% of the way up are the previous massive performance increase by using filter. It should also be noted that IE8 performance using this selector increased in jQuery 1.7 in comparison to when using jQuery 1.6.

Side-note: IE10's JavaScript performance is almost equal to that of Google Chrome. In comparison, IE9 (not shown) is about half of that.

The reason for this massive increase is that it's backed by native selectors. A .find('*') will translate into element.querySelectorAll('*'), which is very fast when compared to doing 7 .children() calls.
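For comparison, the same idea can be written without jQuery. This is a minimal sketch, assuming the root is anything that exposes querySelectorAll (a Document or Element in the browser). The function name filterNodeNative is hypothetical, and the sample XML and its namespace URI in the guarded block are made up for illustration.

```javascript
// Hypothetical helper: every descendant of root whose nodeName matches exactly.
function filterNodeNative(root, name) {
  // querySelectorAll("*") returns all descendant elements in document order.
  return Array.prototype.filter.call(root.querySelectorAll("*"), function (el) {
    return el.nodeName === name;
  });
}

// Browser-only usage; skipped where DOMParser is unavailable (e.g. Node.js).
if (typeof DOMParser !== "undefined") {
  const xml =
    '<data xmlns:z="urn:example:rows">' +
    '<z:row Title="First"/><z:row Title="Second"/>' +
    "</data>";
  const doc = new DOMParser().parseFromString(xml, "text/xml");
  filterNodeNative(doc, "z:row").forEach(function (row) {
    console.log(row.getAttribute("Title"));
  });
}
```

The comparison is a plain string check on nodeName, so the namespace prefix needs no escaping here.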

Summary

Dealing with large amounts of data from web services needs to be fast. Using a simple .find() on the node name no longer works, and alternatives have been investigated. The fastest method, using a short one-line plug-in, improves performance by up to 2000% compared to the old methodology.

I'll be notifying the SPServices group of this post, and hopefully they can improve the performance of their library.