Skip to content

Sorting - We're Doing It Wrong

A couple of weeks ago it seemed my daily business became sorting DOMElements. This quickly became boring enough to be investigated more thoroughly. So this post sums up everything you should know about sorting DOMElements in Javascript (… using jQuery, of course).


I usually write about Array.sort rather than Array#sort. Simply because I never know when to use which. But no worries, @mathias is here to set us straight! ☺


Sorting - the basics

Array#sort() is a handy little function that sorts arrays for us. It's handy, because we don't have to implement one of the various sorting algorithms like BubbleSort, MergeSort or HeapSort. (If you studied computer science, you've probably implemented a couple of these yourself. Quite annoying, right?) ECMAScript, the specification behind Javascript, doesn't regulate which algorithm a Javascript engine vendor should use. According to this question on StackOverflow apparently Mozilla implemented MergeSort, Safari did some SelectionSort and Chrome's V8 uses QuickSort.


For kicks and laughs, here's what sorting algorithms sound like:


So the actual sorting is done for you, as long as you pass in a callback function that compares two elements according to how you want them sorted. Said function must accept two arguments (a, b) and is supposed to return an integer. 0 if a === b, -1 if a < b and 1 if a > b. Since a is already positioned before b, returning 0 and -1 yield the same result, which usually allows to remove the extra equality comparison and end up with a > b ? 1 : -1.

Stable vs. Unstable Sorting

ECMAScript neither dictates a specific algorithm, nor expects it to be stable (Array.prototype.sort). Stable sorting algorithms maintain the relative order of elements that appear to be "the same". To Array#sort two items appear the same when the comparison function returns 0. While InsertionSort and MergeSort (Apple and Mozilla) are stable, QuickSort (Google Chrome) is not (Issue 90). Chrome will sort arrays using InsertionSort if the array has 10 or less elements.

@millermedeiros wrote a little test-case to point that out. (Thanks!)

So Safari and Firefox will sort ["sed", "dolor", "ipsum", "foo", "bar", "cat", "sit", "man", "lorem", "amet", "maecennas"] (by character length) in a way that "sed" will retain first position, while Chrome will roll the dice and possibly prefer "cat" to take the pole position. The Chrome developers obviously like Kittens…

@millermedeiros suggests implementing a stable algorithm, such as MergeSort yourself, if you find yourself in need. Or, if you're as lazy as me, you might be happy with his implementation he made available with AMD-Utils.

Sorting Different Types

Let's see what happens when different types are sorted (fiddle):

var list = [Infinity, -Infinity, NaN, 0, -0, 1, -1, {}, [3], [4,5], [6], "1", "-1", "a", NaN, "z"]

// sort with internal default comparison function
list.sort();

// sort with custom comparison function
list.sort(function(a, b) {
  return a > b ? 1 : -1;
});

The results are rather interesting:

=== internal default comparison (Firefox and Chrome) ===
-1 | -1 | -Infinity | 0 | 0 | 1 | 1 | 3 | 4,5 | 6 | Infinity | NaN | NaN | [object Object] | a | z


=== custom comparison (Firefox) ===
-Infinity | -1 | -1 | 1 | 3 | Infinity | NaN | 0 | 0 | 1 | 4,5 | 6 | [object Object] | a | NaN | z
=== custom comparison (Chrome) ===
-Infinity | -1 | -1 | 0 | 0 | 1 | NaN | 1 | 3 | 4,5 | 6 | [object Object] | a | NaN | Infinity | z
  • a > b ? 1 : -1 is the most rudimentary comparison function, mentioned in pratically every single example illustrating the Array#sort() function. Yet it doesn't yield the same result as the internal default comparison function. Let's ignore the internal default comparison function for now, we're most likely not going to use it.
  • See how NaN pretty much stayed in their original spots? Comparing something to NaN seems to be a bad Idea.
  • Chrome and Firefox seem to apply different rules to casting strings and arrays to numeric values.
  • Note how Firefox and Chrome treat Infinity differently (both don't make sense, imho).

Removing everything that is not immeditely numeric (including NaN) we get the following result (across browsers):

-Infinity | -1 | 0 | 0 | 1 | Infinity

Now, that finally makes sense.

In short: sorting different types seems like a bad idea because the results will vary across browsers. We must take measures to feed the comparison function reliable data. Beware of NaN! Replace them with an Infinity if you can.

Sorting Strings

MDN suggests string comparison to be done like "a" < "b". While this works for English just fine, it fails pretty much every other language there is. According to the rules, the German Umlauts (ä, ö, ü) should come right after their respective vowels: a, ä, b, o, ö. But here's where we're facing a problem, because "ä" > "b". You see, the unicode of b is 0x62 and ä is 0xE4 and 0x62 < 0xE4. See for yourself:


var list = "ä ba bb bä bz a e è é aa ae b ss sz sa st ß".split(" ");
list.sort(function(a, b) {
  return a > b ? 1 : -1;
});

yields

a, aa, ae, b, ba, bb, bz, bä, e, sa, ss, st, sz, ß, ä, è, é

which is not quite what we (Germans, French, …) want.

Sadly MDN's Array#sort() doesn't mention the very handy function String#localeCompare. (I edited MDN to now link to that handy utility…)

For comparing strings you can simply replace a > b ? 1 : -1 by a.localeCompare(b) and your language is back in the game:


var list = "ä ba bb bä bz a e è é aa ae b ss sz sa st ß".split(" ");
list.sort(function(a, b) {
  return a.localeCompare(b);
});

yields

a, ä, aa, ae, b, ba, bä, bb, bz, e, é, è, sa, ss, ß, st, sz

which is just fine!

String#localeCompare has been around since Javascript 1.2 - pretty much every browser out there knows it - So use it!

Let it be known that localeCompare uses the operating system's locale. Which means that if you're running an English (or Scandinavian) System, you won't be seeing the characters sorted according to German (or French, or whatever) rules.

Sorting DOMElements

A trivial function to sort the children of a DOMElement looks like:

$.fn.sortChildren = function(compare) {
  var $children = this.children();
  $children.sort(compare);
  this.append($children);
  return this;
};

and would be invoked with

$('ul').sortChildren(function(a, b) {
  return $(a).text().toLowerCase() > $(b).text().toLowerCase() ? 1 : -1;
});

While this method works, it's wasting about 70% performance across the board.

Boosting Performance - jQuery

jQuery itself is a wonderful thing. It makes things really simple. But losing a bit of performance is often the toll for said simplicity. Let's ditch jQuery#append and improve performance by 20% - 30%:

$.fn.sortChildren = function(compare) {
  return this.each(function(){
    var $children = $(this).children();
    $children.sort(compare);
    for (var i=0, l=$children.length; i < l ; i++) {
      this.appendChild($children[i]);
    }
  });
};

less jQuery, more performance…

Comparing Elements

MDN (and most other resources I came across) illustrate sorting with an example like the following:

var list = ["Delta", "alpha", "CHARLIE", "bravo"];
list.sort(function(a, b) {
  return a.toLowerCase() > b.toLowerCase() ? 1 : -1;
});

The docs usually fail to inform the reader of the fact that this comparison function is executed multiple times for each element in the array. (I edited MDN to now reflect that fact) Let's take a closer look at what's happening behind the scenes using the comparison function from before:

$('ul').sortChildren(function(a, b) {
  var an = $(a).text().toLowerCase(),
    bn = $(b).text().toLowerCase();
 
  console.log(an, bn);
  return an > bn ? 1 : -1;
});

A list of five elements:

<ul>
  <li>delta</li>
  <li>bravo</li>
  <li>alpha</li>
  <li>charlie</li>
  <li>echo</li>
</ul>

And Firefox gives us the following output:

delta bravo
delta alpha
bravo alpha
charlie echo
delta charlie
alpha charlie
bravo charlie
delta charlie
delta echo

You can easily see how charlie is evaluated 5 times. That means $(a).text().toLowerCase() is executed 5 times for charlie alone. The sorting algorithm used 9 iterations leading to 18 »get text of node and convert to lower case« operations. eighteen, instead of the 5 you'd actually need. Depending on what your comparison function actually does, this can be a tremendous overhead.

Mapping Before Comparing

We cannot reduce the number of times our comparison function is executed. But we can reduce the work it actually does to a bare minimum. Apparently this is called the Schwartzian Transform (Thank you Gabriel for pointing this out) Instead of evaluating each element on every comparison, we do those evaluations once. For this we create a second array containing the values to sort

var list = ["Delta", "alpha", "CHARLIE", "bravo"],
  map = [],
  result = [];

for (var i=0, length = list.length; i < length; i++) {
  map.push({
    index: i, // remember the index within the original array
    value: list[i].toLowerCase() // evaluate the element
  });
}

// sorting the map containing the reduced values
map.sort(function(a, b) {
  return a.value > b.value ? 1 : -1;
});

// copy values in right order
for (var i=0, length = map.length; i < length; i++) {
  result.push(list[map[i].index]);
}

// print sorted list
print(result);

Using this technique, we can improve the performance of our little sorting plugin as much as 70% over the original.

Note: I don't know of any (reasonable) way to change to sorting of an array in-place. While Array#sort will modify the original array (as well as return itself), the mapped sorting approach described here will create a new array.

Working On Detached DOM

Discussing the issue with @derSchepp a couple of weeks ago, he threw in the Idea of detaching the nodes we want to sort from the DOM, so we could reduce the number of reflows (which happen when you alter the position of elements within the DOM) to a bare minimum of 2. A simple helper function (for lack of a better name) called phase() will help:

$.fn.phase = function() {
  return this.each(function() {
    var $this = $(this),
      placeholder = $this.data('redetach');

    if (placeholder) {
      placeholder.parentNode.replaceChild(this, placeholder);
      $this.removeData('redetach');
    } else {
      placeholder = document.createTextNode(''),
      this.parentNode.replaceChild(placeholder, this);
      $this.data('redetach', placeholder);
    }
  });
};

now we throw that into our little sorting function and get:

$.fn.sortChildren = function(map) {
  return this.each(function() {
    var $this = $(this).phase(),
      $children = $this.children(),
      _map = [],
      length = $children.length,
      i;

    for (i = 0; i < length ; i++) {
      _map.push({
        index: i,
        value: map($children[i])
      });
    }
     
    _map.sort(compare);

    for (i = 0; i < length ; i++) {
      this.appendChild($children[_map[i].index]);
    }
 
    $this.phase();
  });
};

Note: the .phase() function here is just a quick hack without much thought put into it. @cowboy came up with a better solution!

Benchmarking

You can't optimize what you can't quantify. Let's throw some tests at jsPerf and see how our optimizations panned out:

Surprise: the DOM-detaching thing only benefits Chrome. Firefox and Safari seem to do internal optimizations to prevent reflows on every single DOM mutation. Benefits in Chrome are far less than the losses in Firefox and Safari, so we'll just skip this step. Sorry Schepp :(

Final Sorting Plugin

In a Gist. Enjoy :)

Sorting Numbers Within Strings

This Fiddle has some ideas about how to "properly" sort numbers within strings.

Update

  • Modified mapped sorting examples according to @cowboy's suggestion
  • Added a section about stable and unstable sorting algorithms
  • Added more Info on Chrome's sorting habits

Further Reading

Comments

Display comments as Linear | Threaded

Ben Alman on :

Ben AlmanRodney, thanks for the excellent article.

I just wanted to mention that you don't need to use a placeholder element when detaching nodes from another node. All you have to do is keep track of the node-to-be-detached's parent and next sibling, and then use insertBefore to replace the detached node.

You can see an example as plain old JS:
https://gist.github.com/938767

And as a jQuery plugin:
https://gist.github.com/978520

(Which reminds me, I really have to publish this plugin)

Also, a few minor comments about your mapped sort example:

Array#sort both sorts arrays in-place and also returns the sorted array. Your example creates a new array. Which is not a problem, of course, it's just something that might be worth mentioning in the article.

Also, instead of pushing an array like [i, map(child)] onto the _map array, I'd recommend pushing an object like {index: i, sortBy: map(child)} onto the _map array. I know it's a few more bytes, but the object literal keys make the code a litle bit easier to read / learn from / maintain than the more compact array indexed syntax.

Either way, thanks again and keep up the good work!

Miller Medeiros on :

Miller MedeirosOne thing that is important to note is that in Firefox and IE Array#sort is stable, which means items keep same relative position if items are already sorted (if both return zero in the compareFn) while V8 (Chrome, node.js) does a non-stable sort. It's even more important to notice that the behavior changes based on the amount of elements in the array:

see example

So if you really need a stable sort don't use the Array#sort, implement a Merge Sort or use amd-utils/array/sort instead.

Cheers.

Jan De Poorter on :

Jan De PoorterWouldn't creating a hash as a reference work better? It'd allow you to sort the array itself, while still avoiding calculations by doing simple hash lookup. I create a fiddle as example.

Rodney Rehm on :

Rodney RehmThat would work for primitive types. Object properties ("hash keys") are strings. Your example works because you're sorting strings. It would also work for numbers and dates and basically anything with a meaningful .toString().

It would however fail DOMElements and plain objects (for they don't have an automatic and "distinct" string representation). They would all end up using the keys [object HTMLLIElement] and [object Object].

Jan De Poorter on :

Jan De PoorterOK. I see when testing it. I thought JS had a better implementation of hashes :-)

Nathan Sweet on :

Nathan SweetBrief aside : If you look at issue tracker for Chrome (http://code.google.com/p/v8/issues/detail?id=90) and the way sort is implemented in V8 (https://github.com/v8/v8/blob/master/src/array.js) you will see that it's actually incorrect that Chrome doesn't use a stable sort. It actually does most of the time. For the effective majority of cases v8 uses a mergesort. In your examples v8 would have probably used a mergesort.

I take umbrage with your statistical claims. I don't know if I should care about a 70% performance gain or not, when you give me no context. A 70% performance gain against what? If it's less than a millisecond and I'm only doing it once, I don't care, I'll absolutely produce more readable code for the trade-off. If it's something I'm doing on a continuous basis, then maybe I care, but why.

The fact that you created a plugin for this is nice, as it obfuscates all this away from us, but have you tested it using a lot of different use cases (huge sets, constantly reused small sets, etc). Was this actually worth it? Is my question I guess. I don't think this article is showing enough homework, I'm not trying to say you didn't do all your homework, but if you did it's not all here.

This comment was made in good faith. Keep up the good work Rodney!

Rodney Rehm on :

Rodney RehmI've added some more info and links regarding Chrome's sorting habits - and in fact corrected my dead wrong claim ECMAScript would ask for a stable sort.

How do you come to claim "the majority cases" would use MergeSort? The code you pointed to clearly says it's InsertionSort and clearly says it's only invoked for arrays with less than 11 elements. In my book that cannot - by any definition - be the "majority of cases".

If you care about wasting CPU time is up to you. Even if we're talking milliseconds. The point of this post was to show how the comparison function does, what (not so obvious) short-coming it has and how to potentially circumvent the problem.

Loke Locus on :

Loke LocusSorting has to happen over same type. If you are comparing integer, and Object, you would better handle that in your comparison function in a like manner with a like intention.

Anonymole on :

AnonymoleWhat's with the unreadable technique of declaring multiple vars on multiple lines separated by commas?

var a, b, c, d;

This becomes difficult to 'instantly' discern what those terms are doing hangin' out in the breeze.

var a; var b; var c; var d;

Provides INSTANT understanding what we have 4 vars.

or

var a, b, c, d;

Again, INSTANT understanding.

But line wrapping using commas - bad form, difficult to instantly interpret and just looks sloppy.

austin on :

austinanother little javascript time sink: array.splice() i found this when making a steganography app in javascript, i had an array with a bunch of elements (4 per pixel for an image) i had a second one which was an array of indicies mapping to the original array which i would pick a random number from 1-length do things to the original array and remove that index(and the next 3) using array.splice() it was horribly horribly slow. i changed it to instead change the value of that element to -1 and then just pick another random number if the element was -1 ugly but a massive speed improvement if you can avoid array.splice() do so

Gijs on :

GijsSkimming the article, I spent a while being flabbergasted at this bit:

"InsertionSort and BubbleSort (Apple and Mozilla) "

Umm, really? Mozilla uses bubblesort? Really really?

Well, no. As you point out in the introduction, they use (a stable implementation of) mergesort. Might be worth fixing this so other people aren't misled. :-)

Rodney Rehm on :

Rodney RehmWah, how did that happen? Fixed, Thanks!

Willem Mulder on :

Willem MulderHi Rodney, nice article! One suggestion:

for (var i=0, length = list.length; i < length; i++) { }

can be rewritten more easily as

for(var i in list) { }

Also, the DOM insertion 'optimization' that didn't work is because the DOM is not rerendered until running scripts have ended. This has to do with the single-threaded nature of Javascript execution in browsers! Thus, optimizing for DOM changes while the script is running is probably not really going to help as far as rendering is concerned...

Rodney Rehm on :

Rodney RehmIterating over arrays and objects has one thing to say about for..in and arrays: »Don’t use for arrays. It iterates over both array indices and property keys. There will thus be problems as soon as someone adds a property to an array.«

So no, please do NOT use for (var i in list) if list is an array.

Willem Mulder on :

Willem MulderYes, for...in is an Object iterator, but it's still useful for Arrays. You should just ensure that no Object properties have been added to the Array. But you have the same problem with using arr.length obviously: if someone has set .length to some arbitrary value, the for(... arr.length...) will not work as well. It all comes down to knowing your code and using the right functions accordingly. I really prefer readable code, and I know there's no Object properties, so I really like using for...in..!

Bruce Ingalls on :

Bruce IngallsStill doing it wrong! First, assess whether your goal is elegance, efficiency, or time to market. So far, our industry prefers the latter.

Wouldn't these DOM elements you wish to sort, require HTML entities for extended ASCII characters? Thus, the overhead of a Locale compare should not be needed.

Unfortunately, schools seem to only teach general purpose sorts. You missed that you have a targeted case, which fits a distribution sort

Gabriel Abejon on :

Gabriel AbejonThe map, sort, map idiom to reduce the cost of sorting on calculated values is often called a Schwartzian transform

Dave Tonge on :

Dave TongeThe UnderscoreJS sortBy function is a good example of mapping before sorting:

_.sortBy = function(obj, val, context) {
  var iterator = _.isFunction(val) ? val : function(obj) { return obj[val]; };
  return _.pluck(_.map(obj, function(value, index, list) {
    return {
      value : value,
      criteria : iterator.call(context, value, index, list)
    };
  }).sort(function(left, right) {
    var a = left.criteria, b = right.criteria;
    if (a === void 0) return 1;
    if (b === void 0) return -1;
    return a &lt; b ? -1 : a &gt; b ? 1 : 0;
  }), 'value');
};

Using a functional helper library like underscore often makes for much more readable code.

Kevin Ball on :

Kevin BallUnderscore's functional approach helps in so many ways... I find myself using it for more and more things that I used to use jQuery for, and not only do I end up with better, more maintainable code, but it is usually easier to write in the first place.

Creage on :

CreageStrange that nobody here mentioned the Tinysort plugin for jQ - http://tinysort.sjeiti.com

Accoding to sources it uses mapping already, and provides much more abilities on sorting collections in jQ.

Brian on :

BrianRegarding for (var i=0, length = list.length; i < length; i++) { }, I agree that for (var i in list) isn't a good idea... but what about Array.forEach()?

Switching from a basic Array.sort() here cut the time in this code in half (11ms to 6ms) for even a small number of items (8) in the CONSTANTS.COLORS object. ($.l() is a gettext equivalent)

        var colors    = Object.keys(CONSTANTS.COLORS)
          , colorsMap = [] 
          ; 

        colors.forEach(function (color, i) {
            colorsMap.push({ index: i
                           , value: $.l(color).toLowerCase()
                           });
        });

        colorsMap.sort(function (a, b) {
          return a.value &gt; b.value ? 1 : -1;
        }).map(function populate_colors_list (map) {
            var colorItem   = colors[map.index]
              , color       = CONSTANTS.COLORS[colorItem]
              , lsItemName  = 'caaBatch_colors_' + colorItem
              , $thisOption = $make('option', { 'class' : 'colorOption'
                                              , value   : colorItem
                                              }).data('default', color)
                                                .text($.l(colorItem));
            if (null === localStorage.getItem(lsItemName)) {
                localStorage.setItem(lsItemName, color);
            }
            colorOptions.push($thisOption);
        });

Nicholas C. Zakas on :

Nicholas C. ZakasI just wanted to say excellent write up. This is definitely piqued my interest in sorting performance.

Nicholas C. Zakas on :

Nicholas C. ZakasExcellent post. I just wanted to point out a slight issue with your comparators. The function you pass to Array#sort() should return three different conditions: a negative number if a < b, a positive number if a > b, and zero if a == b. In your examples, you always return either a positive or negative number, which means that two of the three cases are being handled in the same way, altering the outcome of sore.

Rodney Rehm on :

Rodney RehmI had hoped the following (third sentence below the video) would've clarified my simplification: »Since a is already positioned before b, returning 0 and -1 yield the same result, which usually allows to remove the extra equality comparison and end up with a > b ? 1 : -1.« But you're right, obviously.

Michał Czernow on :

Michał CzernowArray.forEach() is slower than for(){} as Nicolas Zakas has pointed out in his book abot efficient JS.

About preventing reflows. You can use documentFragment or just make UL display: none - and than operates on them.

Great read, learned a lot.

Gajus Kuizinas on :

Gajus KuizinasHello Rodney, I've read your article as part of the research I've made when creating jQuery table-sort plugin. It is a table sorting plugin that can sort tables with up-to 100,000 rows (didn't dare to test with more rows) within 150ms. Thank you for taking the time to blog about your experience.

josh on :

joshIf you're going to do case conversion to get a case-insensitive comparison, use UPPERCASE. Otherwise you won't match "ß" to "SS", eg. Still not as good as a real case-insensitive comparison, but what can you do?

Rodney Rehm on :

Rodney RehmNot sure what you're getting at. (At least) Firefox 15 reports

'ß'.toLowerCase() === 'ß';
'ß'.toUpperCase() === 'ß';

The author does not allow comments to this entry