## Sorting - We're Doing It Wrong

A couple of weeks ago it seemed my daily business became sorting DOMElements. This quickly became boring enough to be investigated more thoroughly. So this post sums up everything you should know about sorting DOMElements in Javascript (… using jQuery, of course).

I usually write about `Array.sort` rather than `Array#sort`. Simply because I never know when to use which. But no worries, @mathias is here to set us straight! ☺

## Sorting - the basics

Array#sort() is a handy little function that sorts arrays for us. It's handy, because we don't have to implement one of the various sorting algorithms like BubbleSort, MergeSort or HeapSort. (If you studied computer science, you've probably implemented a couple of these yourself. Quite annoying, right?) ECMAScript, the specification behind Javascript, doesn't regulate which algorithm a Javascript engine vendor should use. According to this question on StackOverflow apparently Mozilla implemented MergeSort, Safari did some SelectionSort and Chrome's V8 uses QuickSort.

For kicks and laughs, here's what sorting algorithms sound like:

So the actual sorting is done for you, as long as you pass in a callback function that compares two elements according to how you want them sorted. Said function must accept two arguments (`a, b`) and is supposed to return an integer. `0` if `a === b`, `-1` if `a < b` and `1` if `a > b`. Since `a` is already positioned before `b`, returning `0` and `-1` yield the same result, which usually allows to remove the extra equality comparison and end up with `a > b ? 1 : -1`.

### Stable vs. Unstable Sorting

ECMAScript neither dictates a specific algorithm, nor expects it to be stable (Array.prototype.sort). Stable sorting algorithms maintain the relative order of elements that appear to be "the same". To `Array#sort` two items appear the same when the comparison function returns `0`. While InsertionSort and MergeSort (Apple and Mozilla) are stable, QuickSort (Google Chrome) is not (Issue 90). Chrome will sort arrays using InsertionSort if the array has 10 or less elements.

@millermedeiros wrote a little test-case to point that out. (Thanks!)

So Safari and Firefox will sort `["sed", "dolor", "ipsum", "foo", "bar", "cat", "sit", "man", "lorem", "amet", "maecennas"]` (by character length) in a way that `"sed"` will retain first position, while Chrome will roll the dice and possibly prefer `"cat"` to take the pole position. The Chrome developers obviously like Kittens…

@millermedeiros suggests implementing a stable algorithm, such as MergeSort yourself, if you find yourself in need. Or, if you're as lazy as me, you might be happy with his implementation he made available with AMD-Utils.

### Sorting Different Types

Let's see what happens when different types are sorted (fiddle):

var list = [Infinity, -Infinity, NaN, 0, -0, 1, -1, {}, [3], [4,5], [6], "1", "-1", "a", NaN, "z"]

// sort with internal default comparison function
list.sort();

// sort with custom comparison function
list.sort(function(a, b) {
return a > b ? 1 : -1;
});

The results are rather interesting:

=== internal default comparison (Firefox and Chrome) ===
-1 | -1 | -Infinity | 0 | 0 | 1 | 1 | 3 | 4,5 | 6 | Infinity | NaN | NaN | [object Object] | a | z

=== custom comparison (Firefox) ===
-Infinity | -1 | -1 | 1 | 3 | Infinity | NaN | 0 | 0 | 1 | 4,5 | 6 | [object Object] | a | NaN | z
=== custom comparison (Chrome) ===
-Infinity | -1 | -1 | 0 | 0 | 1 | NaN | 1 | 3 | 4,5 | 6 | [object Object] | a | NaN | Infinity | z
• `a > b ? 1 : -1` is the most rudimentary comparison function, mentioned in pratically every single example illustrating the Array#sort() function. Yet it doesn't yield the same result as the internal default comparison function. Let's ignore the internal default comparison function for now, we're most likely not going to use it.
• See how `NaN` pretty much stayed in their original spots? Comparing something to `NaN` seems to be a bad Idea.
• Chrome and Firefox seem to apply different rules to casting strings and arrays to numeric values.
• Note how Firefox and Chrome treat `Infinity` differently (both don't make sense, imho).

Removing everything that is not immeditely numeric (including `NaN`) we get the following result (across browsers):

-Infinity | -1 | 0 | 0 | 1 | Infinity

Now, that finally makes sense.

In short: sorting different types seems like a bad idea because the results will vary across browsers. We must take measures to feed the comparison function reliable data. Beware of `NaN`! Replace them with an `Infinity` if you can.

### Sorting Strings

MDN suggests string comparison to be done like `"a" < "b"`. While this works for English just fine, it fails pretty much every other language there is. According to the rules, the German Umlauts (ä, ö, ü) should come right after their respective vowels: `a, ä, b, o, ö`. But here's where we're facing a problem, because `"ä" > "b"`. You see, the unicode of `b` is 0x62 and `ä` is 0xE4 and `0x62 < 0xE4`. See for yourself:

var list = "ä ba bb bä bz a e è é aa ae b ss sz sa st ß".split(" ");
list.sort(function(a, b) {
return a > b ? 1 : -1;
});

yields

a, aa, ae, b, ba, bb, bz, bä, e, sa, ss, st, sz, ß, ä, è, é

which is not quite what we (Germans, French, …) want.

Sadly MDN's Array#sort() doesn't mention the very handy function String#localeCompare. (I edited MDN to now link to that handy utility…)

For comparing strings you can simply replace `a > b ? 1 : -1` by `a.localeCompare(b)` and your language is back in the game:

var list = "ä ba bb bä bz a e è é aa ae b ss sz sa st ß".split(" ");
list.sort(function(a, b) {
return a.localeCompare(b);
});

yields

a, ä, aa, ae, b, ba, bä, bb, bz, e, é, è, sa, ss, ß, st, sz

which is just fine!

`String#localeCompare` has been around since Javascript 1.2 - pretty much every browser out there knows it - So use it!

Let it be known that `localeCompare` uses the operating system's locale. Which means that if you're running an English (or Scandinavian) System, you won't be seeing the characters sorted according to German (or French, or whatever) rules.

## Sorting DOMElements

A trivial function to sort the children of a DOMElement looks like:

\$.fn.sortChildren = function(compare) {
var \$children = this.children();
\$children.sort(compare);
this.append(\$children);
return this;
};

and would be invoked with

\$('ul').sortChildren(function(a, b) {
return \$(a).text().toLowerCase() > \$(b).text().toLowerCase() ? 1 : -1;
});

While this method works, it's wasting about 70% performance across the board.

## Boosting Performance - jQuery

jQuery itself is a wonderful thing. It makes things really simple. But losing a bit of performance is often the toll for said simplicity. Let's ditch `jQuery#append` and improve performance by 20% - 30%:

\$.fn.sortChildren = function(compare) {
return this.each(function(){
var \$children = \$(this).children();
\$children.sort(compare);
for (var i=0, l=\$children.length; i < l ; i++) {
this.appendChild(\$children[i]);
}
});
};

less jQuery, more performance…

## Comparing Elements

MDN (and most other resources I came across) illustrate sorting with an example like the following:

var list = ["Delta", "alpha", "CHARLIE", "bravo"];
list.sort(function(a, b) {
return a.toLowerCase() > b.toLowerCase() ? 1 : -1;
});

The docs usually fail to inform the reader of the fact that this comparison function is executed multiple times for each element in the array. (I edited MDN to now reflect that fact) Let's take a closer look at what's happening behind the scenes using the comparison function from before:

\$('ul').sortChildren(function(a, b) {
var an = \$(a).text().toLowerCase(),
bn = \$(b).text().toLowerCase();

console.log(an, bn);
return an > bn ? 1 : -1;
});

A list of five elements:

<ul>
<li>delta</li>
<li>bravo</li>
<li>alpha</li>
<li>charlie</li>
<li>echo</li>
</ul>

And Firefox gives us the following output:

delta bravo
delta alpha
bravo alpha
charlie echo
delta charlie
alpha charlie
bravo charlie
delta charlie
delta echo

You can easily see how `charlie` is evaluated 5 times. That means `\$(a).text().toLowerCase()` is executed 5 times for charlie alone. The sorting algorithm used 9 iterations leading to 18 »get text of node and convert to lower case« operations. eighteen, instead of the 5 you'd actually need. Depending on what your comparison function actually does, this can be a tremendous overhead.

## Mapping Before Comparing

We cannot reduce the number of times our comparison function is executed. But we can reduce the work it actually does to a bare minimum. Apparently this is called the Schwartzian Transform (Thank you Gabriel for pointing this out) Instead of evaluating each element on every comparison, we do those evaluations once. For this we create a second array containing the values to sort

var list = ["Delta", "alpha", "CHARLIE", "bravo"],
map = [],
result = [];

for (var i=0, length = list.length; i < length; i++) {
map.push({
index: i, // remember the index within the original array
value: list[i].toLowerCase() // evaluate the element
});
}

// sorting the map containing the reduced values
map.sort(function(a, b) {
return a.value > b.value ? 1 : -1;
});

// copy values in right order
for (var i=0, length = map.length; i < length; i++) {
result.push(list[map[i].index]);
}

// print sorted list
print(result);

Using this technique, we can improve the performance of our little sorting plugin as much as 70% over the original.

Note: I don't know of any (reasonable) way to change to sorting of an array in-place. While `Array#sort` will modify the original array (as well as return itself), the mapped sorting approach described here will create a new array.

## Working On Detached DOM

Discussing the issue with @derSchepp a couple of weeks ago, he threw in the Idea of detaching the nodes we want to sort from the DOM, so we could reduce the number of reflows (which happen when you alter the position of elements within the DOM) to a bare minimum of 2. A simple helper function (for lack of a better name) called `phase()` will help:

\$.fn.phase = function() {
return this.each(function() {
var \$this = \$(this),
placeholder = \$this.data('redetach');

if (placeholder) {
placeholder.parentNode.replaceChild(this, placeholder);
\$this.removeData('redetach');
} else {
placeholder = document.createTextNode(''),
this.parentNode.replaceChild(placeholder, this);
\$this.data('redetach', placeholder);
}
});
};

now we throw that into our little sorting function and get:

\$.fn.sortChildren = function(map) {
return this.each(function() {
var \$this = \$(this).phase(),
\$children = \$this.children(),
_map = [],
length = \$children.length,
i;

for (i = 0; i < length ; i++) {
_map.push({
index: i,
value: map(\$children[i])
});
}

_map.sort(compare);

for (i = 0; i < length ; i++) {
this.appendChild(\$children[_map[i].index]);
}

\$this.phase();
});
};

Note: the `.phase()` function here is just a quick hack without much thought put into it. @cowboy came up with a better solution!

## Benchmarking

You can't optimize what you can't quantify. Let's throw some tests at jsPerf and see how our optimizations panned out:

Surprise: the DOM-detaching thing only benefits Chrome. Firefox and Safari seem to do internal optimizations to prevent reflows on every single DOM mutation. Benefits in Chrome are far less than the losses in Firefox and Safari, so we'll just skip this step. Sorry Schepp :(

## Final Sorting Plugin

In a Gist. Enjoy :)

## Sorting Numbers Within Strings

This Fiddle has some ideas about how to "properly" sort numbers within strings.

### Update

• Modified mapped sorting examples according to @cowboy's suggestion

#### Ben Alman on Tuesday, May 29. 2012:

Rodney, thanks for the excellent article.

I just wanted to mention that you don't need to use a placeholder element when detaching nodes from another node. All you have to do is keep track of the node-to-be-detached's parent and next sibling, and then use insertBefore to replace the detached node.

You can see an example as plain old JS:
https://gist.github.com/938767

And as a jQuery plugin:
https://gist.github.com/978520

(Which reminds me, I really have to publish this plugin)

Array#sort both sorts arrays in-place and also returns the sorted array. Your example creates a new array. Which is not a problem, of course, it's just something that might be worth mentioning in the article.

Also, instead of pushing an array like `[i, map(child)]` onto the `_map` array, I'd recommend pushing an object like `{index: i, sortBy: map(child)}` onto the `_map` array. I know it's a few more bytes, but the object literal keys make the code a litle bit easier to read / learn from / maintain than the more compact array indexed syntax.

Either way, thanks again and keep up the good work!

#### Miller Medeiros on Tuesday, May 29. 2012:

One thing that is important to note is that in Firefox and IE `Array#sort` is stable, which means items keep same relative position if items are already sorted (if both return zero in the `compareFn`) while V8 (Chrome, node.js) does a non-stable sort. It's even more important to notice that the behavior changes based on the amount of elements in the array:

see example

So if you really need a stable sort don't use the `Array#sort`, implement a Merge Sort or use amd-utils/array/sort instead.

Cheers.

#### Jan De Poorter on Tuesday, May 29. 2012:

Wouldn't creating a hash as a reference work better? It'd allow you to sort the array itself, while still avoiding calculations by doing simple hash lookup. I create a fiddle as example.

#### Rodney Rehm on Tuesday, May 29. 2012:

That would work for primitive types. Object properties ("hash keys") are strings. Your example works because you're sorting strings. It would also work for numbers and dates and basically anything with a meaningful `.toString()`.

It would however fail DOMElements and plain objects (for they don't have an automatic and "distinct" string representation). They would all end up using the keys `[object HTMLLIElement]` and `[object Object]`.

#### Jan De Poorter on Tuesday, May 29. 2012:

OK. I see when testing it. I thought JS had a better implementation of hashes :-)

#### Nathan Sweet on Wednesday, May 30. 2012:

Brief aside : If you look at issue tracker for Chrome (http://code.google.com/p/v8/issues/detail?id=90) and the way sort is implemented in V8 (https://github.com/v8/v8/blob/master/src/array.js) you will see that it's actually incorrect that Chrome doesn't use a stable sort. It actually does most of the time. For the effective majority of cases v8 uses a mergesort. In your examples v8 would have probably used a mergesort.

I take umbrage with your statistical claims. I don't know if I should care about a 70% performance gain or not, when you give me no context. A 70% performance gain against what? If it's less than a millisecond and I'm only doing it once, I don't care, I'll absolutely produce more readable code for the trade-off. If it's something I'm doing on a continuous basis, then maybe I care, but why.

The fact that you created a plugin for this is nice, as it obfuscates all this away from us, but have you tested it using a lot of different use cases (huge sets, constantly reused small sets, etc). Was this actually worth it? Is my question I guess. I don't think this article is showing enough homework, I'm not trying to say you didn't do all your homework, but if you did it's not all here.

This comment was made in good faith. Keep up the good work Rodney!

#### Rodney Rehm on Thursday, May 31. 2012:

How do you come to claim "the majority cases" would use MergeSort? The code you pointed to clearly says it's InsertionSort and clearly says it's only invoked for arrays with less than 11 elements. In my book that cannot - by any definition - be the "majority of cases".

If you care about wasting CPU time is up to you. Even if we're talking milliseconds. The point of this post was to show how the comparison function does, what (not so obvious) short-coming it has and how to potentially circumvent the problem.

#### Loke Locus on Wednesday, May 30. 2012:

Sorting has to happen over same type. If you are comparing integer, and Object, you would better handle that in your comparison function in a like manner with a like intention.

#### Anonymole on Wednesday, May 30. 2012:

What's with the unreadable technique of declaring multiple vars on multiple lines separated by commas?

var a, b, c, d;

This becomes difficult to 'instantly' discern what those terms are doing hangin' out in the breeze.

var a; var b; var c; var d;

Provides INSTANT understanding what we have 4 vars.

or

var a, b, c, d;

Again, INSTANT understanding.

But line wrapping using commas - bad form, difficult to instantly interpret and just looks sloppy.

#### austin on Wednesday, May 30. 2012:

another little javascript time sink: array.splice() i found this when making a steganography app in javascript, i had an array with a bunch of elements (4 per pixel for an image) i had a second one which was an array of indicies mapping to the original array which i would pick a random number from 1-length do things to the original array and remove that index(and the next 3) using array.splice() it was horribly horribly slow. i changed it to instead change the value of that element to -1 and then just pick another random number if the element was -1 ugly but a massive speed improvement if you can avoid array.splice() do so

#### Gijs on Friday, June 8. 2012:

Skimming the article, I spent a while being flabbergasted at this bit:

"InsertionSort and BubbleSort (Apple and Mozilla) "

Umm, really? Mozilla uses bubblesort? Really really?

Well, no. As you point out in the introduction, they use (a stable implementation of) mergesort. Might be worth fixing this so other people aren't misled. :-)

#### Rodney Rehm on Friday, June 8. 2012:

Wah, how did that happen? Fixed, Thanks!

#### Willem Mulder on Friday, June 8. 2012:

Hi Rodney, nice article! One suggestion:

for (var i=0, length = list.length; i < length; i++) { }

can be rewritten more easily as

for(var i in list) { }

Also, the DOM insertion 'optimization' that didn't work is because the DOM is not rerendered until running scripts have ended. This has to do with the single-threaded nature of Javascript execution in browsers! Thus, optimizing for DOM changes while the script is running is probably not really going to help as far as rendering is concerned...

#### Rodney Rehm on Friday, June 8. 2012:

Iterating over arrays and objects has one thing to say about `for..in` and arrays: »Don’t use for arrays. It iterates over both array indices and property keys. There will thus be problems as soon as someone adds a property to an array.«

So no, please do NOT use `for (var i in list)` if list is an array.

#### Willem Mulder on Monday, June 11. 2012:

Yes, for...in is an Object iterator, but it's still useful for Arrays. You should just ensure that no Object properties have been added to the Array. But you have the same problem with using arr.length obviously: if someone has set .length to some arbitrary value, the for(... arr.length...) will not work as well. It all comes down to knowing your code and using the right functions accordingly. I really prefer readable code, and I know there's no Object properties, so I really like using for...in..!

#### Bruce Ingalls on Friday, June 8. 2012:

Still doing it wrong! First, assess whether your goal is elegance, efficiency, or time to market. So far, our industry prefers the latter.

Wouldn't these DOM elements you wish to sort, require HTML entities for extended ASCII characters? Thus, the overhead of a Locale compare should not be needed.

Unfortunately, schools seem to only teach general purpose sorts. You missed that you have a targeted case, which fits a distribution sort

#### Gabriel Abejon on Friday, June 8. 2012:

The map, sort, map idiom to reduce the cost of sorting on calculated values is often called a Schwartzian transform

#### Brian Cardarella on Friday, June 8. 2012:

Wow, excellent post.

#### Dave Tonge on Saturday, June 9. 2012:

The UnderscoreJS sortBy function is a good example of mapping before sorting:

``````_.sortBy = function(obj, val, context) {
var iterator = _.isFunction(val) ? val : function(obj) { return obj[val]; };
return _.pluck(_.map(obj, function(value, index, list) {
return {
value : value,
criteria : iterator.call(context, value, index, list)
};
}).sort(function(left, right) {
var a = left.criteria, b = right.criteria;
if (a === void 0) return 1;
if (b === void 0) return -1;
return a &lt; b ? -1 : a &gt; b ? 1 : 0;
}), 'value');
};
``````

Using a functional helper library like underscore often makes for much more readable code.

#### Kevin Ball on Monday, July 2. 2012:

Underscore's functional approach helps in so many ways... I find myself using it for more and more things that I used to use jQuery for, and not only do I end up with better, more maintainable code, but it is usually easier to write in the first place.

#### Creage on Saturday, June 9. 2012:

Strange that nobody here mentioned the Tinysort plugin for jQ - http://tinysort.sjeiti.com

Accoding to sources it uses mapping already, and provides much more abilities on sorting collections in jQ.

#### Brian on Friday, June 15. 2012:

Regarding for (var i=0, length = list.length; i < length; i++) { }, I agree that for (var i in list) isn't a good idea... but what about Array.forEach()?

Switching from a basic Array.sort() here cut the time in this code in half (11ms to 6ms) for even a small number of items (8) in the CONSTANTS.COLORS object. (\$.l() is a gettext equivalent)

``````        var colors    = Object.keys(CONSTANTS.COLORS)
, colorsMap = []
;

colors.forEach(function (color, i) {
colorsMap.push({ index: i
, value: \$.l(color).toLowerCase()
});
});

colorsMap.sort(function (a, b) {
return a.value &gt; b.value ? 1 : -1;
}).map(function populate_colors_list (map) {
var colorItem   = colors[map.index]
, color       = CONSTANTS.COLORS[colorItem]
, lsItemName  = 'caaBatch_colors_' + colorItem
, \$thisOption = \$make('option', { 'class' : 'colorOption'
, value   : colorItem
}).data('default', color)
.text(\$.l(colorItem));
if (null === localStorage.getItem(lsItemName)) {
localStorage.setItem(lsItemName, color);
}
colorOptions.push(\$thisOption);
});
``````

#### Nicholas C. Zakas on Wednesday, August 1. 2012:

I just wanted to say excellent write up. This is definitely piqued my interest in sorting performance.

#### Nicholas C. Zakas on Wednesday, August 1. 2012:

Excellent post. I just wanted to point out a slight issue with your comparators. The function you pass to Array#sort() should return three different conditions: a negative number if a < b, a positive number if a > b, and zero if a == b. In your examples, you always return either a positive or negative number, which means that two of the three cases are being handled in the same way, altering the outcome of sore.

#### Rodney Rehm on Wednesday, August 1. 2012:

I had hoped the following (third sentence below the video) would've clarified my simplification: »Since a is already positioned before b, returning 0 and -1 yield the same result, which usually allows to remove the extra equality comparison and end up with a > b ? 1 : -1.« But you're right, obviously.

#### Michał Czernow on Saturday, August 4. 2012:

Array.forEach() is slower than for(){} as Nicolas Zakas has pointed out in his book abot efficient JS.

About preventing reflows. You can use documentFragment or just make UL display: none - and than operates on them.

#### Gajus Kuizinas on Monday, September 3. 2012:

Hello Rodney, I've read your article as part of the research I've made when creating jQuery table-sort plugin. It is a table sorting plugin that can sort tables with up-to 100,000 rows (didn't dare to test with more rows) within 150ms. Thank you for taking the time to blog about your experience.

#### josh on Monday, September 10. 2012:

If you're going to do case conversion to get a case-insensitive comparison, use UPPERCASE. Otherwise you won't match "ß" to "SS", eg. Still not as good as a real case-insensitive comparison, but what can you do?

#### Rodney Rehm on Monday, September 10. 2012:

Not sure what you're getting at. (At least) Firefox 15 reports

``````'ß'.toLowerCase() === 'ß';
'ß'.toUpperCase() === 'ß';
``````