Array#sort(prop)
Peter van der Zee wrote:
No idea whether this has been discussed before, but I find myself continuously doing this when sorting arrays with objects:
arr.sort(function(a,b){ if (a.prop< b.prop) return-1; if (a.prop> b.prop) return 1; return 0; });
With arrows it is better:
arr.sort((a, b) => { if (a.prop< b.prop) return-1; if (a.prop> b.prop) return 1; return 0; });
With ?: it's even better:
arr.sort((a, b) => (a.prop< b.prop) ?-1 : (a.prop> b.prop) ? 1 : 0);
and if you weed out NaN to avoid the comparison function returning NaN, while taking advantage of the fact that the return value's sign is all that matters, not exact {-1, 0, 1} membership, then it's not much worse:
arr.sort((a, b) => { let t = a.prop - b.prop; return (t !== t) ? 0 : t; });
I'm assuming you don't need to sort -0< 0 (which evaluates to false in JS).
Another observation: this last version is the only one that avoids multiple implicit conversions via toString or valueOf of an object operand, if one or both of a and b is an object and the first condition (a.prop< b.prop) evaluates to false. This means the only portable form is the last version.
Yeah, it's a pain to write every time and people will mess up the NaN corner case. A standard built-in would be good. But why use a string parameter key instead of just a built-in function?
On 01/04/12 18:49, Peter van der Zee wrote:
No idea whether this has been discussed before, but I find myself continuously doing this when sorting arrays with objects:
arr.sort(function(a,b){ if (a.prop< b.prop) return -1; if (a.prop> b.prop) return 1; return 0; });
Couldn't we add an optional string argument to Array#sort that does this for us? If supplied do the above, otherwise behave as usual. If there's already a first arg planned for .sort (I haven't kept up), make it the second arg.
arr.sort('prop');
I believe this would be better left for a higher-order helper function, given the sort function already receives a predicate:
function compareProperty(name) { return function(a, b){ return a[name]< b[name]? -1 : a[name] !== b[name]? 1 : /* otherwise */ 0 }}
arr.sort(compareProperty('prop'))
So, this keeps the `sort' function's semantics simple and straight-forward.
This is all pretty straightforward library work. With Narwhal’s "util" module and with my previous work on Chiron, I made some small composable tools, in much the same spirit as Jeremy Ashkenas’s work on Underscore. Consider two higher order functions.
by(relation) accepts a relation (a function that accepts a value and returns a related value) and returns a comparator.
get(name) returns a relation function that retrieves the named property of an object.
These compose like:
points.sort(by(get("x"))) points.sort(by(distanceFrom({x:10,y:20})))
So that much can be done in libraries.
What can’t be done without shimming the sort function is a Schwartzian transformation. This is where you do a map/sort/map to reduce applications of the relation, which can double the performance of a sort in the most degenerate case: large completely unsorted array and a costly relation. Some years ago I found that the optimization was beneficial for arrays longer than 3 for a getter relation.
function schwartzianSort(values, relation, compare) { return values.map(function (value) { return [relation(value), value]; }).sort(function (a, b) { return compare(a[0], b[0]); }).map(function (pair) { return pair[1]; }); }
To facilitate this and the above notation, the comparator returned by the "by" function was annotated with "compare" and "by" properties which were the original relation and comparator. ("by" accepted an alternate comparator as a second argument; the default had sensible semantics for all same-type pairs of primordials and did not coerce strings to numbers, a noteworthy wart of the existing specified default comparator).
The alternate "sort" (in-place) and "sorted" (sorted copy) functions would accept any object with a "by" property (and optional "compare" property) would invoke the Schwartzian transform.
Adding support for this optimization to Array#sort would be interesting, and the interface might provide an opportunity to allow users to opt-in for a more sensible default comparator. If "sort" receives an object with "compare" or "by" properties, it could go through the new code-path. Passing a non-function object would guarantee the more sensible default "compare".
A more sensible default "compare" would sort strings lexically, and sort arrays with left-to-right precedence and recurse on each element.
Sensible invocations:
values.sort({by:relation}) values.sort({by:relation,compare:altCompare}) values.sort({compare}) // if default compare is added in global scope values.sort({}) // default comparator
Kris Kowal
On Sun, Apr 1, 2012 at 3:53 PM, Brendan Eich <brendan at mozilla.org> wrote:
arr.sort((a, b) => (a.prop< b.prop) ?-1 : (a.prop> b.prop) ? 1 : 0);
and if you weed out NaN to avoid the comparison function returning NaN, while taking advantage of the fact that the return value's sign is all that matters, not exact {-1, 0, 1} membership, then it's not much worse:
arr.sort((a, b) => { let t = a.prop - b.prop; return (t !== t) ? 0 : t; });
Quoting from es5.github.com/#x15.4.4.11, the requirements on the
comparefn are:
A function comparefn is a consistent comparison function for a set of values S if all of the requirements below are met for all values a, b, and c (possibly the same value) in the set S: The notation a <CF b means comparefn(a,b) < 0; a =CF b means comparefn(a,b) = 0 (of either sign); anda >CF b means comparefn(a,b) > 0.
Calling comparefn(a,b) always returns the same value v when given a specific pair of values aand b as its two arguments. Furthermore, Typees5.github.com/#Type (v) is Number, and v is not NaN. Note that this implies that exactly one of a <CF b, a =CF b, and a >CF b will be true for a given
pair of a and b.
Calling comparefn(a,b) does not modify the this object.
a =CF a (reflexivity)
If a =CF b, then b =CF a (symmetry)
If a =CF b and b =CF c, then a =CF c (transitivity of =CF)
If a <CF b and b <CF c, then a <CF c (transitivity of <CF)
If a >CF b and b >CF c, then a >CF c (transitivity of >CF)
NOTE
The above conditions are necessary and sufficient to ensure that *comparefn
- divides the setS into equivalence classes and that these equivalence classes are totally ordered.
If sort is called with a comparefn which violates this constraint, according to normative text
If comparefn is not undefined and is not a consistent comparison function for the elements of this arrayes5.github.com/#array-element (see
below), the behaviour of sort is implementation-defined.
a conforming implementation may kill the user, lose memory safety, or launch the nukes.
Since in this note we're not concerned about new syntax but rather the notion of "implementation-defined", let's rewrite Brendan's two sort functions in ES5:
function c1(a, b) { return a < b ? -1 : a > b ? 1 : 0; } function c2(a, b) { var t = a - b; return t !== t ? 0 : 1; }
Both of these violate transitivity of =CF
c1(3,NaN); 0 c1(NaN,4); 0 c1(3,4); -1 c2(3,NaN) 0 c2(NaN,4) 0 c2(3,4) 1
Fortunately, I have not tried passing either of these to sort, and so have lived to tell the tale.
Let's kill "implementation-defined" rather than our users.
On Sun, Apr 1, 2012 at 6:56 PM, Mark S. Miller <erights at google.com> wrote:
On Sun, Apr 1, 2012 at 3:53 PM, Brendan Eich <brendan at mozilla.org> wrote:
arr.sort((a, b) => (a.prop< b.prop) ?-1 : (a.prop> b.prop) ? 1 : 0);
and if you weed out NaN to avoid the comparison function returning NaN, while taking advantage of the fact that the return value's sign is all that matters, not exact {-1, 0, 1} membership, then it's not much worse:
arr.sort((a, b) => { let t = a.prop - b.prop; return (t !== t) ? 0 : t; });
Quoting from es5.github.com/#x15.4.4.11, the requirements on the comparefn are:
A function comparefn is a consistent comparison function for a set of values S if all of the requirements below are met for all values a, b , and c (possibly the same value) in the set S: The notation a <CF b means comparefn(a,b) < 0; a =CF b means comparefn(a,b) = 0 (of either sign); anda >CF b means comparefn(a,b) > 0.
Calling comparefn(a,b) always returns the same value v when given a specific pair of values aand b as its two arguments. Furthermore, Type es5.github.com/#Type(v) is Number, and v is not NaN. Note that this implies that exactly one of a <CF b, a =CF b, and * a* >CF b will be true for a given pair of a and b.
Calling comparefn(a,b) does not modify the this object.
a =CF a (reflexivity)
If a =CF b, then b =CF a (symmetry)
If a =CF b and b =CF c, then a =CF c (transitivity of =CF)
If a <CF b and b <CF c, then a <CF c (transitivity of <CF)
If a >CF b and b >CF c, then a >CF c (transitivity of >CF)
NOTE
The above conditions are necessary and sufficient to ensure that * comparefn* divides the setS into equivalence classes and that these equivalence classes are totally ordered.
If sort is called with a comparefn which violates this constraint, according to normative text
If comparefn is not undefined and is not a consistent comparison function for the elements of this arrayes5.github.com/#array-element (see below), the behaviour of sort is implementation-defined.
a conforming implementation may kill the user, lose memory safety, or launch the nukes.
Since in this note we're not concerned about new syntax but rather the notion of "implementation-defined", let's rewrite Brendan's two sort functions in ES5:
function c1(a, b) { return a < b ? -1 : a > b ? 1 : 0; } function c2(a, b) { var t = a - b; return t !== t ? 0 : 1;
Oops, typo. The "1" immediately above should be "t". But this doesn't affect the point.
Ok, easy goof (I was boarding a plane -- that's my excuse!). But easy to fix too.
Point is, as Peter posted, this is hard to get right if one has to write it from scratch. Finding and downloading a standard library? Why not make that the built-in all implementations provide?
Mark S. Miller wrote:
Quoting from es5.github.com/#x15.4.4.11, the requirements on the comparefn are:
A function /comparefn/ is a consistent comparison function for a set of values /S/ if all of the requirements below are met for all values /a/, /b/, and /c/ (possibly the same value) in the set /S/: The notation /a/ <_CF /b/ means /comparefn/(/a/,/b/) < 0; /a/ =_CF /b/ means /comparefn/(/a/,/b/) = 0 (of either sign); and/a/ >_CF /b/ means /comparefn/(/a/,/b/) > 0. Calling /comparefn/(/a/,/b/) always returns the same value /v/ when given a specific pair of values /a/and /b/ as its two arguments. Furthermore, Type <http://es5.github.com/#Type>(/v/) is Number, and /v/ is not NaN. Note that this implies that exactly one of /a/ <_CF /b/, /a/ =_CF /b/, and /a/ >_CF /b/ will be true for a given pair of /a/ and /b/. Calling /comparefn/(/a/,/b/) does not modify the *this* object. /a/ =_CF /a/ (reflexivity) If /a/ =_CF /b/, then /b/ =_CF /a/ (symmetry) If /a/ =_CF /b/ and /b/ =_CF /c/, then /a/ =_CF /c/ (transitivity of =_CF ) If /a/ <_CF /b/ and /b/ <_CF /c/, then /a/ <_CF /c/ (transitivity of <_CF ) If /a/ >_CF /b/ and /b/ >_CF /c/, then /a/ >_CF /c/ (transitivity of >_CF ) *NOTE* The above conditions are necessary and sufficient to ensure that /comparefn/ divides the set/S/ into equivalence classes and that these equivalence classes are totally ordered.
If sort is called with a comparefn which violates this constraint, according to normative text
If /comparefn/ is not *undefined* and is not a consistent comparison function for the elements of this array <http://es5.github.com/#array-element> (see below), the behaviour of |*sort*| is implementation-defined.
a conforming implementation may kill the user, lose memory safety, or launch the nukes.
...
Let's kill "implementation-defined" rather than our users.
IIRC there was already this topic on the list some time ago.
BTW, is it so hard just to change the spec to something like:
If /comparefn/ is not undefined and is not a consistent comparison function for the elements of this array es5.github.com/#array-element (see below), the
behaviour of |sort| can be any of:
- throwing an exception raised in comparator function by throw statement, leaving array in one of its permutations*
- changing array to one of its permutations* and returning the array
- infinite loop
- stack overflow exception / out of memory exception, leaving array in one of its permutations*
- throwing TypeError, leaving array in one of its permutations*
at the discretion of the implementor.
- not changing the array at all also counts as "leaving the array in of its permutations"
No idea whether this has been discussed before, but I find myself continuously doing this when sorting arrays with objects:
arr.sort(function(a,b){ if (a.prop < b.prop) return -1; if (a.prop >
b.prop) return 1; return 0; });
Couldn't we add an optional string argument to Array#sort that does this for us? If supplied do the above, otherwise behave as usual. If there's already a first arg planned for .sort (I haven't kept up), make it the second arg.
arr.sort('prop'); =>
arr.sort(function(a,b){ if (a[prop] < b[prop]) return -1; if (a[prop]