stable sort proposal
Just some nitpicking from an API point of view:
Array.prototype.sort(compareFn, isStable = false)
should probably be something like:
Array.prototype.sort(compareFn, options = {})
(Because who knows what extensions will be proposed in the future.)
Am 10.03.2016 um 18:59 schrieb 森建:
In my honest opinion, there's not much reason to just require the sort to be stable. Some engines have done this in the past, and the spec technically allows it. At this point, stable sorts are about as fast as unstable ones, both in theory and practice (wasn't the case 10 years ago IIRC).
arr.indexOf(arguments[0]) - arr.indexOf(arguments[1])
This will work only if arr
is a copy of an array which you are trying to sort, because the algorithm may swap those elements and so reposition them relative to other elements ...
Stable sort is useful sometimes, but it is possible to implement it in js.
On Fri, Mar 11, 2016 at 8:17 PM, Isiah Meadows <isiahmeadows at gmail.com> wrote:
In my honest opinion, there's not much reason to just require the sort to be stable. Some engines have done this in the past, and the spec technically allows it. At this point, stable sorts are about as fast as unstable ones, both in theory and practice (wasn't the case 10 years ago IIRC).
I think you meant "to not just require", yes? As in, you think the spec should require .sort() to be stable?
On Sun, Mar 13, 2016 at 12:23 PM, Vic99999 <vic99999 at yandex.ru> wrote:
Stable sort is useful sometimes, but it is possible to implement it in js.
Most things in the JS standard library are possible in userland code, so that's not a useful rebuttal to anything by itself. The question is whether it's worth requiring that the standard sort be stable. Stable sort is never bad - there are no use-cases where a stable sort gives the wrong answer while an unstable sort is correct, but the opposite is definitely true. The only question is how much slower/more expensiver stable sorting is than unstable sorting, and whether that cost is sufficient to outweigh the usefulness of a stable sort.
On Mon, Mar 14, 2016 at 2:16 PM, Tab Atkins Jr. <jackalmage at gmail.com>
wrote:
On Fri, Mar 11, 2016 at 8:17 PM, Isiah Meadows <isiahmeadows at gmail.com> wrote:
In my honest opinion, there's not much reason to just require the sort to be stable. Some engines have done this in the past, and the spec technically allows it. At this point, stable sorts are about as fast as unstable ones, both in theory and practice (wasn't the case 10 years ago IIRC).
I think you meant "to not just require", yes? As in, you think the spec should require .sort() to be stable?
On Sun, Mar 13, 2016 at 12:23 PM, Vic99999 <vic99999 at yandex.ru> wrote:
Stable sort is useful sometimes, but it is possible to implement it in js.
Most things in the JS standard library are possible in userland code, so that's not a useful rebuttal to anything by itself. The question is whether it's worth requiring that the standard sort be stable. Stable sort is never bad - there are no use-cases where a stable sort gives the wrong answer while an unstable sort is correct, but the opposite is definitely true. The only question is how much slower/more expensiver stable sorting is than unstable sorting, and whether that cost is sufficient to outweigh the usefulness of a stable sort.
ISTM a symbol could be defined to grab a handle on the implementation's
fastest possible sort logic. If the implementation has a substantially
faster unstable sort algo they'd surface it here -- otherwise this would
just delegate to the implementation's stable sort. This could be
Array.prototype.sort
, or it could also be a spec'd symbol which
Array.prototype.sort
delegates to. (Or this could all be done w/ modules,
although modules might complicate what's otherwise pretty simple to
polyfill).
But fastest-possible (likely unstable) sort logic should be opt-in, not
opt-out. If the perf difference is that relevant to your use case you
probably don't want off-the-shelf Array.prototype.sort
anyway.
@Tab Atkins Jr.,
The only question is how much slower/more expensiver stable sorting is than unstable sorting, and whether that cost is sufficient to outweigh the usefulness of a stable sort. My own experiment shows, that QuickSort (unstable) is ~20% (or more) faster
Yeah...you got the intent, though. :)
Missed the list...
---------- Forwarded message --------- From: Isiah Meadows <isiahmeadows at gmail.com>
Date: Tue, Mar 15, 2016, 07:42 Subject: Re: stable sort proposal To: Vic99999 <vic99999 at yandex.ru>
What about the Timsort? It's used in Android's Java implementation, as well as Python since 2.3. I don't know that much about the performance aspects of it outside it requiring O(n) memory (compared to QuickSort's O(log n)) and being adaptive (QuickSort is not).
On Tue, Mar 15, 2016 at 12:00 AM, Vic99999 <vic99999 at yandex.ru> wrote:
@Tab Atkins Jr.,
The only question is how much slower/more expensiver stable sorting is than unstable sorting, and whether that cost is sufficient to outweigh the usefulness of a stable sort. My own experiment shows, that QuickSort (unstable) is ~20% (or more) faster on "random" array of integers (a good case for QuickSort). Probably, there are many Chrome users who sort large integer arrays, and so Chrome devs will never switch to a stable sort. (see comments at bugs.chromium.org/p/v8/issues/detail?id=90 )
I bet the chrome team would switch it were spec'd that way -- this would put them on an even playing field with other implementations. But yes, 20+% definitely makes the case that if stable sort were spec'd, a hook to get to at any faster unstable sort should be specified as well.
But again, unstable sort should be opt-in, used by those that fully grok the implications. Devs who don't really know the difference (a large proportion, IME) are very likely to expect stable sort semantics. Correctness should trump performance for the default case.
What about the Timsort?
I cannot believe it will be faster on random int array. And TimSort is base on MergeSort and, seems, for it's worst cases it cannot be better than MergeSort. I have tried mziccard/node-timsort with my old node.js - 0.10.4 and Chrome 49 (win32) - and I see that random int array case is much slower that native in Chrome, and in node.js too if I replace "native" with a function from v8/v8/blob/master/src/js/array.js .
Perhaps, implementers will want to leave the behaviour of array.sort(comparefn)
as it was for backward compatiblity.
On Tue, Mar 15, 2016 at 8:50 AM, Vic99999 <vic99999 at yandex.ru> wrote:
What about the Timsort?
I cannot believe it will be faster on random int array. And TimSort is base on MergeSort and, seems, for it's worst cases it cannot be better than MergeSort. I have tried mziccard/node-timsort with my old node.js - 0.10.4 and Chrome 49 (win32) - and I see that random int array case is much slower that native in Chrome, and in node.js too if I replace "native" with a function from v8/v8/blob/master/src/js/array.js .
Perhaps, implementers will want to leave the behaviour of
array.sort(comparefn)
as it was for backward compatiblity.
There's no back-compat impact for switching to a stable sort; since you can't depend on the ordering of an unstable sort in the first place, changing that order (to stable) is fine. (Most likely it'll fix pages that are currently sometimes broken in small ways because they assume stability.) It's just potentially a minor speed drop.
How about an Array.prototype.stableSort(comparator?)
method? Several
languages already have something like this, anyways.
(Speaking of bugs related to unstable sorts, my blog has that problem as well - unstable sort resulting in incorrect order.)
Why should you have to opt into to stable sort? Why not the other way
around? Array.prototype.fastestSort
or some such -- or better yet, spec a
symbol for it that falls back to Array.prototype.sort
for implementations
that don't expose a faster unstable variety.
It could be the other way around. Maybe Array.prototype.sort
and
Array.prototype.fastSort
. The former is required to be stable, but the
latter wouldn't.
A polyfill could check if A.p.sort
is stable, replacing if necessary, and
alias the old one to A.p.fastSort
if it doesn't exist.
Isiah Meadows wrote:
A polyfill could check if
A.p.sort
is stable, replacing if necessary, and alias the old one toA.p.fastSort
if it doesn't exist.
How does one check/test for the stability of a black-box sort? You can't, afaik.
In my opinion, you'll never be able to rely on the stability of
Array.prototype.sort
because of backward-compatibility with older
implementations where it is unstable.
As ugly as it might be, I'd recommend a separate
Array.protytype.stableSort
(with an unambiguous name) therefore that
can be tested for existence and polyfilled in absence.
Or at least we'd need to tag the available implementations explicitly:
Array.prototype.fastSort[Symbol.isStable] = false;
Array.prototype.stableSort[Symbol.isStable] = true;
Array.prototype.sort[Symbol.isStable] = /* to be chosen by implementer */;
which has the benefit that a test if (…sort[Symbol.isStable])
will
yield a falsy default value (undefined
) in legacy implementations.
, Bergi
I'm mostly neutral on this, other than that a stable sort API should exist,
and it shouldn't involve changing how the first argument of
Array.prototype.sort
is processed.
On Thu, Mar 17, 2016 at 7:21 PM, Bergi <a.d.bergi at web.de> wrote:
Isiah Meadows wrote:
A polyfill could check if
A.p.sort
is stable, replacing if necessary, and alias the old one toA.p.fastSort
if it doesn't exist.How does one check/test for the stability of a black-box sort? You can't, afaik.
In my opinion, you'll never be able to rely on the stability of
Array.prototype.sort
because of backward-compatibility with older implementations where it is unstable.As ugly as it might be, I'd recommend a separate
Array.protytype.stableSort
(with an unambiguous name) therefore that can be tested for existence and polyfilled in absence. Or at least we'd need to tag the available implementations explicitly:
Array.prototype.fastSort[Symbol.isStable] = false; Array.prototype.stableSort[Symbol.isStable] = true; Array.prototype.sort[Symbol.isStable] = /* to be chosen by implementer */;
which has the benefit that a test
if (…sort[Symbol.isStable])
will yield a falsy default value (undefined
) in legacy implementations.
If you're planning on pessimistically assuming that legacy implementations use an unstable sort for Array#sort(), then testing for the presence of Array#fastSort() (and assuming that when it appears the Array#sort is stable) is exactly as useful as testing for the presence of Array#stableSort (and assuming that when it appears the Array#sort is unstable/fast). So, polyfilling isn't a reason to prefer one default vs the other.
On 03/18/2016 11:10, Tab Atkins Jr. wrote:
If you're planning on pessimistically assuming that legacy implementations use an unstable sort for Array#sort(), then testing for the presence of Array#fastSort() (and assuming that when it appears the Array#sort is stable) is exactly as useful as testing for the presence of Array#stableSort (and assuming that when it appears the Array#sort is unstable/fast). So, polyfilling isn't a reason to prefer one default vs the other.
That makes no sense. The presence of fastSort does not indicate that sort is stable.
The approach of sometimes using "sort" for unstable sort and sometimes for stable sort would cause too much confusion. Which sort are you getting when you call sort? If you want a sort that's guaranteed stable, call stableSort.
The argument that stable sort should have the shorter name doesn't hold much water either. C++ defines sorts named sort and stable_sort (as well as a few others) just fine. sort is by far the more popular one (by a factor of 20!) because most applications don't actually care about stability.
Waldemar
On Fri, Mar 18, 2016 at 3:57 PM, Waldemar Horwat <waldemar at google.com> wrote:
On 03/18/2016 11:10, Tab Atkins Jr. wrote:
If you're planning on pessimistically assuming that legacy implementations use an unstable sort for Array#sort(), then testing for the presence of Array#fastSort() (and assuming that when it appears the Array#sort is stable) is exactly as useful as testing for the presence of Array#stableSort (and assuming that when it appears the Array#sort is unstable/fast). So, polyfilling isn't a reason to prefer one default vs the other.
That makes no sense. The presence of fastSort does not indicate that sort is stable.
I'm confused. Did you miss some context here? No UA currently provides an Array#fastSort. We'd only do so if the method was added to the spec; such a method would only be added to the spec if we also changed the spec to require Array#sort to be stable.
So yes, the presence of Array#fastSort is a positive signal for Array#sort being stable. Just like the presence of Array#stableSort is a positive signal for Array#sort not being stable. Exception proves the rule, etc.
The approach of sometimes using "sort" for unstable sort and sometimes for stable sort would cause too much confusion. Which sort are you getting when you call sort? If you want a sort that's guaranteed stable, call stableSort.
Again, in the context that I was replying to, we're talking about a polyfill. Said polyfill would ensure that legacy implementations got a stable sort filled in for Array#sort.
The argument that stable sort should have the shorter name doesn't hold much water either. C++ defines sorts named sort and stable_sort (as well as a few others) just fine. sort is by far the more popular one (by a factor of 20!) because most applications don't actually care about stability.
Or because most people reach for the familiar short name by default. Usability directly impacts use, and most coders are familiar with only a small fraction of the standard library. I'd bet quite a lot of applications actually do care about stability, but such situations only crop up rarely in practice, and so they're just occasionally buggy.
Array#sort
is stable in Chrome 70.
twitter.com/mathias/status/1036626116654637057
All modern browsers (and IE 11) have stable sort with Array#sort
. Would you like to mention Array#sort
must be a stable sort on specification (>=ES2019)?
+1
it would avoid stable-sort workarounds like this [1], [2] for javascript-database implementations.
[1] testable, deterministic, stable-sort workaround for querying rows from db-lite kaizhu256/node-db-lite/blob/2018.4.23/lib.db.js#L1010, kaizhu256/node-db-lite/blob/2018.4.23/lib.db.js#L1010
[2] accompanying integration-tests to ensure reproducible, sorted db-queries kaizhu256/node-db-lite/blob/2018.4.23/test.js#L64, kaizhu256/node-db-lite/blob/2018.4.23/test.js#L64
kai zhu kaizhu256 at gmail.com
It seems like stable sort will be discussed in the 66th meeting of TC39. I hope for good news!
EcmaScript hasn't have the stable sort method yet. www.ecma-international.org/ecma-262/6.0/#sec-array.prototype.sort
Sure, we can create stable sort function by using
Array#sort
, by returning the number of index (e.g.arr.indexOf(arguments[0]) - arr.indexOf(arguments[1])
) when it is equal to both arguments value incomparefn
. But this is inefficient and perplexing.So, I propose below.
Array#stableSort(comparefn)
or
Array#sort(comparefn, isStable = false)