short-circuiting Array.prototype.reduce
My initial feedback is that this needs a lot more "why" in comparison to the "how". The only inkling of why this might be useful is an unsourced assertion that it's done in Clojure, for unknown reasons. The example code isn't very compelling either; something more real-world would be good there.
On Sun, Feb 22, 2015 at 10:11 PM, Domenic Denicola <d at domenic.me> wrote:
My initial feedback is that this needs a lot more "why" in comparison to the "how".
The technical reason for this, I guess, is that JS doesn't have TCP (Tennent's Correspondence Principle) blocks that would allow you to stop iteration and exit the reduce context right from the callback context. With TCP it would be a return statement; in JS we have to emulate it by throwing a special "marker" exception, which is then caught and analyzed.
From the practical perspective, yeah, it would be good to see real examples of how useful the feature is. The underlying rationale -- "exit from a higher-order iteration as soon as possible in a convenient way" -- sounds reasonable and practical on its own, but concrete examples would still help.
However, I'd say JS more broadly needs some sort of TCP blocks, which would solve other similar cases (and which, I'm pretty sure, were discussed several times a couple of years ago). E.g. in Ruby this example would be much easier, and that @@reduced would be just a simple, intuitive return.
Dmitry
We still have an option open to us that would merely, compatibly, remove a restriction from the language rather than add a feature: allow labels to remain visible across function boundaries, or at least across arrow function boundaries. Then one could break-to-label to a label of a lexically enclosing function.
This has repeatedly died in committee before because it does raise another case for implementors: once the execution context containing that label has completed, an attempt to break to that label must instead throw an error. I've never understood why this extra case was a show-stopper.
Previous iterations of the proposal never raised the possibility of restricting this new ability to arrow functions. I don't see why we should, but I can see why it would seem more parsimonious -- since arrow functions are trying to be much more TCP than other functions. Perhaps that restriction will make this more palatable? In any case, even if we first allow this only for arrow functions, we'd still have the option to compatibly remove the restriction for other functions later.
My other suspicion: the previous failure of this proposal came before many people had much hands-on experience using higher-order functions in JS as a normal alternative to control structures. Now that we all have, the need for a non-local escape may be more visceral.
We actually explored this alternative quite extensively prior to adopting arrow functions and had "block lambdas" with label support pretty much fully spec'ed. What killed them for me wasn't the issue of BLs that outlived their home function; that was easy enough to deal with. The major problem was the complexity involved in making labelled early-exit syntax (break and particularly continue) work as might be expected when used in the context of control-abstraction functions. For example, consider:
for (v of O) {
  CtlLib.doWhile(()=>true, () {
    if (condition1(v)) break;
    if (condition2(v)) continue;
    //do something here
  });
}
or
outer: for (v of O) {
  inner: CtlLib.doWhile(()=>true, () {
    if (condition1(v)) break outer;
    if (condition2(v)) continue inner;
    //do something here
  });
}
What should happen for the 'break' and 'continue'? What will a programmer think should happen? If we really are claiming TCP for such arrows, then presumably programmers should be thinking about abstractions like CtlLib.doWhile as being on the same footing as built-in syntactic control structures such as the 'while' statement. In that case, a reasonable (more strongly put, necessary) expectation is that unlabeled break/continue must refer to the closest enclosing control structure, which is the CtlLib.doWhile.
To make that work, the control-abstraction function must have an opportunity to define what it means for it to 'continue'. Accomplishing that requires coming up with a way to reify the specification's "completion value" mechanism and labeling semantics so that JS code can intercept and respond to them. For example, it probably requires something like a labelledApply(labelSet, thisValue, args) function for invoking a function with the intent of receiving a reified completion value. There were also issues relating to possible interactions between labeled syntactic control structures used by the control-abstraction function itself and the use of labels and early exits by the consumer of the control abstraction.
This would have added significant complexity to the language. It also adds significant complexity for the implementors of control abstraction functions.
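To make the cost concrete, here is a rough, runnable userland approximation of what a control-abstraction author has to do today with marker exceptions (all names here are mine, purely illustrative -- nothing like this was spec'ed):

const BREAK = Symbol('break');
const CONTINUE = Symbol('continue');

function doWhile(cond, body) {
  while (cond()) {
    try {
      body();
    } catch (e) {
      if (e === BREAK) return;      // treat as the consumer's "break"
      if (e === CONTINUE) continue; // treat as the consumer's "continue"
      throw e;                      // real errors still propagate
    }
  }
}

// The consumer throws the markers where real break/continue syntax
// would otherwise be used:
doWhile(() => true, () => {
  if (Math.random() < 0.1) throw BREAK;
});

A reified completion-value mechanism would have to cover all of this, plus labels, plus the interactions described above.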
For me, at least, this passed the point of diminishing returns.
There is a fork in the road of language design: you can take the fork of C-style syntactic control structures with labeled early exits, or you can take the path of user-definable control-abstraction functions built upon TCP. Each path works well within the bounds of its particular limitations. But trying to rejoin those two paths in a manner that combines all the features of both leads you into a complexity swamp.
On 23 February 2015 at 16:22, Mark S. Miller <erights at google.com> wrote:
We still have an option open to us that would merely, compatibly, remove a restriction from the language rather than add a feature: allow labels to remain visible across function boundaries, or at least across arrow function boundaries. Then one could break-to-label to a label of a lexically enclosing function.
This has repeatedly died in committee before because it does raise another case for implementors: once the execution context containing that label has completed, an attempt to break to that label must instead throw an error. I've never understood why this extra case was a show-stopper.
I opposed it (and still firmly do) not just because this has considerable implementation complexity (each call could then have an arbitrary and unknown number of exit points, not just two, and I don't even want to imagine what that would do to our control graph) -- that is just a consequence of the real problem, which is that this is an unstructured form of non-local control transfer, a half-way, half-broken form of continuations. If you want continuations, add real continuations.
On Sun, Feb 22, 2015 at 11:36 PM, Dmitry Soshnikov <dmitry.soshnikov at gmail.com> wrote:
The technical reason for this, I guess, is that JS doesn't have TCP blocks that would allow you to stop iteration and exit the reduce context right from the callback context. With TCP it would be a return statement; in JS we have to emulate it by throwing a special "marker" exception, which is then caught and analyzed.
I don't see the relation to the OP's problem. A break would be a very poor device for exiting a reduce, since it doesn't allow you to pass a result (you'd have to funnel that through assigning some outer variable). Also, I don't see why you can't do the same with a throw today.
What Lee really wants is a partial reduce. I think this could be provided easily without magic symbols by using the encoding of an Either type, like e.g. iterators already do (though it's used here in the inverse direction). For example, you could imagine the following signature:
reducePartial<A, B>(f: (acc: A, val: B, key: number) => {value: A, done: boolean}, init: A): A
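A minimal userland sketch of that signature (written here as a plain function rather than a method; the shape follows Andreas's signature above, nothing is spec'ed):

function reducePartial(array, f, init) {
  let acc = init;
  for (let i = 0; i < array.length; i++) {
    const {value, done} = f(acc, array[i], i);
    acc = value;
    if (done) break; // early exit, analogous to the iterator protocol's `done`
  }
  return acc;
}

// e.g. sum until the running total exceeds 100:
reducePartial([40, 50, 30, 20],
  (acc, v) => ({value: acc + v, done: acc + v > 100}), 0); // => 120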
On Mon, Feb 23, 2015 at 10:09 AM, Andreas Rossberg <rossberg at google.com> wrote:
I don't see the relation to the OP's problem.
It tends to break traversal by returning a special marker. The same issue exists e.g. with the forEach method, not only reduce. The reduce case seemed to me just a particular instance; that's why I mentioned TCP, which could solve both.
If there are actual limitations in the current state of the spec (and complexities of implementations), it can be just that @@reduced. Though then again you'd have a case like:
// assume bigData is a large array and _data is declared in an outer scope
const $stopTraversal = {};
try {
  bigData.forEach((v) => {
    if (v > 100) {
      $stopTraversal._returnValue = v;
      throw $stopTraversal;
    }
  });
} catch (ex) {
  if (ex === $stopTraversal) {
    _data = $stopTraversal._returnValue;
  }
}
Is this really more convenient than:
bigData.forEach((v) { // note, it's not an arrow, but a TCP block
  if (v > 100) {
    _data = v;
    return; // could be `exit`
  }
});
For the reduce method it could actually return '99+'. That said, if the practical need exists specifically for the reduce function, I'm not against having a "stop traversal" mechanism only there.
Mark S. Miller wrote:
My other suspicion: the previous failure of this proposal came before many people had much hands-on experience using higher-order functions in JS as a normal alternative to control structures. Now that we all have, the need for a non-local escape may be more visceral.
Just in case anyone wants my historical two cents, I don't think this is true. I championed both strawman:block_lambda_revival and strawman:arrow_function_syntax starting well after (spring 2011) the modern (Prototype.js, 2005?) higher-order function revival in JS.
Anyway, however much more momentum there is today compared to four years ago, we still don't have a clear winner. But we've been over this ground. I dug up some more links in a few minutes of site: googling.
Dave Herman proposed return-to-label here: strawman:return_to_label. This led to (among others) esdiscuss.org/topic/march-24-meeting-notes#content-13, where Andreas Rossberg proposed "return from" to address the problem cited in this thread's o.p. His example used forEach, but no matter:
function f(o) {
  o.forEach(function g(x) {
    if (...) return 0 from f;
    let x2 = x.map(function h(y) {
      if (...) return from g
      return y*2 // returns from h
    })
    ...
  })
  return 27
}
Thanks for this additional context, Brendan. The block lambda revival was particularly interesting to read up on. I understand why we went down the arrow function path, but it’s fun to think about what ES6+ might look like had we taken that different path.
I’d like to keep this proposal scoped specifically to reduce because I believe we already have adequate tools for early exit from generic iteration with “for of / break/return", or at the very least “forEach / throw”. Reduce is a special member of the higher-order functions in that most of the other list higher-order functions can be implemented in terms of it, which is why they’re sometimes called “folds” or “reducers”.
Examples include:

function map(fn, array) {
  return array.reduce((a, v) => a.concat([fn(v)]), []);
}

function filter(fn, array) {
  return array.reduce((a, v) => fn(v) ? a.concat([v]) : a, []);
}
However, there are a bunch of higher-order functions which can’t be implemented in terms of reduce without the ability to exit early: “takeN”, “takeWhile”, and “takeUntil” are a few good examples of this case.
Could you implement these functions without reduce? Of course you can. But it would be nice to be able to leverage the reduce function for this purpose, especially when using JavaScript in a pure functional way.
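To illustrate (my own sketch, not part of the proposal): takeWhile built on today's reduce is correct, but it must visit every element even after the predicate first fails, which is exactly the work an early exit would save.

function takeWhile(fn, array) {
  let taking = true;
  return array.reduce((a, v) => {
    if (taking && fn(v)) a.push(v);
    else taking = false;
    return a;
  }, []);
}

takeWhile((v) => v < 3, [1, 2, 5, 1, 2]); // [1, 2], but all five elements are visited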
A bit beyond the point of the proposal, but perhaps relevant is that @@reduced could be useful in user-land code as well. Example: Transducers is a newer concept that has a couple user-land libraries out there (here’s one cognitect-labs.github.io/transducers-js/classes/transducers.html) which extends the reducers concept. There’s some minor gymnastics done there to return and detect “reduced” values. Having a realm-shared well known symbol to represent this general concept would very likely be reused in libraries like these.
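The general shape of those gymnastics looks something like this (names here are mine for illustration, not the transducers-js API):

class Reduced {
  constructor(value) { this.value = value; }
}

function reduceWithEarlyExit(f, init, coll) {
  let acc = init;
  for (const v of coll) {
    acc = f(acc, v);
    if (acc instanceof Reduced) return acc.value; // unwrap and stop
  }
  return acc;
}

// e.g. find the first value over 100, stopping as soon as it is seen:
reduceWithEarlyExit((acc, v) => v > 100 ? new Reduced(v) : acc,
  undefined, [12, 7, 104, 23]); // => 104

A well-known symbol would let independent libraries agree on this marker instead of each inventing its own wrapper.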
Since nobody's mentioned it yet, I wanted to add that Array#{some, every, find, findIndex} are the only existing Array methods (I'm sure I missed something though) that take iteration functions with early-exit behavior, and none of them can be implemented with Array#reduce in an efficient fashion.
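For instance (an illustrative sketch of the inefficiency, not code from this thread): some expressed with reduce gives the right answer but cannot stop scanning once a match is found.

function someViaReduce(array, pred) {
  return array.reduce((found, v) => found || pred(v), false);
}

someViaReduce([2, 4, 6, 1e6], (v) => v > 3); // true, but still walks the whole array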
The example code isn't very compelling either; something more real-world would be good
I recently ran across a usage of reduce(..) that could have benefited from an early return. Figured I'd just drop it here for posterity's sake, in case anything ever comes of this idea.
I had an array of arrays (a 2-dimensional array), where each inner array had a list of simple string/number elements (none of them falsy) that could have some repetition within the inner array. I wanted to filter the outer array (the list of inner arrays) based on whether an inner array had any adjacent duplicates. That is, [1,4,2,4] is fine to keep but [2,4,4,5] should be filtered out.
Since reduce(..) conveniently can compare two adjacent elements if you always return the current value, I decided to model the inner check as a reduce(..) that reduces from the original array value to either a false or a truthy value (the last element of the inner array). This reduction result is then how filter(..) decides to keep or discard. The reason an early exit would be nice is that as soon as you run across an adjacent duplicate, no more reduction is necessary -- you can immediately "reduce" to false.
Here's how I did it, which worked but which was slightly less appealing:
var outer = [
  // [1,2,1,3,4,2],
  // ["foo","bar","bar",10,"foo"],
  // ..
];

outer = outer.filter(function filterer(inner){
  return inner.reduce(function reducer(prev,current){
    if (prev === false || prev === current) return false;
    return current;
  });
});
The reduction initial-value is omitted, so it's undefined, which never matches any of the inner contents.
The prev === false check is the way I fake the "early exit": once the reduction value is tripped to false, that stays the result for the rest of the reduction.
There's lots of other ways to "slice" that problem, I know. But reduction was attractive except for its lack of early exit.
Kyle Simpson schrieb:
Since reduce(..) conveniently can compare two adjacent elements if you always return the current value, I decided to model the inner check as a reduce(..) that reduces from the original array value to either a false or a truthy value (the last element of the inner array).
Um, that's not exactly what reduction is meant for. The reduce method is designed so that the return values and the accumulator argument do have the same type. In your example, you have somehow mixed an expected boolean result with the item type of the array. This leads to several bugs in your implementation, which doesn't work
- with lists of booleans
- with empty arrays
- with arrays whose last element is falsy
The example code isn't very compelling either; something more real-world would be good
Well, basically all operations that return an absorbing element (en.wikipedia.org/wiki/Absorbing_element) would benefit from this.
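For example (my own illustration of the point): 0 is absorbing for multiplication, so once the accumulator hits 0 every remaining step is wasted work without an early exit.

const xs = [2, 3, 0, 4, 5]; // imagine thousands more elements after the 0
const product = xs.reduce((acc, x) => acc * x, 1); // 0, already known at index 2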
Bergi
Um, that's not exactly what reduction is meant for.
There's lots of different ways reduce(..) gets used in the wild; I can list several entirely distinct but common idioms right off the top of my head. Just because it's not the original intent doesn't mean it's invalid to do so.
To the point of the earlier question I was addressing, I was just giving an example of real code, in a very specific circumstance, where I would have liked early-exit. It was not a broad endorsement of the presented idiom as a general pattern.
The reduce method is designed so that the return values and the accumulator argument do have the same type.
In my observation, there's nothing at all that requires that. This is certainly not the only time that I've made effective use of mixing/toggling types during reduction.
In your example, you have somehow mixed an expected boolean result with the item type of the array.
If by "expected boolean result" you mean what filter(..)
expects to receive, actually it doesn't require a strict boolean. It expects a truthy/falsy value (check the spec). I like coercion. I use it liberally. The values in my inner
arrays were all truthy values (see below) and filter(..)
works perfectly fine receiving such.
This leads to several bugs in your implementation, which doesn't work
None of those are "bugs" in my implementation, because none of those can happen within the constraints of the problem. If you re-read the stated setup for the problem I was solving, you'll see the constraints I'm referring to.
BTW, since you brought it up: for the empty inner array case to be supported (I didn't need it, but...), all I would need to do is inner.reduce( function.., undefined ) (or false if you prefer) if I wanted empty arrays filtered out, or inner.reduce( function.., true ) if I wanted empty arrays preserved. Easy.
all operations that return an absorbing element...would benefit
My false value, triggered on finding an inner that should be filtered out, is conceptually that. From then on in the reduction, all other values are "absorbed" (aka ignored, aka overridden) by the false. :)
On Thu, Mar 26, 2015 at 8:22 PM, Kyle Simpson <getify at gmail.com> wrote:
outer = outer.filter(function filterer(inner){
  return inner.reduce(function reducer(prev,current){
    if (prev === false || prev === current) return false;
    return current;
  });
});
I think you could write that like this:
outer = outer.filter(arr =>
  !arr.some((e, i) =>
    i > 0 && arr[i-1] === e));
Array.prototype.some is designed to produce a boolean result and already has early exit built in.
I think you could write that like this:
outer = outer.filter(arr => !arr.some((e, i) => i > 0 && arr[i-1] === e));
Yes, you are of course correct. What I was doing in the originally cited code was illustrating how reduce(..) by its nature supports the adjacency check, instead of using indexes and manual i-1 type logic.
IOW, I initially wanted to avoid the ugly i-1, and I traded that for the unfortunate lack of early exit necessitating the equally ugly prev === false. :/
I’d like to propose an addition to ES7 to add @@reduced as a way to short-circuit from Array.prototype.reduce.
I’ve written up a polyfill and explanation here: leebyron/ecma-reduced
I would love your feedback on this.
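For readers skimming the thread, a rough sketch of how such a short-circuit might read (the wrapper shape and symbol spelling here are my guesses; the linked repo has the actual proposed semantics and polyfill):

const bigData = [12, 7, 104, 23 /* , ... */];

// Hypothetical: a return value tagged with the well-known symbol would
// end the reduction immediately, yielding the tagged value.
const firstOver100 = bigData.reduce(
  (acc, v) => v > 100 ? { [Symbol.reduced]: v } : acc,
  undefined);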