Generator use-cases

# Brendan Eich (17 years ago)

This is a long-ish case for including generators in ES4 as proposed.
I offered, to several Ecma colleagues, to mail pointers to examples
of how useful the Python-inspired generators in JS1.7 and JS1.8 in
Firefox, and proposed for inclusion in the ES4 standard, are in real- world code. But I figured that folks on es4-discuss at mozilla.org might
like to see these few examples too.

I ported Peter Norvig's Sudoku solver:

norvig.com/sudoku.html

from Python to JS1.8:

bugzilla.mozilla.org/attachment.cgi?id=266577

This code uses not only generators and array comprehensions (e.g. the
cross function), but also generator expressions -- which are sugar
for generator functions immediately applied, and therefore lazy,
unlike array comprehensions. This laziness is important to avoid
using exponential amounts of memory.

My Mozilla colleague Igor Bukanov rewrote this code in more
straightforward JS-functional-programming style in JS1.8 (so using
expression closures, e.g. function add(x,y) x + y; but not
generators) -- see here:

bugzilla.mozilla.org/attachment.cgi?id=266938

Opinions vary on which version is better, but the generator-based one
is significantly shorter, and also faster in SpiderMonkey. And the
main thing is that it lets the code focus on the essentials of the
search algorithm and minimize the bookkeeping, which Peter's Python
code did very well (Python has lighter syntax, unburdened by the C
heritage, but JS can't disown curly braces and parens; other than
that the JS and Python versions are close).

Another example is a static analysis script (one of many by Dave
Mandelin) for Mozilla's "TreeHydra" GCC plugin (developed by Taras
Glek and Dave):

bugzilla.mozilla.org/attachment.cgi?id=311698

Notice the yield usage, and also the array comprehensions returned,
e.g., by rectify_attribute_args.

Rewriting these to use iterators, or ES3-style functional
programming, adds a lot of source boilerplate that again obscures the
essential code, and tends to perform not as well to boot. To pick one
example, here is rewrite from the flatten_chain generator:

function flatten_chain(chain_head) { for (let o = chain_head; o; o = TREE_CHAIN(o)) { yield o; } }

to this roughly equivalent iterator coded using pure ES3:

function flatten_chain(chain_head) { return { next: function () { var o = chain_head; if (o) { chain_head = TREE_CHAIN(chain_head); return o; } throw StopIteration; } } }

Note that this rewrite loses integrity since next is mutable, which
ES3 can't control (the ES4 methods, also in JS1.7 and higher versions
in Firefox 2-3, are DontDelete and ReadOnly). And of course it leaves
out the rest of the generator suite, send/throw/close, which come for
free in the ES4 Generator class instantiated by a call to a function
containing yield.

Here is the expansion of that array-comprehension-returning
flatten_chain caller, rectify_attribute_args, that I mentioned above,
from:

function rectify_attribute_args(tree) { return [ TREE_STRING_POINTER1(TREE_VALUE(a)) for (a in
flatten_chain(tree)) ]; }

to this ES3 code:

function rectify_attribute_args(tree) { var r = []; var i = flatten_chain(tree); for (;;) { var a; try { a = i.next(); } catch (e if e instanceof StopIteration) { break; } r.push(TREE_STRING_POINTER1(TREE_VALUE(a))); } return r; }

Again I omitted the finally clause to call i.close and other bits of
the general generator mechanism. Of course one could specialize the
termination technique and other details to re-optimize, but why
should this be necessary?

The Python-based syntax is subject to criticism for changing the
meaning of a function once yield is used in the body of the function,
but we are hitching wagons to Python and reusing community brain- print and design experience (also giving feedback to simplify future
versions of Python based on our experience, specifically by
eliminating the GeneratorExit exception).

And as far as I know from the experience in Firefox 2 and 3, we've
had no problems with the potential confusion caused by this extension
to function syntax -- it has been painless.

The ES4 iteration protocol is proposed here:

doku.php? id=proposals:iterators_and_generators

The iteration protocol underlies for-in and for-each-in constructs in
loops and comprehensions. It hides iterator-specific implementation
details such as StopIteration, while providing uniform looping syntax
that can be customized to improve (I would say restore) the utility
of the built-in JS for-in syntax.

Generators, besides supporting one level of coroutine suspending and
(re-)calling, are the cheapest way to implement an iterator. Unlike
general coroutines, they do not break functional abstraction by
jumping over multiple (possibly native) stack frames, which would
have to be saved, and then restoring the saved stack frames later.
They're simple, for better and for worse.

In my opinion, and to borrow from Tallyrand (or whoever said it
first), if ES4 has iterators but not generators, it would be
something worse than a crime -- it would be a mistake.

# Kris Zyp (17 years ago)

Generators, besides supporting one level of coroutine suspending and (re-)calling, are the cheapest way to implement an iterator.

I think Neil's inspiring demonstration of pseudo-threading with generators is also worthy of inclusion in your list of generator use cases: www.neilmix.com/2007/02/07/threading-in-javascript-17 It is quite interesting in that single level coroutines can be combined to create multi level coroutines in a form that provides explicit control of creating composable flow or maintaining atomicity. Generators are a pretty powerful and valuable construct, IMO. I certainly hope they stay in ES4 as well. Kris

# Igor Bukanov (17 years ago)

On 29/03/2008, Kris Zyp <kris at sitepen.com> wrote:

I think Neil's inspiring demonstration of pseudo-threading with generators is also worthy of inclusion in your list of generator use cases: www.neilmix.com/2007/02/07/threading-in-javascript-17

That code can be written without generators. In general whenever the code in the examples from the blog does yield, one can replace that with "return function() { the rest of code from the function}". But this would require to replace imperative loops from the examples by recursive functions.

This shows that if one programs in a functional style, then generators are not that useful. But they are valuable if the code uses explicit loops etc. Plus with generators one can assume certain time and space complexity bounds in ES4 which is not the case for functional code in the view of deferred tail call proposal.

, Igor

# Brendan Eich (17 years ago)

On Mar 29, 2008, at 7:40 AM, Igor Bukanov wrote:

On 29/03/2008, Kris Zyp <kris at sitepen.com> wrote:

I think Neil's inspiring demonstration of pseudo-threading with
generators is also worthy of inclusion in your list of generator use cases: www.neilmix.com/2007/02/07/threading-in-javascript-17

That code can be written without generators. In general whenever the code in the examples from the blog does yield, one can replace that with "return function() { the rest of code from the function}". But this would require to replace imperative loops from the examples by recursive functions.

This particular continuation-passing style would also require proper
tail calls

proposals:proper_tail_calls

to be normative, but (news flash) proper tail calls are out of ES4 as
of yesterday's Ecma TC39 meeting, by general (regretful, in Mozilla's
case) agreement.

This shows that if one programs in a functional style, then generators are not that useful. But they are valuable if the code uses explicit loops etc. Plus with generators one can assume certain time and space complexity bounds in ES4 which is not the case for functional code in the view of deferred tail call proposal.

I'm glad you mentioned tail calls. I'd word it more strongly, as
above: tail calls are required if CPS is the preferred style to use
in lieu of generators. Of course my mail showed a non-CPS expansion
of generators into functions and objects, which does not require tail
calls. But it's verbose, sub-optimal for implementations, and error
prone compared to generators.

# Kris Zyp (17 years ago)

I think Neil's inspiring demonstration of pseudo-threading with generators is also worthy of inclusion in your list of generator use cases: www.neilmix.com/2007/02/07/threading-in-javascript-17

That code can be written without generators. In general whenever the code in the examples from the blog does yield, one can replace that with "return function() { the rest of code from the function}". But this would require to replace imperative loops from the examples by recursive functions.

Of course, but with any flow control (loops, branches, etc.), CPS in JS can quickly turn into a ugly and painful exercise. Generators don't provide any fundamentally higher level of computability than lambda gave us, but they are powerful construct with value. My point was simply to add to the list of use cases for generators and throw in my vote for them. I didn't mean to detract from Brendan's list. Thanks, Kris

# Jason Orendorff (17 years ago)

On Fri, Mar 28, 2008 at 10:32 PM, Brendan Eich <brendan at mozilla.org> wrote:

This is a long-ish case for including generators in ES4 as proposed. [...]

Here's a brief case: generators let you factor a complex loop, dividing the value-producing part from the value-consuming part. Neither part has to be transformed in a non-obvious way. With iterators alone, you do this by rewriting the value-producing part as an Iterator class. This obfuscates the iterator implementation.

It's hardly news that abstracting from sequences is useful. Various languages have had iterators, streams, and lazy lists as core, essential features (with syntactic support) for decades now. For stateful languages, like Python and C#, iterators and generators fill this gap better than anything else I've seen.

Adobe's position paper commented about generators not being that useful, because they're not "deep", like full coroutines. Well, some people get excited about the coroutine-like uses for generators. For me, generators are just the language support you need to use iterators in your code as the powerful abstraction they ought to be.

# Brendan Eich (17 years ago)

On Mar 30, 2008, at 10:58 AM, Jason Orendorff wrote:

On Fri, Mar 28, 2008 at 10:32 PM, Brendan Eich
<brendan at mozilla.org> wrote:

This is a long-ish case for including generators in ES4 as proposed. [...]

Here's a brief case: generators let you factor a complex loop, dividing the value-producing part from the value-consuming part. Neither part has to be transformed in a non-obvious way. With iterators alone, you do this by rewriting the value-producing part as an Iterator class. This obfuscates the iterator implementation.

Thanks, this is nice and short. I wanted to give example links and
excerpts for those new to generators, but what you wrote here is a
great summary and intro.

BTW, no single nominal Iterator class or interface is required, of
course -- all you need is an object 'like IteratorType'.

It's hardly news that abstracting from sequences is useful. Various languages have had iterators, streams, and lazy lists as core, essential features (with syntactic support) for decades now. For stateful languages, like Python and C#, iterators and generators fill this gap better than anything else I've seen.

Agreed.

Adobe's position paper commented about generators not being that useful, because they're not "deep", like full coroutines.

Not to worry, Adobe is on board:

spreadsheets.google.com/pub?key=pFIHldY_CkszsFxMkQOReAQ&gid=2

(I hope this is publicly readable -- it should be).

# Neil Mix (17 years ago)

On Mar 30, 2008, at 12:58 PM, Jason Orendorff wrote:

Here's a brief case: generators let you factor a complex loop, dividing the value-producing part from the value-consuming part. Neither part has to be transformed in a non-obvious way. With iterators alone, you do this by rewriting the value-producing part as an Iterator class. This obfuscates the iterator implementation.

Here's a concrete example of this: fetching and aggregating rows from
a database. You want to fetch rows from large queries in bulk (say
10000 at a time) for perf reasons, but you want to hide that from your
consumer which wants to see things one row at a time.

I recently wrote some python code that aggregated and integrated two
data sources where performance was critical. I had two SQL queries
ordered by id (multiple rows per id). Needed to aggregate each query
by id and merge the results when the same id was in both results.
Datasets were way to large to fit in-mem, and it was too slow to let
the DB do the work spooling to disk. So I created a generator for
each query that fetched forward in bulk and aggregated by id, and a
third generator that pulled records from the two query generators and
merged as necessary. The top-level code was then able to iterate
simply one record at a time by pulling from a generator, oblivious to
all the buffering, aggregating, and merging going on beneath. Elegant
and easy. Had I not had generators at my disposal I might still be
writing that script. ;)

Krys mentioned the "advanced" (some would call it "abusive") mock- coroutining on my blog. Setting that aside as perhaps an over- indulgence in generator goodness, I'm a big fan of generators for very
practical use-cases like the one I mentioned above. Especially for a
language that lacks blocking IO facilities in most implementations,
plus no concurrency, I'm certain generators would find a host of real- world practical uses. It would be a shame if they didn't make it into
ES4.