That First Next Argument

# Kevin Smith (10 years ago)

Background:

esdiscuss.org/topic/next-yo-in-newborn-generators, esdiscuss.org/topic/april-8-2014-meeting-notes

It appears that the current state of affairs is that the argument supplied to the first call of next on a newborn generator is "ignored and inaccessible".

Clearly, this means that there are some iterators which cannot be expressed as a generator (namely, any iterator that echoes back its next arguments). It seems like there should be parity here.

More concretely, the fact that information can be passed into generators means that they can be used to create data sinks. Since that first input is inaccessible, however, this use case is made more awkward than it needs to be; the consumer has to artificially "pump" the generator to get past that first (useless) next.

Is there any way that the generator function can have access to that lost data?
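
To make the "pumping" concrete, here is a minimal sketch of a generator used as a data sink. The names collector, received and sink are made up for illustration; the point is only that the value passed to the first next never reaches the generator body.

```javascript
// Hypothetical data sink: stores every value it is sent.
function* collector(out) {
    while (true) {
        const item = yield; // receives the argument of the *following* next()
        out.push(item);
    }
}

const received = [];
const sink = collector(received);

sink.next('a'); // only runs the body up to the first `yield`; 'a' is dropped
sink.next('b'); // 'b' is delivered to the paused `yield`
sink.next('c'); // 'c' is delivered likewise

console.log(received); // ['b', 'c']
```

A consumer that wants 'a' to arrive has to call sink.next() once with no argument beforehand.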

# Claude Pache (10 years ago)

This can be worked around. Basically, ask the generator to advance to the first yield at instantiation, and retrieve the value of the "first" next() with that yield. For example:

// the echo generator, dropping the first .next()
function* echo() {
    var x
    while (true) {
        x = yield x
    }
}    

var iter = echo()
iter.next(3) // {value: undefined, done: false}
iter.next(8) // {value: 8, done: false}
iter.next(1) // {value: 1, done: false}


// the same, advancing to the first `yield` at instantiation
class echo2 extends echo {
    construct(...args) {
        let iter = super(...args)
        iter.next()
        return iter
    }
}

var iter = echo2()
iter.next(3) // {value: 3, done: false}
iter.next(8) // {value: 8, done: false}
iter.next(1) // {value: 1, done: false}
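
For readers who want to try the idea in a current engine, the same eager priming can be sketched as a plain wrapper function rather than a subclass. primeEagerly is a hypothetical helper name, not something from this thread, and the echo generator is repeated so the snippet stands alone.

```javascript
// The echo generator from above.
function* echo() {
    let x;
    while (true) {
        x = yield x;
    }
}

// Hypothetical helper: create the generator and advance it to its first
// `yield` immediately, so the caller's first next(v) is actually received.
const primeEagerly = genFn => function (...args) {
    const iter = genFn.apply(this, args);
    iter.next(); // the value yielded here (undefined for echo) is discarded
    return iter;
};

const echo2 = primeEagerly(echo);
const primed = echo2();
const results = [3, 8, 1].map(v => primed.next(v).value);

console.log(results); // [3, 8, 1]
```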

# Kevin Smith (10 years ago)

// the same, advancing to the first `yield` at instantiation
class echo2 extends echo {
    construct(...args) {
        let iter = super(...args)
        iter.next()
        return iter
    }
}

Nice pattern! Would this also work?

var skipFirst = genFn => function*(...args) {
    var iter = genFn(...args);
    iter.next();
    yield * iter;
};

var echo2 = skipFirst(echo);

If we have decorators, then we can write:

@skipFirst
function* echo() { /*_*/ }

which is fairly pleasant.

Still, it seems like we're papering over a hole. In principle, why shouldn't we be able to access the first next argument?

# Kevin Smith (10 years ago)

Originally this came up while studying these slides on async generators:

docs.google.com/file/d/0B4PVbLpUIdzoMDR5dWstRllXblU/edit

In this design, Observable.observe is specified as taking a generator which acts as a data sink. However, because of the "lost next" issue, it appears that newborn generators must be "pumped" before they can be used by that function.

In the following gist, I've implemented readFile and writeFile using async generators. The async/await stuff isn't crucial to the point. I've implemented the functions using both x = yield y and a purely hypothetical next keyword to pull data from the consumer.

gist.github.com/zenparsing/26b200543bb8ae0ca4df

(BTW, I'm not asking for any changes - this is just a theoretical question.)

# Claude Pache (10 years ago)

On 19 August 2014, at 15:21, Kevin Smith <zenparsing at gmail.com> wrote:

Nice pattern! Would this also work?

Sadly, it doesn't work, because the delegated iterator always receives undefined for its first iteration. :-( See 1, last algorithm of the section, step 5.
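
The effect can be observed directly. The sketch below repeats echo and skipFirst so it stands alone; the variable names wrapped and values are illustrative only. Because yield* makes its first call to the inner iterator with undefined, the value given to the outer generator's first next is still lost.

```javascript
function* echo() {
    let x;
    while (true) {
        x = yield x;
    }
}

const skipFirst = genFn => function* (...args) {
    const iter = genFn(...args);
    iter.next();  // advance the inner generator to its first `yield`
    yield* iter;  // the first delegated call is iter.next(undefined)
};

const wrapped = skipFirst(echo)();
const values = [3, 8, 1].map(v => wrapped.next(v).value);

console.log(values); // [undefined, 8, 1]
```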

# Kevin Smith (10 years ago)

Ah, thanks.

In your class-based solution, is "construct" supposed to be "constructor"?

# Claude Pache (10 years ago)

Yes, sure. I was negatively influenced by some language where the constructor is named "__construct".

# Andy Wingo (10 years ago)

On Tue 19 Aug 2014 08:48, Claude Pache <claude.pache at gmail.com> writes:

This can be worked around. Basically, ask the generator to advance to the first yield at instantiation, and retrieve the value of the "first" next() with that yield. For example:

This has the disadvantage of starting computation in the generator, of course, before it has been asked for.

While this workaround might serve the purpose of "observables", for lazy sequences it's not quite right. It effectively turns an "even" sequence into an "odd" one (in the sense of Wadler's paper "How to add laziness to a strict language, without even being odd"), as for iterables you would have to stash the first return value away somewhere, so you end up with iterators being a pair whose car is strict and whose cdr is lazy.
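
A small sketch of both effects, under assumed names (numbers, log and primeEagerly are hypothetical): priming runs the generator body at creation time, and the first yielded value is consumed by the priming call unless it is stashed somewhere.

```javascript
const log = [];

// A lazy sequence whose body does some observable work before yielding.
function* numbers() {
    log.push('body started'); // stands in for expensive setup
    let n = 0;
    while (true) yield n++;
}

// Hypothetical eager-priming wrapper, as in the workaround being discussed.
const primeEagerly = genFn => (...args) => {
    const iter = genFn(...args);
    iter.next();
    return iter;
};

const lazy = numbers();
const logAfterPlainCall = log.length;   // 0: nothing has run yet

const eager = primeEagerly(numbers)();
const logAfterPrimedCall = log.length;  // 1: the body ran during creation

const firstSeen = eager.next().value;   // 1: the value 0 went to the priming call

console.log(logAfterPlainCall, logAfterPrimedCall, firstSeen); // 0 1 1
```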

# Kevin Smith (10 years ago)

This has the disadvantage of starting computation in the generator, of course, before it has been asked for.

Thanks Andy. Creating data sinks (or any general mapping, such as echo) really is awkward using the current x = yield y model.

The most natural solution I was able to come up with uses a closure variable to capture the argument to next (and throw and return) and make it available to the generator:

gist.github.com/zenparsing/26b200543bb8ae0ca4df#file-async-io-4-js

This makes writing the "echo" iterator much more pleasant:

const echo = DataSink(input => function*() {
    while (true) yield input.value;
});

I'm still curious why we need to go through such exercises, though. It seems clear to me that this is a weakness of the current design, and would be easily addressed with syntax. Is there a back-story that I'm not aware of?
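
The DataSink helper itself lives in the gist. As a rough idea of how such a helper might work, here is a sketch written for illustration only: the shape of the input object and the wrapper methods are assumptions, not a copy of the gist's code.

```javascript
// Hypothetical DataSink: `init` receives an `input` holder and returns a
// generator function. Every next/throw/return argument is recorded on
// `input` before the underlying generator is resumed.
function DataSink(init) {
    return function (...args) {
        const input = { value: undefined };
        const iter = init(input).apply(this, args);
        return {
            next(v) { input.value = v; return iter.next(v); },
            throw(e) { input.value = e; return iter.throw(e); },
            return(v) { input.value = v; return iter.return(v); },
            [Symbol.iterator]() { return this; },
        };
    };
}

const echo = DataSink(input => function* () {
    while (true) yield input.value;
});

const sinkIter = echo();
const echoed = [3, 8, 1].map(v => sinkIter.next(v).value);

console.log(echoed); // [3, 8, 1]
```

Because the holder is updated before each resumption, the very first next argument is visible to the body without any priming step.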

# Andy Wingo (10 years ago)

On Wed 20 Aug 2014 16:41, Kevin Smith <zenparsing at gmail.com> writes:

I'm still curious why we need to go through such exercises, though. It seems clear to me that this is a weakness of the current design, and would be easily addressed with syntax. Is there a back-story that I'm not aware of?

No backstory that I'm aware of -- only something that doesn't really fall out from the generators design. There's just no sensible name you could give the value (without getting "creative" with lexical scope), and no continuation waiting to receive it.

# Brendan Eich (10 years ago)

Right. Anyone know whether this has come up as a PEP or suggestion on python-dev?

The meeting notes Kevin cited in the thread root don't mention it, but IIRC we did briefly talk about syntax that could be added in a future edition (AKA next year in a spec, next month in a browser implementation) for receiving that first-next value:

function* gen(a, b, c) first {
   ...
}

Not bikeshedding, some found it ugly or too terse, many wondered about other future syntax vying to go after the parameter list. But the idea seems good.

# Axel Rauschmayer (10 years ago)

Isn’t the problem that a generator function contains the code for both generator object creation and generator behavior? Normally, this is convenient, because both parts can share an environment. But for Kevin’s use case, it becomes a problem.

One possibility may be to implement this as a tool class, separating the concerns “construction” and “behavior”:

class MyGenerator extends CustomGenerator {
    constructor(/* args for generator object creation */) {
        super();
        ...
    }
    * behavior(firstArgOfNext) {
        ...
    }
}

# Kevin Smith (10 years ago)

function* gen(a, b, c) first {
  ...
}

Not bikeshedding, some found it ugly or too terse, many wondered about other future syntax vying to go after the parameter list. But the idea seems good.

Cool. Although this solution isn't going to be that much better, because it will necessitate uncomfortable branching to deal with first iteration vs. subsequent iterations.

On the other hand, something along the lines of:

function *echo() input {
    while (true) yield input.value;
}

doesn't have the same problem.

# Brendan Eich (10 years ago)

So input is bound to the next() actual parameter value on each resumption. That's not bad shed-coloring!

Andy, Dave: WDYT?

# Andy Wingo (10 years ago)

I think changing things right now is the wrong thing to do. Since this is a compatible extension, we don't have to think about this for ES6. Just putting that out there ;)

That said, first thoughts:

  • I assume "input" could then be captured by nested scopes. This would be a bit strange but could work.

  • Really you'd want to be able to destructure too, as you can in every other part of the grammar that introduces names. That's probably not a great thing to do here, though.

  • I assume Kevin meant:

    function *echo input { while (true) yield input }
    

To me it smells, but perhaps that is just my grumpiness ;)

Explicitly being able to place the initial yield would be nicer, but then it's more difficult to ensure that the initial yield comes before other yields.

I say punt ;)

# Allen Wirfs-Brock (10 years ago)

I've been thinking about ways to explicitly place the initial yield. One idea is a reserved keyword sequence starting with yield, such as yield continue to mark the initial yield point. If a generator function does not contain an explicit yield continue, then it is implicit at the start of the function body.

I don't think it is necessary to statically ensure that yield continue comes before any other yield. It would simply be a runtime detected error if it does not.

# Kevin Smith (10 years ago)

So when the generator function is executed (and before next has been called), it will run until the yield continue (or somesuch), instead of stopping at the start of the function body?

That would be great, actually.

# Brendan Eich (10 years ago)

Andy Wingo wrote:

I think changing things right now is the wrong thing to do. Since this is a compatible extension, we don't have to think about this for ES6. Just putting that out there;)

Oh definitely -- es-discuss and twitter folks need to assume ES6 is done. Train model starting, annual editions if we do it right. Any compatible extension can catch next year's train. This means implementation at the agreed upon stage 2, usually much sooner than next year. See

docs.google.com/document/d/1QbEE0BsO4lvl7NFTn5WXWeiEIBfaVUF7Dk0hpPpPDzU/edit

(Allen: is this the canonical location?)

That said, first thoughts:

  • I assume "input" could then be captured by nested scopes. This would be a bit strange but could work.

It should act like a strict parameter binding (no arguments[arguments.length] alias!).

  • Really you'd want to be able to destructure too, as you can in every other part of the grammar that introduces names. That's probably not a great thing to do here, though.

That would require bracketing, or at least a connecting punctuator. Could be done.

  • I assume Kevin meant:

    function *echo input { while (true) yield input }
    

To me it smells, but perhaps that is just my grumpiness;)

(Grumpy enough not to cuddle * with function :-P.)

Why no () after echo for the empty parameter list? That's a surprise.

Yeah, no .value needed (Kevin can confirm).

Explicitly being able to place the initial yield would be nicer, but then it's more difficult to ensure that the initial yield comes before other yields.

Right.

I say punt;)

For ES6, of course. Seems fair es-discuss fodder for future Harmony inclusion.

# Brendan Eich (10 years ago)

Kevin Smith wrote:

So when the generator function is executed (and before next has been called), it will run until the yield continue (or somesuch), instead of stopping at the start of the function body?

That would be great, actually.

Except it's an oxymoron: the continue does not mean no-pause-at-this-yield, and the proposal abuses the keyword otherwise used only with loops.

Also, write your echo generator this way. You have to duplicate code.

# Allen Wirfs-Brock (10 years ago)

On Aug 21, 2014, at 9:54 AM, Brendan Eich wrote:

docs.google.com/document/d/1QbEE0BsO4lvl7NFTn5WXWeiEIBfaVUF7Dk0hpPpPDzU/edit

(Allen: is this the canonical location?)

See tc39/ecma262

the plan is to setup separate sub-repos for proposals starting at stage . They will have their own issue trackers, etc.

# Kevin Smith (10 years ago)

Also, write your echo generator this way. You have to duplicate code.

Ah, of course - thanks for reminding me.

# Claude Pache (10 years ago)

On 20 August 2014, at 10:42, Andy Wingo <wingo at igalia.com> wrote:

This has the disadvantage of starting computation in the generator, of course, before it has been asked for.

Indeed. Here is a version that doesn't start computation early.

const ignoreFirst = genFn => class extends genFn {
    next(x) {
        delete this.next // next calls to `next` will use directly the super method
        super()
        return super(x)
    }
}

var echo = ignoreFirst(function* echo() {
    var x
    while (true) {
        x = yield x
    }
})

var iter = new echo // note: `new` is needed for correct subclassing!
iter.next(4) // {value: 4, done: false}
iter.next(1) // {value: 1, done: false}
iter.next(6) // {value: 6, done: false}
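
The same lazy idea can also be sketched without subclassing, as a wrapper object that primes the inner generator on the first call to next. This is an illustration only: lazyPrime and the primed flag are hypothetical names, and the wrapper shape is an assumption rather than part of the code above.

```javascript
// Hypothetical wrapper: nothing runs until the consumer first calls next(x);
// at that point the generator is advanced to its first `yield` and then
// resumed with x.
const lazyPrime = genFn => function (...args) {
    const iter = genFn.apply(this, args);
    let primed = false;
    return {
        next(x) {
            if (!primed) {
                primed = true;
                iter.next(); // reach the first `yield`; its value is discarded
            }
            return iter.next(x);
        },
        throw(e) { return iter.throw(e); },
        return(v) { return iter.return(v); },
        [Symbol.iterator]() { return this; },
    };
};

const lazyEcho = lazyPrime(function* () {
    let x;
    while (true) {
        x = yield x;
    }
});

const lazyIter = lazyEcho();
const lazyResults = [4, 1, 6].map(v => lazyIter.next(v).value);

console.log(lazyResults); // [4, 1, 6]
```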

# Allen Wirfs-Brock (10 years ago)

On Aug 21, 2014, at 10:42 AM, Allen Wirfs-Brock wrote:

the plan is to setup separate sub-repos for proposals starting at stage . They will have their own issue trackers, etc.

oops, that should be "stage 1" in the sentence above.

# Kevin Smith (10 years ago)

Also, write your echo generator this way. You have to duplicate code.

Ah, of course - thanks for reminding me.

Given something like:

function *echo() input { while (true) yield input }

It seems like what we're trying to do is to emulate a contextual keyword using a weird "outside of the parameter list" parameter. I suppose one ideal future-proof way forward would be to reserve an additional word inside of generators for this potential use in ES7+.

Of course, everyone would have to agree on what that reserved word should be...

# Allen Wirfs-Brock (10 years ago)

On Aug 22, 2014, at 7:22 AM, Kevin Smith wrote:

Of course, everyone would have to agree on what that reserved word should be...

and it could only be reserved in strict mode code.

# Kevin Smith (10 years ago)

and it could only be reserved in strict mode code.

I feel like I'm forgetting something obvious, but why? We already use a parameterized grammar (not parameterized on strictness) for "yield". I would think you'd just be adding to that parameterization.

# Claude Pache (10 years ago)

Another option is to use some creative syntax involving the keyword yield: today, we have already yield and yield* which mean two different things; one could for example say that (yield?) would retrieve the last received value without pausing the generator.

# Allen Wirfs-Brock (10 years ago)

On Aug 22, 2014, at 9:27 AM, Kevin Smith wrote:

I feel like I'm forgetting something obvious, but why? We already use a parameterized grammar (not parameterized on strictness) for "yield". I would think you'd just be adding to that parameterization.

I think you're right. The fact that yield is reserved in ES5 strict mode is only relevant for non-generator functions. We could (now, but not later) reserve another identifier contextually within generator functions. We just did something similar with await within modules.

# Kevin Smith (10 years ago)

I think you're right. The fact that yield is reserved in ES5 strict mode is only relevant for non-generator functions. We could (now, but not later) reserve another identifier contextually within generator functions. We just did something similar with await within modules.

So just hypothetically speaking, what would be a good choice for reserved word? I had input, but that seems like a too-common identifier.

# Brendan Eich (10 years ago)

I think you're rushing things a bit. We don't necessarily want a single name reserved in context. Perhaps there's a better way that doesn't abuse yield or add an unwanted degree of freedom (or three -- don't want to get last yielded-value, that is a GC-leak honey trap; don't want yield-that-must-come-first-temporally-and-be-seen-by-code-reviewers either :-P).

# Kevin Smith (10 years ago)

I think you're rushing things a bit.

That's how I roll when working at the coffee shop! : P

don't want to get last yielded-value, that is a GC-leak honey trap;

Can you elaborate?

# Brendan Eich (10 years ago)

Claude suggested (yield?) or some such to get the last value passed via .next to the now running generator. This would not pause execution. It could be used to get that first-next value (this thread's subject) but it could also be used unwisely all over, and it would require the implementation to keep the last value passed to .next in a pigeon-hole per running generator. Hence GC-leak honey trap.

# Kevin Smith (10 years ago)

Thinking this through... If it's some kind of keyword, then you statically know if and where the current input value is accessed. If it's not used at all, then you don't need to store the value at all. You might also be able to early-release it based on the location of its last appearance before the next yield. Maybe?

# Brendan Eich (10 years ago)

Static analysis is an approximation, skirting the halting problem