Questions/issues regarding generators

# Andreas Rossberg (12 years ago)

We have started investigating the implementation of generators in V8, and a couple of questions popped up that are not quite clear from the proposal (and not yet in the draft spec, AFAICS):

  1. Are the methods of a generator object installed as frozen properties? (I hope so, otherwise it would be more difficult to aggressively optimise generators.)

  2. Is yield* supposed to allow arguments that are not native generator objects?

  3. What happens if a generator function terminates with an exception? According to the proposal, nothing special. That implies that the generator is not closed. What happens when it is resumed afterwards? Moreover, is a StopIteration exception handled specially in this context?

  4. Nit: can we perhaps rename the generator "send" method to "resume"? That is so much more intuitive and suggestive, Python precedence notwithstanding. :)

Apart from these questions, we also see a couple of issues with some aspects of the proposal. My apologies if the specific points below have already been made in earlier discussions (I could not find any mention).

  • The generator/iterable/iterator separation is somewhat incoherent. In particular, it makes no sense that it is a suitable implementation of an .iterator method to just return 'this', as it does for generators. The implicit contract of the .iterator method should be that it returns a fresh iterator, otherwise many abstractions over iterables can't reliably work. As a simple example, consider:

    // zip : (iterable, iterable) -> iterable
    function zip(iterable1, iterable2) {
      let it1 = iterable1.iterator()
      let it2 = iterable2.iterator()
      let result = []
      try {
        while (true) result.push([it1.next(), it2.next()])
      } catch (e) {
        if (isStopIteration(e)) return result
        throw e
      }
    }

You would expect that for any pair of iterables, zip creates an array that pairs the values of both. But is a generator object a proper iterable? No. It has an .iterator method alright, but it does not meet the aforementioned contract! Consider:

    let rangeAsArray = [1, 2, 3, 4]
    let dup = zip(rangeAsArray, rangeAsArray) // [[1,1], [2,2], [3,3], [4,4]]

and contrast with:

    function* enum(from, to) {
      for (let i = from; i <= to; ++i) yield i
    }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

Although a generator supposedly is an iterable, the second zip will fail to produce the desired result and return garbage instead.

The problem boils down to the question whether a generator function should return an iterable or an iterator. The current semantics (inherited from Python) tries to side-step the question by answering: "um, both". But as the example demonstrates, that is not a coherent answer.

The only way to fix this seems to be the following: a call to a generator function should NOT return a generator object directly. Rather, it returns a simple iterable, whose iterator method then constructs an actual generator object -- and multiple calls construct multiple objects. In the common case of the for-of loop, VMs should have no problem optimising away the intermediate object. In the remaining cases, where the result of a generator function is used in a first-class manner, the object actually ensures the right semantics.
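
As a rough sketch of this stratification (the `iterable` wrapper and its name are purely hypothetical, and `.iterator()` is the method name used in this thread; a real implementation would build this into generator functions themselves):

    // Hypothetical wrapper: turn a generator function into a function
    // that returns a plain iterable rather than a generator object.
    function iterable(genFun) {
      return function (...args) {
        return {
          // Every .iterator() call starts a fresh generator instance,
          // so the iterable can be iterated any number of times.
          iterator() { return genFun(...args) }
        }
      }
    }

    let range = iterable(function* (from, to) {
      for (let i = from; i <= to; ++i) yield i
    })

    let r = range(1, 4)
    let dup = zip(r, r) // with the zip above: [[1,1], [2,2], [3,3], [4,4]]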

  • Finally, at the risk of annoying Brendan ;), I think we should (again) revisit the decision to use an exception to mark end-of-iteration. Besides the usual reservations and the problems already discussed in earlier threads, it has some rather ugly implications that I cannot remember being mentioned before:

    • It allows a function called from a generator to fake a regular "return" from its caller (i.e. the generator):

      function f() { throw StopIteration }

      function* g() { ... f(); ... }

      That's a bug, not a feature. Also, the proposal does not say what this does to the generator state (see Q3 above).

    • Worse, the semantics as given in the proposal allows aborting a generator's own return. Not only that, doing this can actually revive a generator that just got closed:

      function* () {
        ...
        try {
          return;   // closes the generator
        } catch (e) {
          yield 5;  // succeeds!
        }
        ...         // generation can continue regularly after this point
      }

      There can hardly be a question that such a state transition from 'closed' back to 'suspended' should not be possible.

    • Old news: exceptions make it harder to optimise generators, especially because the compiler cannot generally know all quasi-regular return points (see above).

In summary, a return statement does not necessarily cause returning, and returning is not necessarily caused by a return statement. That drives the whole notion of the return statement ad absurdum, I think (besides being a pain to implement). The specific points above can probably be fixed by throwing extra language into the spec, but I think they should rather be taken as proof that using exceptions is a questionable path (with potentially more anomalies down the road).

But, in order to (hopefully) let Brandon calm down a bit, I am NOT making yet another proposal for a two-method protocol. Instead I propose simply delivering a sentinel object as end-of-iteration marker instead of throwing one. The zip function above would then be written as:

    function zip(iterable1, iterable2) {
      let it1 = iterable1.iterator()
      let it2 = iterable2.iterator()
      let result = []
      while (true) {
        let x1 = it1.next(), x2 = it2.next()
        if (isStopIteration(x1) || isStopIteration(x2)) return result
        result.push([x1, x2])
      }
    }

AFAICS, this option maintains the advantages of the current approach while being much more well-behaved, and we can perfectly well keep using a StopIteration constructor as in the current proposal. (I fully expect that this option has been discussed before, but I couldn't find any related discussion.)

# Claus Reinke (12 years ago)

But, in order to (hopefully) let Brandon calm down a bit, I am NOT making yet another proposal for a two-method protocol. Instead I propose simply delivering a sentinel object as end-of-iteration marker instead of throwing one. The zip function above would then be written as:

    function zip(iterable1, iterable2) {
      let it1 = iterable1.iterator()
      let it2 = iterable2.iterator()
      let result = []
      while (true) {
        let x1 = it1.next(), x2 = it2.next()
        if (isStopIteration(x1) || isStopIteration(x2)) return result
        result.push([x1, x2])
      }
    }

'it.next()' needs to serve two purposes: yielding an arbitrary object or signaling end of iteration. Using exceptions gives a second channel, separate from objects, but now there is potential for confusion with other exceptions (and using exceptions comes with unwanted cost).

So, really, both the arbitrary object channel and the exception channel are already taken and not freely available for iteration end.

How about lifting the result, to separate yielded objects and end iteration signalling?

    { yields: obj }    // iteration yields obj
    { }                // iteration ends

Then we could use refutable patterns to generate end exceptions

    while (true) {
      let { yields: x1 } = it1.next(),
          { yields: x2 } = it2.next()   // throw if no match
      ...

or test for end without exceptions

    while (true) {
      let x1 = it1.next(), x2 = it2.next()
      if (!x1.yields || !x2.yields) return result
      result.push([x1.yields, x2.yields])
    }

Claus

# Andreas Rossberg (12 years ago)

On 7 March 2013 16:37, Andreas Rossberg <rossberg at google.com> wrote:

But, in order to (hopefully) let Brandon calm down a bit, I am NOT making yet another proposal for a two-method protocol. Instead I propose simply delivering a sentinel object as end-of-iteration marker instead of throwing one.

Forgot to mention one detail: under this approach, it should of course be a runtime error if yield is applied to a value that is a StopIteration object.

# Andreas Rossberg (12 years ago)

On 7 March 2013 17:29, Claus Reinke <claus.reinke at talk21.com> wrote:

How about lifting the result, to separate yielded objects and end iteration signalling?

    { yields: obj }    // iteration yields obj
    { }                // iteration ends

Yes, that would be the proper encoding of an Option/Maybe type, which in the abstract is the ideal (the end object might carry a return value, though).

However, I did not propose that because some around here would probably be unhappy about the extra allocation that is required for every iteration element under this approach.

# Kevin Reid (12 years ago)

On Thu, Mar 7, 2013 at 8:39 AM, Andreas Rossberg <rossberg at google.com> wrote:

On 7 March 2013 16:37, Andreas Rossberg <rossberg at google.com> wrote:

But, in order to (hopefully) let Brandon calm down a bit, I am NOT making yet another proposal for a two-method protocol. Instead I propose simply delivering a sentinel object as end-of-iteration marker instead of throwing one.

Forgot to mention one detail: under this approach, it should of course be a runtime error if yield is applied to a value that is a StopIteration object.

Use of a singleton (or one not marked for the specific generator) sentinel object has a hazard: the sentinel is then a magic value which cannot be safely processed by library code written to operate on arbitrary values, which happens to use generators in its implementation.

(ECMAScript already has moderately hazardous values, namely objects-as-maps which do not implement the 'standard protocol' of Object.prototype methods, but the sentinel is more hazardous than those, in that it is not even safe to pass around without operating on it.)
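
To make the hazard concrete (a sketch, using the sentinel variant proposed above):

    // Generic library code that echoes arbitrary caller-supplied
    // values through a generator:
    function* echo(values) {
      for (let i = 0; i < values.length; i++)
        yield values[i]
    }

    // If some element of `values` happens to be a StopIteration object,
    // the consumer sees a premature end of iteration (or, with the
    // runtime check suggested above, an error) instead of the value.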

# Andreas Rossberg (12 years ago)

On 7 March 2013 17:50, Kevin Reid <kpreid at google.com> wrote:

Use of a singleton (or one not marked for the specific generator) sentinel object has a hazard: the sentinel is then a magic value which cannot be safely processed by library code written to operate on arbitrary values, which happens to use generators in its implementation.

While that is true, it is conceptually no different from using a magic exception value, as under the current proposal. That clobbers use of that value for the exceptional return path in exactly the same way as the proposed alternative does for the regular return path. The only way to avoid both (in a single-function protocol) is the approach Claus mentioned.

# Kevin Reid (12 years ago)

On Thu, Mar 7, 2013 at 8:56 AM, Andreas Rossberg <rossberg at google.com> wrote:

On 7 March 2013 17:50, Kevin Reid <kpreid at google.com> wrote:

Use of a singleton (or one not marked for the specific generator) sentinel object has a hazard: the sentinel is then a magic value which cannot be safely processed by library code written to operate on arbitrary values, which happens to use generators in its implementation.

While that is true, it is conceptually no different from using a magic exception value, as under the current proposal. That clobbers use of that value for the exceptional return path in exactly the same way as the proposed alternative does for the regular return path. The only way to avoid both (in a single-function protocol) is the approach Claus mentioned.

There are conventional expectations about the exceptional return path — interpretations independent of the specific code exiting via it — which there are not about the normal return path.

# Claus Reinke (12 years ago)

How about lifting the result, to separate yielded objects and end iteration signalling?

{ yields: obj } // iteration yields obj {} // iteration ends

Yes, that would be the proper encoding of an Option/Maybe type, which in the abstract is the ideal (the end object might carry a return value, though).

So, more of an Either type, which isn't yet easy to match in ES6.

However, I did not propose that because some around here would probably be unhappy about the extra allocation that is required for every iteration element under this approach.

One of the reasons for avoiding exceptions is to enable optimizations, though, and looking through the call, one might be able to avoid the intermediate allocation for the frequently used path (yield), falling back to extra allocation only for the iteration end. Or allocate the wrapper once, then reuse/fill it on each iteration and overwrite it on iteration end.

Not sure how difficult that (inline and match up construct/deconstruct to avoid intermediate allocation) would be for JS engines... but using exceptions would close most doors for optimization, so it might be more costly overall?

Claus

# Allen Wirfs-Brock (12 years ago)

On Mar 7, 2013, at 7:37 AM, Andreas Rossberg wrote:

We have started investigating the implementation of generators in V8, and a couple of questions popped up that are not quite clear from the proposal (and not yet in the draft spec, AFAICS):

  1. Are the methods of a generator object installed as frozen properties? (I hope so, otherwise it would be more difficult to aggressively optimise generators.)

We discussed the factoring of the generator objects at the Nov 27 meeting (rwldrn/tc39-notes/tree/master/es6/2012-11); there is a sketch of the hierarchy we agreed to in the notes: dl.dropbox.com/u/3531958/tc39/generator-diagram-1.jpg

The design I presented at the meeting is very close to that final one, answers this question, and is easier to read: meetings:proposed_generator_class_hierarcy_nov_2013.png

Basically every generator function is a "constructor" object and has its own unique associated prototype that supplies the methods for its instances. The "generator methods" aren't own properties of generator instances but instead are prototype properties. Generator instances have whatever private state is necessary to support the functioning of those inherited methods but otherwise have no own properties.

I can see a plausible argument for saying all generator instances are created as frozen objects, as this would prevent somebody dynamically inserting own overrides of the prototype-provided methods.
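
A sketch of the shape this describes, using the method names that eventually shipped ('next' standing in for the thread's send/resume discussion; an illustration, not normative spec text):

    function* g() { yield 1 }

    let it1 = g()
    let it2 = g()

    // Instances share the prototype associated with g, and the
    // generator methods are inherited rather than own properties:
    console.log(Object.getPrototypeOf(it1) === g.prototype) // true
    console.log(Object.getPrototypeOf(it2) === g.prototype) // true
    console.log(it1.hasOwnProperty('next'))                 // false
    console.log('next' in it1)                              // true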

  2. Is yield* supposed to allow arguments that are not native generator objects?

  3. What happens if a generator function terminates with an exception? According to the proposal, nothing special. That implies that the generator is not closed. What happens when it is resumed afterwards? Moreover, is a StopIteration exception handled specially in this context?

Good questions. I haven't worked through spec'ing those details yet. Perhaps DHerman or Jason know.

  4. Nit: can we perhaps rename the generator "send" method to "resume"? That is so much more intuitive and suggestive, Python precedence notwithstanding. :)

Apart from these questions, we also see a couple of issues with some aspects of the proposal. My apologies if the specific points below have already been made in earlier discussions (I could not find any mention).

  • The generator/iterable/iterator separation is somewhat incoherent. In particular, it makes no sense that it is a suitable implementation of an .iterator method to just return 'this', as it does for generators. The implicit contract of the .iterator method should be that it returns a fresh iterator, otherwise many abstractions over iterables can't reliably work. As a simple example, consider:

    // zip : (iterable, iterable) -> iterable
    function zip(iterable1, iterable2) {
      let it1 = iterable1.iterator()
      let it2 = iterable2.iterator()
      let result = []
      try {
        while (true) result.push([it1.next(), it2.next()])
      } catch (e) {
        if (isStopIteration(e)) return result
        throw e
      }
    }

You would expect that for any pair of iterables, zip creates an array that pairs the values of both. But is a generator object a proper iterable? No. It has an .iterator method alright, but it does not meet the aforementioned contract! Consider:

    let rangeAsArray = [1, 2, 3, 4]
    let dup = zip(rangeAsArray, rangeAsArray) // [[1,1], [2,2], [3,3], [4,4]]

and contrast with:

    function* enum(from, to) {
      for (let i = from; i <= to; ++i) yield i
    }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

Although a generator supposedly is an iterable, the second zip will fail to produce the desired result and return garbage instead.

I'm not sure I'm convinced by this. An iterator instance represents a single specific iteration. Your second example is really a user bug and should be coded as:

    let dup = zip(enum(1,4), enum(1,4));

Zip's informal contract should state that if iterators are passed as arguments they need to be distinct objects. If you want to implement it defensively, you can add a check for that pre-condition.
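
For example, a defensive zip could check that precondition up front (a minimal sketch):

    function zip(iterable1, iterable2) {
      let it1 = iterable1.iterator()
      let it2 = iterable2.iterator()
      // Precondition from the informal contract: the two iterators
      // must be distinct objects, otherwise their state interleaves.
      if (it1 === it2) throw new TypeError('zip: iterators must be distinct')
      // ... proceed as before ...
    }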

I can see where there might be some utility in a method that tests an iterator instance to determine whether it is in its initial state or not.

I can also see utility in a "clone" method for some iterators (makes a new iterator over the same data but in the initial state), but in the general case we can't assume that this is possible for every iterator.

The problem boils down to the question whether a generator function should return an iterable or an iterator. The current semantics (inherited from Python) tries to side-step the question by answering: "um, both". But as the example demonstrates, that is not a coherent answer.

I think it is pretty clear that the current semantics is that a generator function always returns an iterator. The question is really about the additional protocol that is provided that allows collections and iterators to be used interchangeably.

Perhaps it would be clearer if the @@iterator method were named "asIterator", but that still wouldn't change the client bug in your example.

The only way to fix this seems to be the following: a call to a generator function should NOT return a generator object directly. Rather, it returns a simple iterable, whose iterator method then constructs an actual generator object -- and multiple calls construct multiple objects. In the common case of the for-of loop, VMs should have no problem optimising away the intermediate object. In the remaining cases, where the result of a generator function is used in a first-class manner, the object actually ensures the right semantics.

Refer back to the class hierarchy diagram. This would make the overall iterator/generator object model even more complicated. I don't see enough benefit in doing so.

  • Finally, at the risk of annoying Brendan ;), I think we should (again) revisit the decision to use an exception to mark end-of-iteration. Besides the usual reservations and the problems already discussed in earlier threads, it has some rather ugly implications that I cannot remember being mentioned before: ...

# Brandon Benvie (12 years ago)

On 3/7/2013 7:37 AM, Andreas Rossberg wrote:

But, in order to (hopefully) let Brandon calm down a bit, I am NOT making yet another proposal for a two-method protocol. Instead I propose simply delivering a sentinel object as end-of-iteration marker instead of throwing one. ...

Brendan not Brandon. =D I am impartial due to ignorance on this matter.

# Andreas Rossberg (12 years ago)

On 7 March 2013 18:30, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 7, 2013, at 7:37 AM, Andreas Rossberg wrote:

  1. Are the methods of a generator object installed as frozen properties? (I hope so, otherwise it would be more difficult to aggressively optimise generators.)

We discussed the factoring of the generator objects at the Nov 27 meeting (rwldrn/tc39-notes/tree/master/es6/2012-11); there is a sketch of the hierarchy we agreed to in the notes: dl.dropbox.com/u/3531958/tc39/generator-diagram-1.jpg

The design I presented at the meeting is very close to that final one, answers this question, and is easier to read: meetings:proposed_generator_class_hierarcy_nov_2013.png

Ah, thanks. It seems that I missed that part of the meeting. I'll have a look.

Me:

  • The generator/iterable/iterator separation is somewhat incoherent. In particular, it makes no sense that it is a suitable implementation of an .iterator method to just return 'this', as it does for generators. The implicit contract of the .iterator method should be that it returns a fresh iterator, otherwise many abstractions over iterables can't reliably work. ...

Allen:

I'm not sure I'm convinced by this. An iterator instance represents a single specific iteration. Your second example is really a user bug and should be coded as:

    let dup = zip(enum(1,4), enum(1,4));

Zip's informal contract should state that if iterators are passed as arguments they need to be distinct objects. If you want to implement it defensively, you can add a check for that pre-condition.

I have to disagree here. That is just evading the question of what the contract for .iterator is. Either it is supposed to create new state or it isn't. It's not a very useful contract to say that it can be both, because then you cannot reliably program against it. And from the desire to support collections as iterables it pretty much follows that it should create fresh state (and under that assumption, there is no bug in zip).

The problem boils down to the question whether a generator function should return an iterable or an iterator. The current semantics (inherited from Python) tries to side-step the question by answering: "um, both". But as the example demonstrates, that is not a coherent answer.

I think it is pretty clear that the current semantics is that a generator function always returns an iterator.

Well, then it should not have an .iterator method, and for-of should not invoke that method. But clearly, that would make it harder to use generators with user-defined iteration abstractions.

But your comment about cloning actually made me think about another solution: you can make a generator object truly both an iterator and an iterable by defining its .iterator method to be a clone method. Importantly, in the normal use case (for-of), that cloning would not actually be observable, and could easily be avoided by an implementation.
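
A sketch of that idea, with a hypothetical wrapper standing in for engine support (the real mechanism would be built into generator objects):

    // Hypothetical: give each generator object an .iterator() method
    // that returns a fresh "clone" restarted from the beginning, by
    // remembering the original call.
    function cloneable(genFun) {
      return function (...args) {
        function make() {
          let g = genFun(...args) // a fresh generator instance
          g.iterator = make       // .iterator() clones from the start
          return g
        }
        return make()
      }
    }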

# Allen Wirfs-Brock (12 years ago)

On Mar 7, 2013, at 11:05 AM, Andreas Rossberg wrote:


Allen:

I'm not sure I'm convinced by this. An iterator instance represents a single specific iteration. Your second example is really a user bug and should be coded as:

    let dup = zip(enum(1,4), enum(1,4));

Zip's informal contract should state that if iterators are passed as arguments they need to be distinct objects. If you want to implement it defensively, you can add a check for that pre-condition.

I have to disagree here. That is just evading the question of what the contract for .iterator is. Either it is supposed to create new state or it isn't. It's not a very useful contract to say that it can be both, because then you cannot reliably program against it. And from the desire to support collections as iterables it pretty much follows that it should create fresh state (and under that assumption, there is no bug in zip).

I think the meaning of @@iterator would be clearer if it were named asIterator. obj.asIterator()/obj[@@iterator]() is a request to provide an iterator, which may be either obj itself or a new object that provides default iteration behavior for obj.

The problem boils down to the question whether a generator function should return an iterable or an iterator. The current semantics (inherited from Python) tries to side-step the question by answering: "um, both". But as the example demonstrates, that is not a coherent answer.

I think it is pretty clear that the current semantics is that a generator function always returns an iterator.

Well, then it should not have an .iterator method, and for-of should not invoke that method. But clearly, that would make it harder to use generators with user-defined iteration abstractions.

again, think of it as asIterator... it is a coercion method, not a factory.

But your comment about cloning actually made me think about another solution: you can make a generator object truly both an iterator and an iterable by defining its .iterator method to be a clone method. Importantly, in the normal use case (for-of), that cloning would not actually be observable, and could easily be avoided by an implementation.

Or, define @@iterator such that it returns this only on its first call, and any subsequent @@iterator calls to that object do a clone (if possible). The place this would fall down is when an iterator object is passed as a parameter down through a chain of calls that do a @@iterator coercion at each step of the chain.

Allen

# Erik Arvidsson (12 years ago)

One more thing worth pointing out is that a singleton does not work, because "return e" in generators needs to signal an end to iteration with a value. The wiki does this by creating a new object with a "value" property and [[Class]] set to "StopIteration".

Instead of returning a singleton we can return an object with two fields, {value: v, done: b}, and when we see a "return e" in a generator this becomes {value: e, done: true}.

gist.github.com/anonymous/5108939
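
With that result shape, the zip example needs neither exception handling nor a reserved sentinel value; a sketch, keeping the thread's `.iterator()` method name:

    function zip(iterable1, iterable2) {
      let it1 = iterable1.iterator()
      let it2 = iterable2.iterator()
      let result = []
      while (true) {
        let r1 = it1.next(), r2 = it2.next()
        // End of iteration is an ordinary result with done: true.
        if (r1.done || r2.done) return result
        result.push([r1.value, r2.value])
      }
    }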


-- erik

# Andreas Rossberg (12 years ago)

On 7 March 2013 22:30, Erik Arvidsson <erik.arvidsson at gmail.com> wrote:

One more thing worth pointing out is that a singleton does not work, because "return e" in generators needs to signal an end to iteration with a value. The wiki does this by creating a new object with a "value" property and [[Class]] set to "StopIteration".

Yes, to clarify, I did not talk about using a singleton, but about an instance of the StopIteration class, just like in the wiki.

Instead of returning a singleton we can return an object with two fields, {value: v, done: b} and when we see a "return e" in a generator this becomes {value: e, done: true}.

Sure, if there is hope for getting consensus on an Either-type-like approach, then I would definitely prefer that over what I suggested!

# Andreas Rossberg (12 years ago)

On 7 March 2013 21:58, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

I think the meaning of @@iterator would be clearer if it were named asIterator. obj.asIterator()/obj[@@iterator]() is a request to provide an iterator, which may be either obj itself or a new object that provides default iteration behavior for obj.

I don't think that renaming solves anything; you still want a useful contract. (And I would actually have issues with the name asIterator for something that is a factory in all normal cases.)

Or, define @@iterator such that it returns this only on its first call, and any subsequent @@iterator calls to that object do a clone (if possible). The place this would fall down is when an iterator object is passed as a parameter down through a chain of calls that do a @@iterator coercion at each step of the chain.

Hm, what would be the advantage of that? The semantic fragility it implies is even worse than for the status quo.

# Jason Orendorff (12 years ago)

On Thu, Mar 7, 2013 at 1:05 PM, Andreas Rossberg <rossberg at google.com> wrote:

On 7 March 2013 18:30, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

Zip's informal contract should state that if iterators are passed as arguments they need to be distinct objects. If you want to implement it defensively, you can add a check for that pre-condition.

I have to disagree here. That is just evading the question of what the contract for .iterator is. Either it is supposed to create new state or it isn't. It's not a very useful contract to say that it can be both, because then you cannot reliably program against it.

In Python, the contract definitely says that it can be both. It's the only practical choice. For collections, you want new state. But you also want things such as generators, database cursors, and file descriptors to be iterable:

with open(filename, 'r') as f:
    for line in f:
        handle_input(line)

and you definitely don't want new state here, because what would that even mean? A read position is kind of inherent to a file descriptor, right?

When you call zip() in Python, you expect that each argument will be iterated. I mean, it could hardly be otherwise. So if you've got an argument that can only be consumed once (either something like a file, or an arbitrary iterable you don't know about), you don't pass it twice; and you expect each such argument to become useless afterwards, just as if you had used it in a for-loop. That's clear enough to code to reliably in practice. It's not all that different from Unix pipes.

# Tab Atkins Jr. (12 years ago)

On Fri, Mar 8, 2013 at 9:23 AM, Jason Orendorff <jason.orendorff at gmail.com> wrote:


When you call zip() in Python, you expect that each argument will be iterated. I mean, it could hardly be otherwise. So if you've got an argument that can only be consumed once (either something like a file, or an arbitrary iterable you don't know about), you don't pass it twice; and you expect each such argument to become useless afterwards, just as if you had used it in a for-loop. That's clear enough to code to reliably in practice. It's not all that different from Unix pipes.

And in Python, the iterator algebra has .tee(), which uses caching to produce multiple copies of a stateful iterator.
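
A caching tee can be sketched for the single-function protocol as well (assuming the `{value, done}` result shape proposed earlier in this thread; the helper itself is hypothetical):

    // Split one single-use iterator into n independent ones by
    // caching every value pulled from the underlying iterator.
    function tee(it, n = 2) {
      let cache = [] // values pulled from `it` so far
      let done = false
      function pull(i) {
        if (i < cache.length) return { value: cache[i], done: false }
        if (done) return { value: undefined, done: true }
        let r = it.next()
        if (r.done) { done = true; return { value: undefined, done: true } }
        cache.push(r.value)
        return { value: r.value, done: false }
      }
      return Array.from({ length: n }, () => {
        let i = 0
        return { next: () => pull(i++) }
      })
    }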

# Dmitry Lomov (12 years ago)

Coming back to the original example

    let rangeAsArray = [1, 2, 3, 4]
    let dup = zip(rangeAsArray, rangeAsArray) // [[1,1], [2,2], [3,3], [4,4]]

vs.

    function* enum(from, to) {
      for (let i = from; i <= to; ++i) yield i
    }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

Clearly, as Allen stated, under the current proposal the second call is a user error. Under the current proposal, iterables are actually iterators and have internal state. However, the consequence of this design is that user code should never treat iterator() as a factory method, only as a coercion method, and should generally assume that it can call iterator() on an object only once.

This semantics is sound and consistent, but there is a problem: by that logic, the first call 'zip(rangeAsArray, rangeAsArray)' also has all the appearances of a user error! It requires careful analysis and thinking to convince oneself that it is indeed correct. Well, maybe not in a simple case where the initializer of rangeAsArray is an array literal, but as soon as the initializer is more complicated - say, an external function - you can never be sure.

If we assume this semantics, we generally cannot iterate collections, such as arrays, more than once. Note that the failure that occurs when the user switches from an array to a generator is really hard to notice: the zip function does not break immediately or throw an exception, it just produces nonsensical results.

If we change the semantics so that iterator() returns a fresh iterator every time it is called, then these problems will be avoided. But what happens with, say, the 'open' example? The way to do it would be for 'open' to return an iterable that only opens a file when its iterator() method is called. Therefore, for example, the zip operation would work on files, too:

    var f = open(filename, 'r')
    var zippedFile = zip(f, f)

Calls to f.iterator() inside zip would open the file twice and iterate the contents, whereas:

    for (l of open(filename, 'r')) ...

will continue to work. The read position will be inherent to the iterator (as returned by the iterator() method), not to the iterable that 'open' returns. That iterator can only be consumed once, but the iterable can be reused time and again, by calling the iterator() method on it - just like an array. By adopting this approach, user code treating in-memory collections and other generated sequences unifies very nicely.
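
A sketch of an open written in that style (openFileSync and readLineSync are hypothetical stand-ins for some synchronous file API, and the `{value, done}` result shape from earlier in the thread is assumed; the point is only where the read state lives):

    function open(filename) {
      return {
        // The read position lives in the iterator, not in the object
        // that open() returns, so the file can be iterated repeatedly.
        iterator() {
          let fd = openFileSync(filename) // hypothetical
          return {
            next() {
              let line = readLineSync(fd) // hypothetical
              return line === null
                ? { value: undefined, done: true }
                : { value: line, done: false }
            }
          }
        }
      }
    }

    let f = open('data.txt')
    let zipped = zip(f, f) // opens the file twice, as described above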

As another example, consider the 'tee()' operator that Tab proposes. In the iterator-only world, it is unclear what tee returns. It returns an iterator, and an iterator can only be iterated once (since in the iterator-only world the user generally has to assume that iterator() is a coercion method), so the whole notion of 'caching' does not make sense. Now in the iterable-and-iterator world, tee would take an iterator (which it would then iterate to the end) and produce an iterable, and that iterable would iterate cached values from the iterator over and over again.

To summarize, while treating 'iterator()' as a coercion method is a consistent choice, it makes operations over collections unnecessarily distinct from operations over generators. Implementing iterator() as a factory method unifies those operations while still supporting all other scenarios for iterators.

Kind regards, Dmitry

P.S. One nice way to unify iterables and iterators from the user perspective is Andreas' proposal to make the iterator() method return a "clone" of the iterator that is started from the beginning.

# Kevin Gadd (12 years ago)

Another option for scenarios like open(), where it is not cheap to create multiple distinct iterators (each starting from the beginning), or for scenarios where it is impossible (like a network stream), would be to only expose an iterator in those instances, not an iterable. Exposing an iterator would make it clear that your only available operations are those valid on an iterator: moving forward and getting values. I guess in scenarios like a network stream you could also just throw when a consumer asks for a second iterator; painful, but probably not the end of the world if you document it.

I should note that the C# compiler's transform for functions that produce iterables can also produce an iterator directly if you ask it to; a feature like that would make this pattern easier but it'd probably also complicate syntax and usage significantly. The iterables that it produces close over the argument values and the iterators contain the actual enumeration state, thus with the way the C# compiler implements iterable functions, your example (zip over the result of rangeAsGenerator twice) would work fine. I can understand if this functionality is viewed as an undesirable complication, but it does feel illogical to me for the iterable produced by a function* to actually just be a single-use iterator under the covers. If that's what it is, it should return an iterator.

# Jason Orendorff (12 years ago)

On Mon, Mar 11, 2013 at 2:48 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:

Another option for scenarios like open() where it is not cheap to create multiple distinct iterators (each starting from the beginning), or for scenarios where it's impossible (like a network stream) would be to only expose an iterator in those instances, not an iterable.

But then how would you use it?

If you could not use a for-of loop on it, or pass it to functions like zip(), that is throwing the baby out with the bathwater.

# Dmitry Lomov (12 years ago)

On Mar 12, 2013 6:56 AM, "Jason Orendorff" <jason.orendorff at gmail.com> wrote:

On Mon, Mar 11, 2013 at 2:48 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:

Another option for scenarios like open() where it is not cheap to create multiple distinct iterators (each starting from the beginning), or for scenarios where it's impossible (like a network stream) would be to only expose an iterator in those instances, not an iterable.

But then how would you use it?

If you could not use a for-of loop on it, or pass it to functions like zip(), that is throwing the baby out with the bathwater.

At the risk of repeating myself, the 'open()' scenario is handled perfectly well with the iterable (see my example). Examples where things truly cannot be reiterated (I am not sure why a network stream is such an example - the network connection can always be opened twice) are rare. One possibility would be to throw on the second call to iterator().

# Jason Orendorff (12 years ago)

On Mon, Mar 11, 2013 at 8:57 AM, Dmitry Lomov <dslomov at google.com> wrote:

This semantics is sound and consistent, but there is a problem: by that logic, the first call 'zip(rangeAsArray, rangeAsArray)' also has all the appearances of a user error! It requires careful analysis and thinking to convince oneself that it is indeed correct. Well, maybe not in a simple case where the initializer of rangeAsArray is an array literal, but as soon as the initializer is more complicated - say, an external function - you can never be sure.

But you could argue the same way for literally any other operation on an object. 'rangeAsArray.length', for example, would also be nonsensical if rangeAsArray turns out to be some other sort of object and not an array after all.

If we are to presume that this particular kind of bug will be common in JS, why isn't it common in Python?

Note that generators return coroutine objects with methods other than just the iteration-related methods. The coroutine state, to my mind, is inherent to the returned object.

# Jason Orendorff (12 years ago)

On Tue, Mar 12, 2013 at 1:06 AM, Dmitry Lomov <dslomov at google.com> wrote:

At the risk of repeating myself, the 'open()' scenario is handled perfectly well with the iterable (see my example). Examples where things truly cannot be reiterated (I am not sure why a network stream is such an example - the network connection can always be opened twice) are rare. One possibility would be to throw on the second call to iterator().

Gosh, maybe we are irreconcilable then. Implicitly opening network connections many times seems bad to me. Same for reopening files, actually.

If I'm mistaken about Python and this is actually a common problem there, then I'd reconsider.

# Dmitry Lomov (12 years ago)

(I'll address comments from both your e-mails here)

On Tue, Mar 12, 2013 at 7:56 AM, Jason Orendorff <jason.orendorff at gmail.com> wrote:

On Tue, Mar 12, 2013 at 1:06 AM, Dmitry Lomov <dslomov at google.com> wrote:

At the risk of repeating myself, the 'open()' scenario is handled perfectly well with the iterable (see my example). Examples where things truly cannot be reiterated (I am not sure why a network stream is such an example - the network connection can always be opened twice) are rare. One possibility would be to throw on the second call to iterator().

Gosh, maybe we are irreconcilable then. Implicitly opening network connections many times seems bad to me. Same for reopening files, actually.

I do not think we are irreconcilable. Clearly there is a library design choice here. A designer of a particular library for file/network IO may or may not consider opening a file on 'iterator()' call too implicit. I think it is not too implicit, while you appear to think otherwise.

In the world with Iterables, the library designer can easily disallow iterating the result of open a second time - as I suggested above, if for whatever reason the sequence cannot be re-iterated, the iterator() method can throw on the second call. In that case, an attempt to zip a file with itself will throw when zip calls the iterator method a second time, and that will be an early failure with a clear cause.

However, non-reiterable iterables are a fringe case - maybe 10% of iterators are non-re-iterable even by the standards you suggest (expensive operations on iteration start). [I am being generous here; it seems that all allegedly non-reiterable examples suggested so far have been related to I/O; given that I/O libraries are generally asynchronous in ES, I/O is generally not very amenable to being implemented as iterators, since in general the results of I/O operations are only available in a callback, and not on-demand, as the next() method would require.] My educated guess would be that 90% of iterators/iterables in the wild would be easily re-iterable, as they would be results of operations over collections (such as filter, map, zip and similar). This is the baby that gets thrown out with the bathwater, not the "non-restartable" iterators.

This semantics is sound and consistent, but there is a problem: by that logic, the first call 'zip(rangeAsArray, rangeAsArray)' also has all the appearances of a user error! It requires careful analysis and thinking to convince oneself that it is indeed correct. Well, maybe not in a simple case when the initializer of rangeAsArray is an array literal, but as soon as the initializer is more complicated - say an external function, you can never be sure.

But you could argue the same way for literally any other operation on an object. 'rangeAsArray.length', for example, would also be nonsensical if rangeAsArray turns out to be some other sort of object and not an array after all.

We are not talking here about arbitrary operations on a random object; we are talking about operations mandated by the language and their semantics. In fact, length is not a bad example of a precedent in this space: after ES5,

    for (var i = 0; i < obj.length; i++) console.log(obj[i]);

works great for all "array-like" data structures, including arrays, strings and typed arrays. It would be nice to achieve the same for iterator(), for..of and generators.

Note that generators return coroutine objects with methods other than just the iteration-related methods. The coroutine state, to my mind, is inherent to the returned object.

In the Iterable design, the coroutine state would be inherent to the result of iterator(), i.e. coroutine execution begins once iterator() is called.

If we are to presume that this particular kind of bug will be common in JS, why isn't it common in Python? If I'm mistaken about Python and this is actually a common problem there, then I'd reconsider.

I am not a deep specialist in Python, but my understanding is that the problem there is mitigated by the common practice of writing iterators. Python is class-based, so typically one iterates over the class instance, and the implementation of __iter__ looks like:

    class MyToDoList:
        ...
        def __iter__(self):
            for task in self.tasks:
                if not task.done:
                    yield task
        ...

What happens here is that MyToDoList is actually Iterable in the sense I advocate: every call to MyToDoList.__iter__ returns a fresh iterator. Since Python developers typically wrap their iterators in a class, the iterable/iterator dichotomy is not acute (but search for "python iterators vs iterables" and even "python iterators considered harmful" to see some examples of confused users).

I think that in ES, heavy on functions, people will tend to just use "function*() { ... }" way more often than in Python.

Dmitry

# Brendan Eich (12 years ago)

Consider:

 var i = getSomeIterator();
 var x0 = i.next(), x1 = i.next(), x2 = i.next();
 for (let x of i) {
     ...
 }

Should the loop iterate over x0, x1, and x2? That's what would (have to) happen if i@iterator returned a clone of the iterator i reset to the starting "position".

Of course the iteration protocol we have in Harmony has no notion of position, or memory, or any particular thing that might be needed to replay x0, x1, and x2.

Cloning i at its "current position" (if such a notion exists) has the problem that Andreas objected to in the o.p.

Not cloning i, making iter@iterator === iter as in Python, solves the problem minimally.

I don't see a way to specify iterator cloning as part of the iteration protocol. What am I missing?

# Andreas Rossberg (12 years ago)

On 14 March 2013 22:54, Brendan Eich <brendan at mozilla.com> wrote:

Consider:

var i = getSomeIterator();
var x0 = i.next(), x1 = i.next(), x2 = i.next();
for (let x of i) {
    ...
}

Should the loop iterate over x0, x1, and x2? That's what would (have to) happen if i@iterator returned a clone of the iterator i reset to the starting "position".

I agree this is an unsatisfactory consequence of the generatorObject.iterator = cloning proposal, which was meant as a kind of have-your-cake-and-eat-it-too compromise. It doesn't really achieve that, so I withdraw it.

That leaves my original proposal not to have generator application return an iterator, but only an iterable. Under that proposal, your example requires disambiguation by inserting the intended call(s) to .iterator in the right place(s).

# Brendan Eich (12 years ago)

Andreas Rossberg wrote:

On 14 March 2013 22:54, Brendan Eich <brendan at mozilla.com> wrote:

Consider:

 var i = getSomeIterator();
 var x0 = i.next(), x1 = i.next(), x2 = i.next();
 for (let x of i) {
     ...
 }

Should the loop iterate over x0, x1, and x2? That's what would (have to) happen if i@iterator returned a clone of the iterator i reset to the starting "position".

I agree this is an unsatisfactory consequence of the generatorObject.iterator = cloning proposal, which was meant as a kind of have-your-cake-and-eat-it-too compromise. It doesn't really achieve that, so I withdraw it.

Thanks.

That leaves my original proposal not to have generator application return an iterator, but only an iterable. Under that proposal, your example requires disambiguation by inserting the intended call(s) to .iterator in the right place(s).

That's horribly inconvenient. It takes Dmitry's example:

    function* enum(from, to) {
      for (let i = from; i <= to; ++i) yield i
    }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

which contains a bug under the Harmony proposal, to this:

    function* enum(from, to) {
      for (let i = from; i <= to; ++i) yield i
    }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator@iterator, rangeAsGenerator@iterator)

which while it works, is just silly given JS's mutable object heritage. Programmers will not write this. They will instead write

    function* enum(from, to) {
      for (let i = from; i <= to; ++i) yield i
    }

    let dup = zip(enum(1, 4), enum(1, 4))

which is clearer, shorter, and more truthful and beautiful.

You seem to think iterables are immutable, or something. 'taint so -- JS is not ML! :-P

# Andreas Rossberg (12 years ago)

On 8 March 2013 18:23, Jason Orendorff <jason.orendorff at gmail.com> wrote:

On Thu, Mar 7, 2013 at 1:05 PM, Andreas Rossberg <rossberg at google.com> wrote:

On 7 March 2013 18:30, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

Zip's informal contract should state that if iterators are passed as arguments they need to be distinct objects. If you want to implement it defensively, you can add a check for that pre-condition.

I have to disagree here. That is just evading the question of what the contract for .iterator is. Either it is supposed to create new state or it isn't. It's not a very useful contract to say that it can be both, because then you cannot reliably program against it.

In Python, the contract definitely says that it can be both. It's the only practical choice. For collections, you want new state. But you also want things such as generators, database cursors, and file descriptors to be iterable:

with open(filename, 'r') as f:
    for line in f:
        handle_input(line)

and you definitely don't want new state here, because what would that even mean? A read position is kind of inherent to a file descriptor, right?

A generator is an abstraction that is intended to be invokable many times. Are you saying that there are generators for which you cannot do that?

All I'm suggesting (in my original proposal) is that iterator creation is always performed in the .iterator method. For generators that means that you can create multiple iterators from one generator application, but that would be no different from what you can do anyway by invoking the same generator with the same arguments multiple times.

The stratification I suggest reconciles generators with a clean contractual interpretation of iterables. Among other things, it allows generators to be used in combination both with abstractions over iterators and with abstractions over iterables (which are different beasts!). Under the current semantics, that does not really work.

I can see that the suggestion might look like a complication, but I think it is a fairly minor one, and more importantly, in practice will almost always be confined to abstractions.

# Andreas Rossberg (12 years ago)

On 14 March 2013 23:38, Brendan Eich <brendan at mozilla.com> wrote:

Andreas Rossberg wrote:

That leaves my original proposal not to have generator application return an iterator, but only an iterable. Under that proposal, your example requires disambiguation by inserting the intended call(s) to .iterator in the right place(s).

That's horribly inconvenient. It takes Dmitry's example:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

which contains a bug under the Harmony proposal, to this:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator@iterator, rangeAsGenerator@iterator)

No, why? The zip function invokes the iterator method for you.

See my reply to Jason: I think that in most practical cases (in particular, all abstractions over iterables), the invocation of the iterator method will happen inside an abstraction, and the programmer does not have to worry about it.

which, while it works, is just silly given JS's mutable object heritage. Programmers will not write this. They will instead write

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let dup = zip(enum(1, 4), enum(1, 4))

which is clearer, shorter, and more truthful and beautiful.

And that is perfectly fine under my proposal. :)

You seem to think iterables are immutable, or something. 'taint so -- JS is not ML! :-P

Not sure what that has to do with anything. 8-}

# Brendan Eich (12 years ago)

Andreas Rossberg wrote:

On 14 March 2013 23:38, Brendan Eich<brendan at mozilla.com> wrote:

Andreas Rossberg wrote:

That leaves my original proposal not to have generator application return an iterator, but only an iterable. Under that proposal, your example requires disambiguation by inserting the intended call(s) to .iterator in the right place(s).

That's horribly inconvenient. It takes Dmitry's example:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

which contains a bug under the Harmony proposal, to this:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator@iterator, rangeAsGenerator@iterator)

No, why? The zip function invokes the iterator method for you.

Sure, but only if you know that. I thought you were advocating explicit iterator calls.

A call expression cannot be assumed to return a result that can be consumed by some mutating protocol twice, in general. Why should generator functions be special?

I agree they could be special-cased, but doing so requires an extra allocation (the generator-iterable that's returned).

Meanwhile the Pythonic pattern is well-understood, works fine, and (contra Dmitry's speculation) does not depend on class-y OOP in Python.

I guess it's the season of extra allocations, but still: in general when I consume foo() via something that mutates its return value, I do not expect to be able to treat foo() as referentially transparent. Not in JS!
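
For illustration, the same hazard exists for any call returning mutable state (the makeQueue helper below is made up):

    // A function that returns a fresh mutable value on each call:
    function makeQueue() { return [1, 2, 3]; }

    let q = makeQueue();
    q.shift(); q.shift();  // consuming the single result twice: 1, then 2
    makeQueue().shift();   // 1
    makeQueue().shift();   // 1 again -- fresh state comes from a fresh call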

# Brendan Eich (12 years ago)

Andreas Rossberg wrote:

See my reply to Jason: I think that in most practical cases (in particular, all abstractions over iterables), the invocation of the iterator method will happen inside an abstraction, and the programmer does not have to worry about it.

Talking 1:1 with you after the TC39 meeting, it came out that the ES6 spec does not say that iterators are iterables whose @iterator does "return self". That changes things, but still makes a messier contract than you prefer.

The contract you prefer is one where iterables have @iterator and calling it gets a (mutable) iterator that is not an iterable. It would require, as you proposed, making generators return iterables not iterators -- an extra allocation.

At the level of contract cleanliness and usability, that may be better than the Pythonic convention -- I'm not sure. Cc'ing Jason.

At the level of extra allocations, I still say "boo".

# Jason Orendorff (12 years ago)

On Thu, Mar 14, 2013 at 3:48 PM, Andreas Rossberg <rossberg at google.com>wrote:

On 8 March 2013 18:23, Jason Orendorff <jason.orendorff at gmail.com> wrote:

and you definitely don't want new state here, because what would that even mean? A read position is kind of inherent to a file descriptor, right?

A generator is an abstraction that is intended to be invokable many times. Are you saying that there are generators for which you cannot do that?

Eh? No, I'm saying you generally don't want to restart functions automatically and implicitly.

Currently, if you call a generator, you get a coroutine. I think what you're suggesting would instead make generators return a coroutine factory, and have coroutines implicitly created in many situations. That seems like it might be bad to me. Not all generators are as straightforward as enum. They can have side effects, etc. Implicitly creating extra copies of these things, which are kind of like new threads of execution, sounds potentially awful to me.

Also—if you wanted to use generators in a really coroutine-like way, like task.js does, under your scheme you'd have to explicitly call the @iterator method in order to get the object you want, the one that has .next(), .send(), .throw(), and so on. (Not a showstopper, as it's going to be pretty specialized code that does that.)
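
For example (a sketch assuming the stratified proposal, and using the proposal-era send method that this thread elsewhere proposes renaming):

    function* worker() {
      let msg = yield "ready";
      yield "got " + msg;
    }

    // Under the stratified scheme, one explicit call up front retrieves the
    // actual coroutine object carrying next/send/throw:
    let co = worker().iterator();
    co.next();         // yields "ready"
    co.send("hello");  // resumes with msg = "hello", yields "got hello"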

I can see that the suggestion might look like a complication, but I think it is a fairly minor one, and more importantly, in practice will almost always be confined to abstractions.

I agree that in most use cases, no difference will be observed.

# Andreas Rossberg (12 years ago)

On 15 March 2013 02:33, Jason Orendorff <jason.orendorff at gmail.com> wrote:

On Thu, Mar 14, 2013 at 3:48 PM, Andreas Rossberg <rossberg at google.com> wrote:

On 8 March 2013 18:23, Jason Orendorff <jason.orendorff at gmail.com> wrote:

and you definitely don't want new state here, because what would that even mean? A read position is kind of inherent to a file descriptor, right?

A generator is an abstraction that is intended to be invokable many times. Are you saying that there are generators for which you cannot do that?

Eh? No, I'm saying you generally don't want to restart functions automatically and implicitly.

Hm, where do you see "automatically and implicitly"? The whole point of the proposal is to be more explicit about when a fresh iterator is created, and who expects whom to do that.

Also—if you wanted to use generators in a really coroutine-like way, like task.js does, under your scheme you'd have to explicitly call the @iterator method in order to get the object you want, the one that has .next(), .send(), .throw(), and so on. (Not a showstopper, as it's going to be pretty specialized code that does that.)

Yes, but that would also occur inside the task.js abstractions, wouldn't it?

# Andreas Rossberg (12 years ago)

On 15 March 2013 01:13, Brendan Eich <brendan at mozilla.com> wrote:

Andreas Rossberg wrote:

See my reply to Jason: I think that in most practical cases (in particular, all abstractions over iterables), the invocation of the iterator method will happen inside an abstraction, and the programmer does not have to worry about it.

Talking 1:1 with you after the TC39 meeting, it came out that the ES6 spec does not say that iterators are iterables whose @iterator does "return self". That changes things, but still makes a messier contract than you prefer.

Yes. Iterators have a next method; that's all that makes them an iterator, according to the wiki, and having an iterator method is never mentioned there. The only place where such a case shows up in the proposals is for generator objects.

The contract you prefer is one where iterables have @iterator and calling it gets a (mutable) iterator that is not an iterable. It would require, as you proposed, making generators return iterables not iterators -- an extra allocation.

At the level of contract cleanliness and usability, that may be better than the Pythonic convention -- I'm not sure. Cc'ing Jason.

At the level of extra allocations, I still say "boo".

I'd say that one allocation per loop is perfectly affordable -- and one is likewise required for packaging up the return value. In both cases it is easy to avoid ever materializing the extra object in the common case of a for-loop.

# Dmitry Lomov (12 years ago)

In the ideal (from my point of view) world, no object would have both an iterator() and a next() method together (so Iterator and Iterable would be different entities; the first having internal state, and the second stateless). So your example would be:

    var iterable = getSomeIterable();
    var i = iterable.iterator();
    var x0 = i.next(), x1 = i.next(), x2 = i.next();
    for (let x of iterable) { ... }

and 'for .. of', as is clear from this code, restarts the iteration.

I can imagine a world where 'for .. of' iterates over iterators as well (by calling their 'next()' method directly; the spec would then read "if iterator() exists, use it; otherwise, if next() exists, use it"). In this world, 'for (let x of i)' would continue a started iteration, while 'for (let x of iterable)' would start a new one.
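
A sketch of that dispatch rule (method names as used in this thread; getIteratorFor is a hypothetical name):

    function getIteratorFor(x) {
      if (typeof x.iterator === "function")
        return x.iterator();  // iterable: start a new iteration
      if (typeof x.next === "function")
        return x;             // iterator: continue where it left off
      throw new TypeError("neither iterable nor iterator");
    }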

I can also imagine a world where the iterator changes its nature: once created, it is potentially an iterator and potentially an iterable, but once you call next() on it, you've lost the ability to call iterator() on it. I think that would be the logical implication of cloning semantics. But the more I think of it, the more I feel that this way lies madness -- so I guess you are right, and a sane cloning semantics does not exist.

Dmitry

# Brandon Benvie (12 years ago)

On Mar 14, 2013, at 8:01 PM, Andreas Rossberg <rossberg at google.com> wrote:

Yes. Iterators have a next method; that's all that makes them an iterator, according to the wiki, and having an iterator method is never mentioned there.

It's mentioned on the iterators proposal page as Iterator.prototype.iterator.

# Waldemar Horwat (12 years ago)

On 03/14/2013 04:14 PM, Brendan Eich wrote:

Andreas Rossberg wrote:

On 14 March 2013 23:38, Brendan Eich<brendan at mozilla.com> wrote:

Andreas Rossberg wrote:

That leaves my original proposal not to have generator application return an iterator, but only an iterable. Under that proposal, your example requires disambiguation by inserting the intended call(s) to .iterator in the right place(s).

That's horribly inconvenient. It takes Dmitry's example:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

which contains a bug under the Harmony proposal, to this:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator@iterator, rangeAsGenerator@iterator)

No, why? The zip function invokes the iterator method for you.

Sure, but only if you know that. I thought you were advocating explicit iterator calls.

A call expression cannot be assumed to return a result that can be consumed by some mutating protocol twice, in general. Why should generator functions be special?

I agree they could be special-cased, but doing so requires an extra allocation (the generator-iterable that's returned).

Meanwhile the Pythonic pattern is well-understood, works fine, and (contra Dmitry's speculation) does not depend on class-y OOP in Python.

I guess it's the season of extra allocations, but still: in general when I consume foo() via something that mutates its return value, I do not expect to be able to treat foo() as referentially transparent. Not in JS!

Does for-of accept only iterables, only iterators, or both? Presumably a function like zip would make a similar decision. The problem is if the answer is "both".

Waldemar

# Allen Wirfs-Brock (12 years ago)

On Mar 15, 2013, at 5:55 PM, Waldemar Horwat wrote:

On 03/14/2013 04:14 PM, Brendan Eich wrote:

Andreas Rossberg wrote:

On 14 March 2013 23:38, Brendan Eich<brendan at mozilla.com> wrote:

Andreas Rossberg wrote:

That leaves my original proposal not to have generator application return an iterator, but only an iterable. Under that proposal, your example requires disambiguation by inserting the intended call(s) to .iterator in the right place(s).

That's horribly inconvenient. It takes Dmitry's example:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

which contains a bug under the Harmony proposal, to this:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator@iterator, rangeAsGenerator@iterator)

No, why? The zip function invokes the iterator method for you.

Sure, but only if you know that. I thought you were advocating explicit iterator calls.

A call expression cannot be assumed to return a result that can be consumed by some mutating protocol twice, in general. Why should generator functions be special?

I agree they could be special-cased, but doing so requires an extra allocation (the generator-iterable that's returned).

Meanwhile the Pythonic pattern is well-understood, works fine, and (contra Dmitry's speculation) does not depend on class-y OOP in Python.

I guess it's the season of extra allocations, but still: in general when I consume foo() via something that mutates its return value, I do not expect to be able to treat foo() as referentially transparent. Not in JS!

Does for-of accept only iterables, only iterators, or both? Presumably a function like zip would make a similar decision. The problem is if the answer is "both".

Strictly speaking, for-of only accepts iterables, since it always invokes @@iterator on the value on the RHS of the 'of' keyword. For-of expects to get an iterator back from the @@iterator call.

Using informal interface descriptions this can be described as:

interface <iterable> { @@iterator: () -> <iterator> }

interface <iterator> { next: () -> <nextResult> }

interface <nextResult> { done: <boolean>, value: <any> }

class Array implements <iterable> {...}

class Map implements <iterable> {...}

class Uint32Array implements <iterable> {...}

etc.

class "builtinArrayIterator" implements < iterator >+< iterable > {...}

class "builtinMapIterator" implements < iterator >+< iterable > {...}

The dual nature of the built-in iterators is nice because it allows formulations like:

    for (v of myArray) ...                             // iterate over an array using its default iterator
    for ([k,v] of myArray.entries()) ...               // iterate over an array using an alternative iterator as the iterable
    for (ev of (for (v of myArray) if (v&1==0) v)) ... // the iterable is a generator expression

My understanding of Andreas's concerns is that there is a hazard that somebody might try to invoke @@iterator more than once on an iterator that was also an iterable. This isn't a hazard at all in the context of for-of and the built-ins: for-of never calls @@iterator more than once in that manner. Multiple invocations of @@iterator are only possible using explicit user-written calls. Also, all the built-in iterable collections are specified to return a new iterator instance for each call to @@iterator, so there shouldn't be any concern about accidentally getting the same iterator back from any built-in collection.

If we are concerned about the hazard of users coding multiple @@iterator calls to a single built-in iterator instance, the simplest solution is probably to define @@iterator for them such that they throw if the iterator instance is not in its initial state. User defined iterators are, of course, free to define @@iterator in any manner they wish. They might "clone" the iterator, or throw or whatever they wanted.
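
A minimal sketch of that guard, with a plain iterator method standing in for @@iterator (the factory name and object shape are illustrative, not from the spec draft):

    function makeGuardedArrayIterator(arr) {
      let index = 0, started = false;
      let it = {
        next() {
          started = true;
          return index < arr.length
            ? { done: false, value: arr[index++] }
            : { done: true, value: undefined };
        },
        iterator() {
          // throw unless the iterator is still in its initial state
          if (started) throw new TypeError("iterator already in use");
          return it;
        }
      };
      return it;
    }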

# Andreas Rossberg (12 years ago)

On 16 March 2013 07:34, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 15, 2013, at 5:55 PM, Waldemar Horwat wrote:

On 03/14/2013 04:14 PM, Brendan Eich wrote:

Andreas Rossberg wrote:

On 14 March 2013 23:38, Brendan Eich<brendan at mozilla.com> wrote:

Andreas Rossberg wrote:

That leaves my original proposal not to have generator application return an iterator, but only an iterable. Under that proposal, your example requires disambiguation by inserting the intended call(s) to .iterator in the right place(s).

That's horribly inconvenient. It takes Dmitry's example:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator, rangeAsGenerator) // Oops!

which contains a bug under the Harmony proposal, to this:

    function* enum(from, to) { for (let i = from; i <= to; ++i) yield i }

    let rangeAsGenerator = enum(1, 4)
    let dup = zip(rangeAsGenerator@iterator, rangeAsGenerator@iterator)

No, why? The zip function invokes the iterator method for you.

Sure, but only if you know that. I thought you were advocating explicit iterator calls.

I have to know that, either way, because that is a fundamental part of the function's contract. Its interface can be sketched as

zip : (iterable, iterable) -> iterable

That implies that it takes care of invoking the iterator method of its argument.
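
Under that contract, zip itself can stay lazy and defer all iterator creation to its own .iterator method. A sketch using the done/value protocol from Allen's interface description (not from any draft):

    // zip : (iterable, iterable) -> iterable
    function zip(iterable1, iterable2) {
      return {
        iterator() {
          let it1 = iterable1.iterator(), it2 = iterable2.iterator();
          return {
            next() {
              let r1 = it1.next(), r2 = it2.next();
              return r1.done || r2.done
                ? { done: true, value: undefined }
                : { done: false, value: [r1.value, r2.value] };
            }
          };
        }
      };
    }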

A call expression cannot be assumed to return a result that can be consumed by some mutating protocol twice, in general. Why should generator functions be special?

I agree they could be special-cased, but doing so requires an extra allocation (the generator-iterable that's returned).

Meanwhile the Pythonic pattern is well-understood, works fine, and (contra Dmitry's speculation) does not depend on class-y OOP in Python.

I guess it's the season of extra allocations, but still: in general when I consume foo() via something that mutates its return value, I do not expect to be able to treat foo() as referentially transparent. Not in JS!

Does for-of accept only iterables, only iterators, or both? Presumably a function like zip would make a similar decision. The problem is if the answer is "both".

Strictly speaking, for-of only accepts iterables, since it always invokes @@iterator on the value on the RHS of the 'of' keyword. For-of expects to get an iterator back from the @@iterator call.

Using informal interface descriptions this can be described as:

interface <iterable> { @@iterator: () -> <iterator> }

interface <iterator> { next: () -> <nextResult> }

interface <nextResult> { done: <boolean>, value: <any> }

class Array implements <iterable> {...}
class Map implements <iterable> {...}
class Uint32Array implements <iterable> {...}
etc.

class "builtinArrayIterator" implements <iterator> + <iterable> {...}
class "builtinMapIterator" implements <iterator> + <iterable> {...}

The dual nature of the built-in iterators is nice because it allows formulations like:

    for (v of myArray) ...                             // iterate over an array using its default iterator
    for ([k,v] of myArray.entries()) ...               // iterate over an array using an alternative iterator as the iterable
    for (ev of (for (v of myArray) if (v&1==0) v)) ... // the iterable is a generator expression

None of these examples requires built-in iterators to also be iterables, or generator results to also be iterators. For-of always invokes @@iterator, so it's sufficient here if generator objects and collections are plain iterables.

My understanding of Andreas's concerns is that there is a hazard that somebody might try to invoke @@iterator more than once on an iterator that was also an iterable.

My concern is that I want to be able to invoke @@iterator more than once on any iterable. Otherwise iterable-based abstractions like zip will not work with all (combinations of) iterables. Which would be a real shame.

In other words, iterables should always be proper factories for iterators (at least the ones provided by the language; obviously we cannot guarantee that user-defined ones properly implement that contract).

Generator objects, as currently drafted, don't meet that goal (Python precedence notwithstanding). In general, no object that simultaneously tries to be both an iterator and an iterable (returning itself) sanely can. You don't get a duality but an inconsistency.
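
To see the inconsistency concretely, contrast an iterator whose iterator method returns itself with a proper iterable factory (both objects are illustrative):

    // Both iterator and "iterable": .iterator() returns this, so a second
    // consumer receives a half-consumed iterator.
    let selfStyle = {
      i: 0,
      next() {
        return this.i < 3 ? { done: false, value: this.i++ }
                          : { done: true, value: undefined };
      },
      iterator() { return this; }
    };

    // A proper iterable: every .iterator() call creates fresh state.
    let factoryStyle = {
      iterator() {
        let i = 0;
        return {
          next() {
            return i < 3 ? { done: false, value: i++ }
                         : { done: true, value: undefined };
          }
        };
      }
    };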

# Allen Wirfs-Brock (12 years ago)

On Mar 7, 2013, at 11:05 AM, Andreas Rossberg wrote:

On 7 March 2013 18:30, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 7, 2013, at 7:37 AM, Andreas Rossberg wrote:

  1. Are the methods of a generator object installed as frozen properties? (I hope so, otherwise it would be more difficult to aggressively optimise generators.)

We discussed the factoring of the generator objects at the Nov 27 meeting. In rwldrn/tc39-notes/tree/master/es6/2012-11 there is a sketch of the hierarchy we agreed to in the notes: dl.dropbox.com/u/3531958/tc39/generator-diagram-1.jpg

The design I presented at the meeting is very close to that final one, answers this question, and is easier to read: meetings:proposed_generator_class_hierarcy_nov_2013.png

Ah, thanks. It seems that I missed that part of the meeting. I'll have a look.

I created a more readable class hierarchy diagram based on the above photo from the Nov. meeting. This is what I'm using as my guide for putting generators into the spec. draft. The diagram is at harmony:es6_generator_object_model_3-29-13.png

The most significant change from the meeting (and it really wasn't explicit on the whiteboard) is that generator prototypes don't have a "constructor" property that links back to its generator function instance. In other words, you can't say:

    function * ofCollection() { for (let i of collection) yield i };
    var itr1 = ofCollection();
    var itr2 = itr1.constructor(); // TypeError

Allowing such constructor access seems like an unnecessary (and possibly undesirable) capability when you're just passing generator instances around to be used as iterators. Anybody disagree?

itr1 instanceof ofCollection

still works (subject to the usual instanceof caveats)

# Mark S. Miller (12 years ago)

On Fri, Mar 29, 2013 at 6:21 PM, Allen Wirfs-Brock <allen at wirfs-brock.com>wrote:

On Mar 7, 2013, at 11:05 AM, Andreas Rossberg wrote:

On 7 March 2013 18:30, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 7, 2013, at 7:37 AM, Andreas Rossberg wrote:

  1. Are the methods of a generator object installed as frozen properties? (I hope so, otherwise it would be more difficult to aggressively optimise generators.)

We discussed the factoring of the generator objects at the Nov 27 meeting. In rwldrn/tc39-notes/tree/master/es6/2012-11 there is a sketch of the hierarchy we agreed to in the notes: dl.dropbox.com/u/3531958/tc39/generator-diagram-1.jpg

The design I presented at the meeting is very close to that final one, answers this question, and is easier to read: meetings:proposed_generator_class_hierarcy_nov_2013.png

Ah, thanks. It seems that I missed that part of the meeting. I'll have a look.

I created a more readable class hierarchy diagram based on the above photo from the Nov. meeting. This is what I'm using as my guide for putting generators into the spec. draft. The diagram is at harmony:es6_generator_object_model_3-29-13.png

The most significant change from the meeting (and it really wasn't explicit on the whiteboard) is that generator prototypes don't have a "constructor" property that links back to its generator function instance. In other words, you can't say:

    function * ofCollection() { for (let i of collection) yield i };
    var itr1 = ofCollection();
    var itr2 = itr1.constructor(); // TypeError

That wouldn't be a type error. It would be invoking the value of the inherited generator property.

# Mark S. Miller (12 years ago)

On Sat, Mar 30, 2013 at 6:13 AM, Mark S. Miller <erights at google.com> wrote:

On Fri, Mar 29, 2013 at 6:21 PM, Allen Wirfs-Brock <allen at wirfs-brock.com>wrote:

On Mar 7, 2013, at 11:05 AM, Andreas Rossberg wrote:

On 7 March 2013 18:30, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 7, 2013, at 7:37 AM, Andreas Rossberg wrote:

  1. Are the methods of a generator object installed as frozen properties? (I hope so, otherwise it would be more difficult to aggressively optimise generators.)

We discussed the factoring of the generator objects at the Nov 27 meeting. In rwldrn/tc39-notes/tree/master/es6/2012-11 there is a sketch of the hierarchy we agreed to in the notes: dl.dropbox.com/u/3531958/tc39/generator-diagram-1.jpg

The design I presented at the meeting is very close to that final one, answers this question, and is easier to read: meetings:proposed_generator_class_hierarcy_nov_2013.png

Ah, thanks. It seems that I missed that part of the meeting. I'll have a look.

I created a more readable class hierarchy diagram based on the above photo from the Nov. meeting. This is what I'm using as my guide for putting generators into the spec. draft. The diagram is at harmony:es6_generator_object_model_3-29-13.png

The most significant change from the meeting (and it really wasn't explicit on the whiteboard) is that generator prototypes don't have a "constructor" property that links back to its generator function instance. In other words, you can't say:

    function * ofCollection() { for (let i of collection) yield i };
    var itr1 = ofCollection();
    var itr2 = itr1.constructor(); // TypeError

That wouldn't be a type error. It would be invoking the value of the inherited generator property.

Sorry, "...inherited constructor property" that is.

# André Bargull (12 years ago)

The most significant change from the meeting (and it really wasn't explicit on the whiteboard) is that generator prototypes don't have a "constructor" property that links back to its generator function instance. In other words, you can't say:

    function * ofCollection() { for (let i of collection) yield i };
    var itr1 = ofCollection();
    var itr2 = itr1.constructor(); // TypeError

That wouldn't be a type error. It would be invoking the value of the inherited generator property.

Sorry, "...inherited constructor property" that is.

But the inherited constructor property is not callable, so it's still a TypeError:

    js> function * ofCollection() { for (let i of collection) yield i };
    js> var itr1 = ofCollection();
    js> var itr2 = itr1.constructor();
    uncaught exception: TypeError: object is not a function

-- André

# Allen Wirfs-Brock (12 years ago)

On Mar 30, 2013, at 7:05 AM, André Bargull wrote:

The most significant change from the meeting (and it really wasn't explicit on the whiteboard) is that generator prototypes don't have a "constructor" property that links back to its generator function instance. In other words, you can't say:

    function * ofCollection() { for (let i of collection) yield i };
    var itr1 = ofCollection();
    var itr2 = itr1.constructor(); // TypeError

That wouldn't be a type error. It would be invoking the value of the inherited generator property.

Sorry, "...inherited constructor property" that is.

But the inherited constructor property is not callable, so it's still a TypeError:

Exactly, I had assumed that was obvious from the diagram :-)