iterate and enumerate trap signature inconsistency
On Sun, Sep 2, 2012 at 11:13 AM, David Bruant <bruant.d at gmail.com> wrote:
Hi,
The enumerate (for..in loops) and iterate (for..of loops) traps have inconsistent signatures. The former needs an array of strings to be returned, the latter an iterator.
I would tend to be in favor of both returning an iterator to avoid allocating, filling & freeing memory in case of recurrent for..in/of loops on the object. It's also somewhat more aligned with the idea of a loop: if the loop contains a 'break' statement, in the array case, some memory is left unused, but in the iterator case, the iterator just stops being called if control gets out of the loop.
The difference in signatures is intentional, though I don't know how required it is.
By the way, what's the difference between an iterator and a generator? Would it make sense that both traps return a generator rather than an iterator? Why?
A generator is a special kind of function that, when called, returns an iterator over its own yield values.
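To make the distinction concrete, here is a minimal sketch in the syntax that was later standardized in ES2015 (which postdates this thread, so the exact syntax is an assumption relative to the draft under discussion). The names makeCountingIterator and countingGenerator are hypothetical and only for illustration.

```javascript
// A hand-written iterator: an object whose next() method returns
// { value, done } records. It is also made iterable so for..of accepts it.
function makeCountingIterator(limit) {
  let i = 0;
  return {
    next() {
      return i < limit
        ? { value: i++, done: false }
        : { value: undefined, done: true };
    },
    [Symbol.iterator]() { return this; }
  };
}

// A generator function: calling it returns an iterator over its yielded values.
function* countingGenerator(limit) {
  for (let i = 0; i < limit; i++) {
    yield i;
  }
}

const fromIterator = [...makeCountingIterator(3)];
const fromGenerator = [...countingGenerator(3)];
console.log(fromIterator, fromGenerator);
```

Both produce the same sequence; the generator is simply a more convenient way to write the iterator, which is why a trap specified to return an iterator can be implemented with either.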
On Sep 2, 2012, at 11:13 AM, David Bruant wrote:
Hi,
The enumerate (for..in loops) and iterate (for..of loops) traps have inconsistent signatures. The former needs an array of strings to be returned, the latter an iterator.
The current ES6 spec draft uses internal methods, [[Enumerate]] and [[Iterate]]. The implementations of these methods for proxy objects will call the corresponding handlers (this is not in the draft yet).
Both of the internal methods are spec'ed to return an object that implements the iterator interface (generator instances comply).
If you want the set of enumerated/iterated items to be fixed before the start of a loop, the handler can always create a fixed collection of values and return an iterator over that.
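A rough sketch of that "fixed collection" approach, written as a plain function rather than an actual trap, since the trap signature is exactly what is under discussion here. The helper name snapshotKeysIterator is hypothetical, and using Object.keys to build the snapshot is an assumption.

```javascript
// Hypothetical helper: copy the keys up front, then hand out an iterator
// over the copy. Later changes to the object do not affect the iteration.
function snapshotKeysIterator(obj) {
  const snapshot = Object.keys(obj);     // fixed before the loop starts
  return snapshot[Symbol.iterator]();    // standard array iterator over the copy
}

const source = { a: 1, b: 2 };
const it = snapshotKeysIterator(source);
source.c = 3;                            // added after the snapshot was taken
const seen = [...it];
console.log(seen);
```

The cost is the one David mentions: the whole collection is allocated even if the loop exits early.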
The enumerate() trap predates the TC39 blessing of iterators, but IIRC we did agree to revisit its signature once iterators were in. I think having the enumerate() trap return an iterator is perfectly sensible. There's a catch, however:
As currently specified, direct proxies enforce that the enumerate() trap doesn't return duplicate property names. Now, as Allen hints at, we could make the proxy return its own outer iterator that exhausts the inner iterator returned by the enumerate() trap to insert extra checks. But for the proxy to check for duplicates, it must remember all property names previously produced by the inner iterator. So the main benefit of using an iterator, namely that the entire collection of generated values doesn't need to be retained in memory, seems lost.
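The memory concern can be illustrated with a small sketch of such an outer iterator. This is not the specified algorithm, only an assumption about how a duplicate check could be layered over an inner iterable; the name rejectDuplicates is hypothetical.

```javascript
// Hypothetical outer iterator: forwards names from the inner iterable and
// throws on the first duplicate. Note that `seen` grows with every name
// produced, so the whole set of names ends up retained in memory anyway.
function* rejectDuplicates(inner) {
  const seen = new Set();
  for (const name of inner) {
    if (seen.has(name)) {
      throw new TypeError("duplicate property name: " + name);
    }
    seen.add(name);
    yield name;
  }
}

const unique = [...rejectDuplicates(["a", "b", "c"])];

let duplicateError = null;
try {
  [...rejectDuplicates(["a", "b", "a"])];
} catch (e) {
  duplicateError = e;
}
console.log(unique, duplicateError instanceof TypeError);
```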
On 11/09/2012 21:37, Tom Van Cutsem wrote:
The enumerate() trap predates the TC39 blessing of iterators, but IIRC we did agree to revisit its signature once iterators were in. I think having the enumerate() trap return an iterator is perfectly sensible. There's a catch, however:
As currently specified, direct proxies enforce that the enumerate() trap doesn't return duplicate property names. Now, as Allen hints at, we could make the proxy return its own outer iterator that exhausts the inner iterator returned by the enumerate() trap to insert extra checks. But for the proxy to check for duplicates, it must remember all property names previously produced by the inner iterator. So the main benefit of using an iterator, namely that the entire collection of generated values doesn't need to be retained in memory, seems lost.
I don't remember what the rationale for the duplicate check was. Do we really need it? A lot of invariants are here for the sake of robustness/security (especially non-configurable/non-writable/non-extensible-related invariants), but it doesn't seem duplicate keys will make code less robust or secure.
Unless there are strong reasons to preserve this invariant, it could be dropped, couldn't it?
On Tue, Sep 11, 2012 at 12:50 PM, David Bruant <bruant.d at gmail.com> wrote:
I don't remember what the rationale for the duplicate check was. Do we really need it? A lot of invariants are here for the sake of robustness/security (especially non-configurable/non-writable/non-extensible-related invariants), but it doesn't seem duplicate keys will make code less robust or secure.
Unless there are strong reasons to preserve this invariant, it could be dropped, couldn't it?
It is a tradeoff. Without relying on some guarantees, it is impossible to write correctly defensive code. With every guarantee that is 1) enforced by ES5/strict, 2) typically obeyed in ES6, and 3) convenient to rely on, there likely will be ES5-era code whose correctness relies on these guarantees. When new versions of the language un-enforce a previously enforced guarantee, old correctly defensive code can fail to defend against attacks possible only on the new platform.
OTOH, in the extreme, to not weaken any guarantees is to prevent the language from evolving to provide more power. Both Proxies and Object.observe were carefully designed to strike a good balance between these pressures. For the old enumerate trap design, the cost vs benefit of preserving the no-duplicate guarantee seemed worth it. If we change this API from returning an array of strings to returning an iterator, I agree that alters the balance and justifies waiving this particular guarantee.
2012/9/11 Mark S. Miller <erights at google.com>
[...] If we change this API from returning an array of strings to returning an iterator, I agree that alters the balance and justifies waiving this particular guarantee.
I also don't see any real issues with waiving the duplicate names check. It's unrelated to the usual non-configurability/non-extensibility invariants.
Just to spell it out, that would mean that in the following code:
for (var name in obj) { ... }
where obj is either a proxy or an object with a proxy in its prototype chain, |name| can be bound to the same string multiple times through the loop.
Note that even if we would waive the duplicate property check, the proxy still needs to return a wrapped iterator for the enumerate() trap to:
- coerce each produced value to a String
- check whether all non-configurable enumerable properties of the target have been produced. One way to check this would be to retrieve these properties in a set before the start of the loop, and remove each encountered property from the set. If, after the very last loop iteration, the set is non-empty, the proxy throws a TypeError.
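The two checks listed above could be sketched roughly as follows. This is an illustration under assumptions, not the specified behavior: the name checkedEnumerate is hypothetical, the inner value is assumed to be iterable, and requiredNames stands for the set of non-configurable enumerable properties collected before the loop starts.

```javascript
// Hypothetical wrapper around the iterable returned by an enumerate() trap.
// It coerces each value to a String and tracks which required names are
// still missing; duplicates are allowed.
function* checkedEnumerate(inner, requiredNames) {
  const pending = new Set(requiredNames);  // fixed before the first iteration
  for (const value of inner) {
    const name = String(value);            // coerce each produced value
    pending.delete(name);                  // mark as reported
    yield name;
  }
  // Reached only when the inner iterable is exhausted.
  if (pending.size > 0) {
    throw new TypeError("not enumerated: " + [...pending].join(", "));
  }
}

const coerced = [...checkedEnumerate([1, "x", "x"], ["x"])];

let missingError = null;
try {
  [...checkedEnumerate(["a"], ["b"])];
} catch (e) {
  missingError = e;
}
console.log(coerced, missingError instanceof TypeError);
```

In this sketch, a loop that exits early with 'break' never reaches the final check, which matches the "after the very last loop iteration" wording above. Unlike a duplicate check, the pending set only shrinks as the loop progresses.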
On 12/09/2012 08:19, Tom Van Cutsem wrote:
2012/9/11 Mark S. Miller <erights at google.com>
[...] If we change this API from returning an array of strings to returning an iterator, I agree that alters the balance and justifies waiving this particular guarantee.
I also don't see any real issues with waiving the duplicate names check. It's unrelated to the usual non-configurability/non-extensibility invariants.
Just to spell it out, that would mean that in the following code:
for (var name in obj) { ... }
where obj is either a proxy or an object with a proxy in its prototype chain, |name| can be bound to the same string multiple times through the loop.
There are numerous examples where ES5 assumptions are questioned by proxies, like:
for(var p in obj){
var desc = Object.getOwnPropertyDescriptor(obj, p)
// ...
}
In this snippet, ES5 semantics guarantees that desc will be an object at every iteration. Proxies challenge this assumption (because of the enumerate trap, but also the combination between the enumerate and getOwnPropertyDescriptor trap). But as Mark said, there is a trade-off to be found.
Note that even if we would waive the duplicate property check, the proxy still needs to return a wrapped iterator for the enumerate() trap to:
- coerce each produced value to a String
- check whether all non-configurable enumerable properties of the target have been produced. One way to check this would be to retrieve these properties in a set before the start of the loop, and remove each encountered property from the set. If, after the very last loop iteration, the set is non-empty, the proxy throws a TypeError.
Sounds good and it doesn't sound like there is additional cost against the current design. For the second check, I'd like to note a couple of implementation details which are relevant I think. First, as you suggest, the set of minimum reported properties has to be decided before the first iteration (otherwise, this set may change over the course of the loop). Second, an unanswered question is how this set is being retrieved. Is the JS runtime performing an Object.getOwnPropertyDescriptor on the proxy (calling the trap)? on the target directly? If the target is itself a proxy, on the target at the end of the chain (so that no intermediate trap is being called)? Right now, I have no preference for any of these solutions, but it's a detail that will be visible from user script, so it needs to be decided for the sake of interoperability.
2012/9/12 David Bruant <bruant.d at gmail.com>
[...] For the second check, I'd like to note a couple of implementation details which are relevant I think. First, as you suggest, the set of minimum reported properties has to be decided before the first iteration (otherwise, this set may change over the course of the loop). Second, an unanswered question is how this set is being retrieved. Is the JS runtime performing an Object.getOwnPropertyDescriptor on the proxy (calling the trap)? on the target directly? If the target is itself a proxy, on the target at the end of the chain (so that no intermediate trap is being called)? Right now, I have no preference for any of these solutions, but it's a detail that will be visible from user script, so it needs to be decided for the sake of interoperability.
As currently specified (< harmony:proxies_spec#changes_to_es5_built-in_functions>, algorithm 12.6.4, step 6.k.), the proxy calls the built-in Reflect.keys on its internal [[Target]] to retrieve the target's enumerable properties, then checks for configurability of these properties via [[GetOwnProperty]]. This can all stay the same with the iterator design.
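As a rough user-level approximation of that step, the set of names the proxy must see reported could be computed like this. Object.keys and Object.getOwnPropertyDescriptor are used here as stand-ins for the Reflect.keys and [[GetOwnProperty]] operations named above, which is an assumption; the helper name is hypothetical.

```javascript
// Hypothetical helper: own enumerable properties of the target that are
// also non-configurable, i.e. the names an enumeration must report.
function nonConfigurableEnumerableOwnKeys(target) {
  return Object.keys(target).filter((name) => {
    const desc = Object.getOwnPropertyDescriptor(target, name);
    return desc !== undefined && desc.configurable === false;
  });
}

const target = { loose: 1 };             // configurable, enumerable
Object.defineProperty(target, "fixed", {
  value: 2, enumerable: true, configurable: false
});
Object.defineProperty(target, "hidden", {
  value: 3, enumerable: false, configurable: false
});

const required = nonConfigurableEnumerableOwnKeys(target);
console.log(required);
```

Because the helper operates on the target directly, no trap of the proxy itself is invoked while collecting the set.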
On Wed, Sep 12, 2012 at 1:27 AM, David Bruant <bruant.d at gmail.com> wrote:
[...] There are numerous examples where ES5 assumptions are questioned by proxies, like:
for(var p in obj){
var desc = Object.getOwnPropertyDescriptor(obj, p)
// ...
}
In this snippet, ES5 semantics guarantees that desc will be an object at every iteration. Proxies challenge this assumption (because of the enumerate trap, but also the combination between the enumerate and getOwnPropertyDescriptor trap). But as Mark said, there is a trade-off to be found. [...]
The most important ES5 guarantee waived by proxies concerns when interleaving of code from the accessed object can occur. Assuming Object.getOwnPropertyDescriptor is the original Object.getOwnPropertyDescriptor, in the code above, no code from p can interleave in the above actions. Proxies waive this in many places, including in this example. This has been the biggest newly introduced hazard, and we've tried to make tasteful choices about where this can and cannot occur.
However, given this newly introduced interleaving hazard, your example does not demonstrate a further weakening of invariants. This is the "momentary invariant" vs "eternal invariant" issue explained at < esdiscuss/2011-May/014150>. It is also why Tom's message mentions "non-configurable enumerable [own] properties of the target". The fact that they are enumerated is an eternal invariant.