for-in evaluation order

# Michael Day (13 years ago)

Some quick testing with Firefox seems to show that for-in returns properties in the order they were added to the object, and some scripts on the web (eg. RaphaelJS) seem to depend on that, unless I'm horrendously misinterpreting them. This seems... bad?

Are objects implicitly ordered, by implementation convention?

# Boris Zbarsky (13 years ago)

On 12/27/10 5:39 AM, Michael Day wrote:

Some quick testing with Firefox seems to show that for-in returns properties in the order they were added to the object

Yep; it's required for web compat, as you discovered. When we accidentally broke it, we very quickly got bug reports. ;)

and some scripts on the web (eg. RaphaelJS) seem to depend on that

Indeed.

unless I'm horrendously misinterpreting them. This seems... bad?

Well, it's not bad except for ECMA's insistence on not specifying the order.

Are objects implicitly ordered, by implementation convention?

For implementations that have to deal with the web, yes.

# Dmitry A. Soshnikov (13 years ago)

On 27.12.2010 14:39, Michael Day wrote:

Hi,

Some quick testing with Firefox seems to show that for-in returns properties in the order they were added to the object, and some scripts on the web (eg. RaphaelJS) seem to depend on that, unless I'm horrendously misinterpreting them. This seems... bad?

Yes, it's bad. Because the spec really statements that the order of evaluation isn't specified. Moreover, some side-effects, e.g. deleteing of properties or adding new is also not specified directly, though there is a mention on it. IIRC, there was a case with JScript (IE) and these side-effects. Regarding non-specificity enumeration, there was/is a case with V8 (Chrome).

However, IMO, it's bad from the spec viewpoint (even if such a agreement is taken in price of optimization). I think (and sure many programmers will agree) that the consistent is more important. And the syntactic (lexical) appearing properties in the object should be enumerated in the same order. Adding/removing while enumeration also should be handled correctly.

Are objects implicitly ordered, by implementation convention?

In some implementations -- yes, in some -- not. So, it's not recommended to rely on it. Though, repeat, I'd change the spec instead.

Dmitry.

# Michael Day (13 years ago)

Thanks Boris, Dmitry,

Are objects implicitly ordered, by implementation convention?

For implementations that have to deal with the web, yes.

Okay, now I know what we have to do then :)

Might I suggest that this be added to the spec, if only in an informative note, to save future implementers the trouble?

Best ,

Michael

# Andreas Gal (13 years ago)

Note that almost all implementations break for-in enumeration order for dense arrays, for some definition of dense.

Example (firefox):

a = [] a[2] = 1 a[1] = 1 a[0] = 1 for (i in a) alert(i)

will alert 0, 1, 2 (in that order).

Now if you do a["x"] = 1

you get 0, 1, 2, "x"

However, if we start out with x, we get this:

a = [] a["x"] = 1 a[2] = 1 a[1] = 1 a[0] = 1

Will alert x, 2, 1, 0

Even worse, this is likely to change in the future (we want to give all objects a dense, indexed part).

Most browser behave similar with respect to indexed properties. For good performance you really want to ignore enumeration order for array-like objects, but if the implementation falls back onto regular objects, you can suddenly observe true property addition order.

If we try to specify something, implementors will probably object, and each implementation will have specific objections depending on their particular array and object representation.

# Dmitry A. Soshnikov (13 years ago)

On 27.12.2010 15:18, Andreas Gal wrote:

Note that almost all implementations break for-in enumeration order for dense arrays, for some definition of dense.

Example (firefox):

a = [] a[2] = 1 a[1] = 1 a[0] = 1 for (i in a) alert(i)

will alert 0, 1, 2 (in that order).

Now if you do a["x"] = 1

you get 0, 1, 2, "x"

However, if we start out with x, we get this:

a = [] a["x"] = 1 a[2] = 1 a[1] = 1 a[0] = 1

Will alert x, 2, 1, 0

Hm, interesting (didn't know about SpiderMonkey), thanks.

Even worse, this is likely to change in the future (we want to give all objects a dense, indexed part).

Most browser behave similar with respect to indexed properties. For good performance you really want to ignore enumeration order for array-like objects, but if the implementation falls back onto regular objects, you can suddenly observe true property addition order.

Yep, indeed, with a = {} the order is as was specified in the source.

If we try to specify something, implementors will probably object, and each implementation will have specific objections depending on their particular array and object representation.

It's interesting. Could you summarize these objections?

Dmitry.

# Michael Day (13 years ago)

Note that almost all implementations break for-in enumeration order for dense arrays, for some definition of dense.

This is understandable and very sensible.

If we try to specify something, implementors will probably object, and each implementation will have specific objections depending on their particular array and object representation.

Ugh. Tempting to randomise the order for non-dense objects, or just beat developers with the clue-bat until they stop writing code that depends upon a consistent order :)

Best ,

Michael

# Andreas Gal (13 years ago)

I can't speak for other VM implementors, but I think most implementors will not want to guarantee either enumeration method (property addition order vs numeric order). In some cases we store array data "dense", in others we punt and use a natural object representation. The decision when we switch between representations is a brittle heuristics. Allowing the underlying representation and along with it the enumeration order to bleed through is faster than normalizing (which in some cases would mean sorting property indexed names).

That having said, if TC39 decides to standardize enumeration order, I am pretty sure we can make pretty much any standard enumeration order reasonably fast (maybe a little less fast than before though). Mark suggested at some point enumerating indexed properties first in numeric order, and then all remaining properties. Its the most sensible approach I have heard so far. If someone drafts a proposal to standardize enumeration order, I volunteer to implement it in Firefox and get some performance data. That would help calming implementor's nerves.

# Dmitry A. Soshnikov (13 years ago)

On 27.12.2010 15:40, Andreas Gal wrote:

I can't speak for other VM implementors, but I think most implementors will not want to guarantee either enumeration method (property addition order vs numeric order). In some cases we store array data "dense", in others we punt and use a natural object representation. The decision when we switch between representations is a brittle heuristics.

I see, so this heuristic decisions can spread also on a = {} in Firefox (I mean it's not currently guaranteed in exactly Firefox that the enumeration order of {} will be the same as specified by addition -- as we saw in the previous example)?

Can you also show a memory representation (at lower level) of "dense" and "sparse" (?) data. It will help to understand the advantages of choosing the needed order by implementers. It's not required to show it right now, but, if possible -- when you'll have a time to show it.

Allowing the underlying representation and along with it the enumeration order to bleed through is faster than normalizing (which in some cases would mean sorting property indexed names).

Yes, abstractly it's clear, but if to see the memory representation (since e.g. I don't know well C-level of representation well) it would be even better.

That having said, if TC39 decides to standardize enumeration order, I am pretty sure we can make pretty much any standard enumeration order reasonably fast (maybe a little less fast than before though).

Hm, so it's not about serious technical performance issues and any needed order can be implemented. I see, thanks.

Mark suggested at some point enumerating indexed properties first in numeric order, and then all remaining properties. Its the most sensible approach I have heard so far.

It seems sensible (again -- abstractly), but if to see the memory representation, it would be even more sensible.

If someone drafts a proposal to standardize enumeration order, I volunteer to implement it in Firefox and get some performance data. That would help calming implementor's nerves.

Hm... It's not so hard to describe the algorithm abstractly. But I am not aware about which implementation performance issues it may follow.

Dmitry.

# Ash Berlin (13 years ago)

On 27 Dec 2010, at 12:40, Andreas Gal wrote:

I can't speak for other VM implementors, but I think most implementors will not want to guarantee either enumeration method (property addition order vs numeric order). In some cases we store array data "dense", in others we punt and use a natural object representation. The decision when we switch between representations is a brittle heuristics. Allowing the underlying representation and along with it the enumeration order to bleed through is faster than normalizing (which in some cases would mean sorting property indexed names).

That having said, if TC39 decides to standardize enumeration order, I am pretty sure we can make pretty much any standard enumeration order reasonably fast (maybe a little less fast than before though). Mark suggested at some point enumerating indexed properties first in numeric order, and then all remaining properties. Its the most sensible approach I have heard so far. If someone drafts a proposal to standardize enumeration order, I volunteer to implement it in Firefox and get some performance data. That would help calming implementor's nerves.

Andreas

As an interesting related tidbit: ruby 1.9 changed the key order on its hashes from sorted to insertion order:

www.markhneedham.com/blog/2010/09/07/ruby-hash-ordering, www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash

# Allen Wirfs-Brock (13 years ago)

And I don't believe that anybody on this thread yet mentioned the impact of inherited properties on enumeration order.

ES5 did not specify a an enumeration order because we discovered that there was not a de facto for-in order that was used by all browser implementation. The more edge cases that were examined, the more differences we found.

We did identify one situation where enumeration order will be the same across all major implementation that are currently in use (including IE6):

The enumeration order of an object's properties will be the order in which the properties were added if all the following conditions hold: The object has no inherited enumerable properties The object has no array indexed properties No properties have been deleted No property has had its attributes modified or been changed from a data property to an accessor property or visa versa

In practice this means that if you create an object like: var a = {}; a.y=1; a.z=2; a.x=3;

you can depend upon the enumeration order being '"y", "z", "x". If you do anything more complex including using integer property names you can not depend upon the order being the same across all browsers.

Any code that does is buggy.

# Wes Garland (13 years ago)

On Mon, Dec 27, 2010 at 8:04 AM, Dmitry A. Soshnikov < dmitry.soshnikov at gmail.com> wrote:

Can you also show a memory representation (at lower level) of "dense" and "sparse" (?) data. It will help to understand the advantages of choosing the needed order by implementers. It's not required to show it right now, but, if possible -- when you'll have a time to show it.

I haven't looked at this code in SpiderMonkey for a very long time, but IIRC, here are the highlights:

  • All ECMAScript values are stored in a type called "jsval", which is basically a tagged variant type
  • jsvals can represent integers, doubles (firefox 4 - older stores double pointers), string pointers, object pointers, and bools
  • Firefox 4 uses 64-bit jsvals and a bit more specialization information in the tags; older uses 32-bit jsvals
  • Key fact: regular arrays are basically just objects, but dense arrays have a highly specialized implementation as of firefox 3
  • dense arrays are arrays with few holes and numeric indices
  • dense arrays are represented internally essentially as C arrays of jsval with matching indices. So a lookup is very very cheap. ("matching indices" might be a slight lie, I think there might be an offset for arrays big leading holes etc)
  • regular object properties are stored in "slots", which are really just elements in a C array of jsvals
  • there is a map which allows us to look up which slot a property is stored in
  • this map (and the slots) are populated in first-write (declaration) order
  • I believe this implementation is the origin of the defacto enumeration order relied on by the web; the enumerator simply traverses the map in storage order

So you see, looking up an index in a dense array is really cheap - you take the value of the index, convert it to a C int (usually just a bit shift), multiply it by the size of the jsval, add that to the pointer to the start of the storage, and voila - you have a pointer to the jsval which represents the value stored. You do not have the overhead of traversing the map to determine the offset, nor the overhead of storing the map.

Andreas mentioned the possibility of an object representation which has both a map/slots and a dense array component - this is interesting, as it makes representing user-defined array-like objects more efficient --- I believe that adding a length property in the current implementation forces a non-dense representation of the array.

I think Mark's suggestion here is very sensible. My gut tells me that it has a low probability of breaking the web, yet it offers room for implementers to provide a fast path for numeric object indices while offering an opportunity to codify standard practice.

# Mark S. Miller (13 years ago)

On Mon, Dec 27, 2010 at 9:54 AM, Wes Garland <wes at page.ca> wrote:

I think Mark's suggestion here is very sensible. My gut tells me that it has a low probability of breaking the web, yet it offers room for implementers to provide a fast path for numeric object indices while offering an opportunity to codify standard practice.

I like Dave Herman's strawman at < strawman:enumeration>. It is a nice

simple deterministic spec that has the pleasant index-first property mentioned above. It also plugs the other annoying source of non-determinism -- what happens when insertions and/or deletions happen during enumeration -- by mandating a snapshotting semantics. Together, I think this is a good blend of efficiency, implementability, and programmer understandability.

Dave, under the spec for "Operation OwnProperties(obj)" step 1, you don't explicitly state that these index properties are to be enumerated in numeric order. An oversight? Also, you misstate that indexes are "properties with values that can be converted to numbers via ToUInt32". I think you meant something more like the text from 15.4:

A property name P (in the form of a String value) is an array index if and

# David Herman (13 years ago)

Dave, under the spec for "Operation OwnProperties(obj)" step 1, you don't explicitly state that these index properties are to be enumerated in numeric order. An oversight?

Oops, yes, thanks for catching that. I've updated the wiki.

Also, you misstate that indexes are "properties with values that can be converted to numbers via ToUInt32". I think you meant something more like the text from 15.4:

A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2**32−1.

Yes, that's more precise, thanks. I've also updated this text in the wiki.

Thanks,

# Dmitry A. Soshnikov (13 years ago)

On 27.12.2010 20:54, Wes Garland wrote:

On Mon, Dec 27, 2010 at 8:04 AM, Dmitry A. Soshnikov <dmitry.soshnikov at gmail.com <mailto:dmitry.soshnikov at gmail.com>> wrote:

Can you also show a memory representation (at lower level) of
"dense" and "sparse" (?) data. It will help to understand the
advantages of choosing the needed order by implementers. It's not
required to show it right now, but, if possible -- when you'll
have a time to show it.

I haven't looked at this code in SpiderMonkey for a very long time, but IIRC, here are the highlights:

* All ECMAScript values are stored in a type called "jsval", which
  is basically a tagged variant type
* jsvals can represent integers, doubles (firefox 4 - older stores
  double pointers), string pointers, object pointers, and bools
* Firefox 4 uses 64-bit jsvals and a bit more specialization
  information in the tags; older uses 32-bit jsvals
* Key fact: regular arrays are basically just objects, but dense
  arrays have a highly specialized implementation as of firefox 3
* dense arrays are arrays with few holes and numeric indices
* dense arrays are represented internally essentially as C arrays
  of jsval with matching indices. So a lookup is very very cheap.
  ("matching indices" might be a slight lie, I think there might
  be an offset for arrays big leading holes etc)
* regular object properties are stored in "slots", which are
  really just elements in a C array of jsvals
* there is a map which allows us to look up which slot a property
  is stored in
* this map (and the slots) are populated in first-write
  (declaration) order
* I believe this implementation is the origin of the defacto
  enumeration order relied on by the web; the enumerator simply
  traverses the map in storage order

Thanks a lot, Wes, indeed. It's very useful.

So you see, looking up an index in a dense array is really cheap - you take the value of the index, convert it to a C int (usually just a bit shift), multiply it by the size of the jsval, add that to the pointer to the start of the storage, and voila - you have a pointer to the jsval which represents the value stored.

Yep, I supposed a similar optimization with a constant offset (for the price of enumeration order).

So (to repeat Andreas' example for the complete understanding):

a = [] // created a pointer to a low-level C-array with constant offset for numeric indices

// added values to real C-array indicies a[2] = 1 a[1] = 1 a[0] = 1

// rebuilding a C-array to a hash-table (slots -- C-array with jsvalues) a["x"] = 1

for (k in a) // 0, 1, 2, x -- (seems, an effect of rebuilding).

But:

a = [] // C-array

a["x"] = 1 // rebuild to hash-table

// and then the order is already direct // since anyway there's no optimization in respect of C-array a[2] = 1 a[1] = 1 a[0] = 1

for (k in a) // x, 2, 1, 0

Right?

You do not have the overhead of traversing the map to determine the offset, nor the overhead of storing the map.

Yes, I see.

Andreas mentioned the possibility of an object representation which has both a map/slots and a dense array component - this is interesting, as it makes representing user-defined array-like objects more efficient

Wait, the last case with first adding a["x"] (and then [2], [1], [0]) makes some thirst type of optimized arrays?

--- I believe that adding a length property in the current implementation forces a non-dense representation of the array.

Why? Until there are only index-properties we can change the length up and down. On the other hand, the length then should be stored somewhere else.

I think Mark's suggestion here is very sensible. My gut tells me that it has a low probability of breaking the web, yet it offers room for implementers to provide a fast path for numeric object indices while offering an opportunity to codify standard practice.

Yes, if on the other hand fast arrays with direct constant O(1) offset, I see now that the priority of the enumeration order in this respect can be decreased.

Thanks again for clarifications,

Dmitry.

# Brendan Eich (13 years ago)

The es-discuss and es5-discuss lists have seen this topic before. I don't always remember names of people who sent messages, but I keep complete archives, search them locally, then search the web for a decent archive. Possibly others do the same and can provide links.

Here's a good one to read, from Kent Hansen of Nokia in 2009:

mail.mozilla.org/pipermail/es5-discuss/2010-February/003484.html

Included below (inline to avoid deletion from the web archive) is Kent's testcase.

Interoperation requires "own" or direct property enumeration order to be insertion order for non-Array objects, so browsers do that.

Differences for prototype properties were the topic of Kent's post.

Arrays that are sufficiently dense, or all Arrays, or even all indexed properties depending on implementation, may be enumerated in index numerical order.

With ES5, for-in enumeration order (whatever it is, according to the implementation) comes up in other places: 15.2.3.7 Object.defineProperties and 15.2.3.14 Object.keys.

I don't think ECMA-262 should continue to underspecify enumeration order. No competitive browser can afford to randomize just to wean the Web off of the de-facto standard partial order. So it seems to me we should try to write a normative spec everyone likes that is sufficiently web-compatible and sensible. Dave took a swing at this:

strawman:enumeration

/be

----- HTML-based test from kent.hansen at nokia.com ----- <html> <head> <script>

if (typeof Object.create !== 'function') { Object.create = function (o) { function F() {} F.prototype = o; return new F(); }; }

function write(str) { document.write(str); }

(function() { write("Enumeration order: "); var prot = {b:2}; var o = Object.create(prot); o.a = 1; var props = []; for (var p in o) props.push(p); write(props.join(" ")); write("<br>"); })();

(function() { write("Case I: "); var prot = {}; var o = Object.create(prot); o.a = 1; var props = []; for (var p in o) { if (p == "a") prot.b = 2; props.push(p); } write(props.join(" ")); write("<br>"); })();

(function() { write("Case II: "); var prot = {}; var o = Object.create(prot); o.a = 1; var props = []; for (var p in o) { if (p == "a") { delete o.a; prot.a = 2; } props.push(p); } write(props.join(" ")); write("<br>"); })();

(function() { write("Case III: "); var prot = {b:2}; var o = Object.create(prot); o.a = 1; var props = []; for (var p in o) { if (p == "a") o.b = 3; props.push(p); } write(props.join(" ")); write("<br>"); })();

(function() { write("Case IV: "); if (this.proto == undefined) { write("not testable (requires proto extension)<br>"); return; } var prot = {b:2}; var o = Object.create(prot); o.a = 1; var props = []; for (var p in o) { if (p == "a") o.proto = {c:3}; props.push(p); } write(props.join(" ")); write("<br>"); })();

</script> </head> <body> </body> </html>

# Brendan Eich (13 years ago)

On Dec 27, 2010, at 1:36 PM, Brendan Eich wrote:

The es-discuss and es5-discuss lists have seen this topic before. I don't always remember names of people who sent messages, but I keep complete archives, search them locally, then search the web for a decent archive. Possibly others do the same and can provide links.

Here's a good one to read, from Kent Hansen of Nokia in 2009:

mail.mozilla.org/pipermail/es5-discuss/2010-February/003484.html

Er, 2010. I think I was looking further back, but this is far enough given it's almost 2011!

# David-Sarah Hopwood (13 years ago)

On 2010-12-27 19:15, David Herman wrote:

Dave, under the spec for "Operation OwnProperties(obj)" step 1, you don't explicitly state that these index properties are to be enumerated in numeric order. An oversight?

Oops, yes, thanks for catching that. I've updated the wiki.

The given algorithm seems to order index-properties before non-index properties for each object on the prototype chain, rather than ordering all index-properties (for all objects on the chain) before all non-index properties. I'm not disagreeing with this choice, but was it intentional?

Nitpick: the '[ props, ...' notation could be misinterpreted as saying that the previous 'props' list is the first element of the new list. It should be something like 'props ++ [...'.

# Geoffrey Sneddon (13 years ago)

On 27/12/10 13:40, Andreas Gal wrote:

I can't speak for other VM implementors, but I think most implementors will not want to guarantee either enumeration method (property addition order vs numeric order). In some cases we store array data "dense", in others we punt and use a natural object representation. The decision when we switch between representations is a brittle heuristics. Allowing the underlying representation and along with it the enumeration order to bleed through is faster than normalizing (which in some cases would mean sorting property indexed names).

Both Carakan and V8 use numeric order for all uint32-named properties for all objects (even when sparse), and simply avoid using any heuristics at all by never changing between the two. This does cause some breakage, but currently neither ourselves nor V8 intend on changing to insertion order, preferring to use evangelism for what does break.

1 is the V8 bug for the behaviour; looking through then duplicates we have of the bug we have for our behaviour, almost all duplicates are bugs reported by web developers, not site-compatibility bugs.