iteration order for Object
Another idea might be to introduce OrderedObject or some such with guaranteed enumeration order. You might lose object literal support (although that could be re-added), but that's the only loss that comes to mind after brief thought.
Please see strawman:enumeration.
On Mar 10, 2011, at 3:48 PM, Charles Kendrick wrote:
This seems to have been most recently discussed in 2009 with inconclusive results.
See thread starting at: esdiscuss/2010-December/012459
I have summarized the argument for this feature below - this argument has swayed others who were initially opposed.
I'm hoping we can get quick consensus that specifically Object iteration order should be preserved, without getting too bogged down in the details of specifying exactly what happens for Arrays, prototype chains, etc. Major stakeholder agreement on this one aspect should be enough to prevent any other vendors from shipping browsers that break sites, and get the Chrome bug for this re-instated.
But those details are exactly the situations that break interoperability.
In esdiscuss/2010-December/012469 I identified scenarios where you can expect to get interoperable enumeration order among all major objects:

- The object has no inherited enumerable properties
- The object has no array indexed properties
- No properties have been deleted
- No property has had its attributes modified or been changed from a data property to an accessor property or vice versa

The distinction you make between Arrays and Objects isn't one that necessarily exists at the implementation level. Are you suggesting that for all objects other than Array instances, array indexed properties must enumerate in insertion order? Chrome isn't the only browser where that currently isn't true.
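For illustration, a minimal sketch of an object that violates several of these conditions, where engines have historically disagreed (the object and values here are illustrative, not from the thread):

// Illustrative only: this object breaks the indexed-property and
// deleted-property conditions above.
var o = {};
o.b = 1;        // plain string key
o[0] = "zero";  // array-indexed key: some engines enumerate these first
delete o.b;
o.b = 2;        // deleted then re-added: its position varies by engine

var order = [];
for (var k in o) order.push(k);
// order may be ["0", "b"], ["b", "0"], etc., depending on the engine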
On 3/10/2011 5:03 PM, Allen Wirfs-Brock wrote:
But those details are exactly the situations that break interoperability.
This behavior was perfectly consistent across all browsers until Chrome 6. I think it's more appropriate to say that Chrome is not interoperable with thousands of sites than to define interoperable behavior based on a minority browser's very very recent break from a de-facto standard that stood for 15 years.
Further, Chrome is used principally by developers and auto-updates itself. If Chrome behavior were reverted today, the number of browsers out there with this behavior would trend to zero in probably months.
But the sites and applications will last many years. Just being practical here.
In esdiscuss/2010-December/012469 I identified scenarios where you can expect to get interoperable enumeration order among all major objects:
- The object has no inherited enumerable properties
- The object has no array indexed properties
- No properties have been deleted
- No property has had its attributes modified or been changed from a data property to an accessor property or vice versa
Quite correct, but because of your #2, this is worst-of-both options behavior:
- It's good enough to preserve order for non-numeric keys only
This is an abysmal compromise, with the worst traits of each alternative. It requires browser vendors to implement order preservation, such that we don't get the minor optimization that's possible from not preserving order at all. At the same time, it requires that applications and frameworks deal with lack of order for numeric keys, which are very common: in the use case of mapping stored to displayed values, stored values are very often numeric.
The distinction you make between Arrays and Objects isn't one that necessarily exists at the implementation level.
Correct. You and I know that Objects and Arrays share most of their implementation. But don't you agree that, approaching the language as a beginner, it's a surprise that numeric indices behave differently from other indices on Object?
Are you suggesting that for all objects other than Array instances, array indexed properties must enumerate in insertion order? Chrome isn't the only browser where that currently isn't true.
No. I have no opinion on standardized iteration order on Array or on any object other than Object (Date, Regexp, et al). There never was any consistency there, and I have never seen a use case where it could conceivably be important that properties on eg a Regexp have a specific order. I think it would probably be best to leave these unstandardized, to give browser vendors maximum leeway to optimize.
I'd love to have a full set of collections for JavaScript, but just to clarify, adding a separate OrderedObject wouldn't address most of the issues I've raised:
- it would be no help to JSON - there would still be the severe expressiveness and object allocation overhead I previously described
- it would remain a surprising "gotcha" that numeric indices behave specially on Object
- it would not address the backcompat issue with all the sites that depend on ordering
On 3/10/11 8:44 PM, Charles Kendrick wrote:
It requires browser vendors to implement order preservation, such that we don't get the minor optimization that's possible from not preserving order at all.
For what it's worth, not preserving order for numeric properties allows optimizations that are decidedly not "minor".
You can compare the performance of "fast" and "slow" arrays in SpiderMonkey to see the difference.
Yes, great performance enhancements are available for Arrays - but this is not relevant because as I explicitly stated, I am not suggesting anything be clarified or standardized with respect to for..in order for Arrays.
People use Objects as classes, instances, "associative arrays" / Maps, etc. Numeric keys are a tiny minority and there would be no measurable performance gains for special treatment of such keys on Object.
However because frameworks have to deal with all possible keys, we end up with a much, much more expensive data structure that has to be used just because numeric keys are being treated specially.
This means that the real-world, application-level impact of not preserving order is slower applications.
On 3/10/11 9:00 PM, Charles Kendrick wrote:
People use Objects as classes, instances, "associative arrays" / Maps, etc. Numeric keys are a tiny minority and there would be no measurable performance gains for special treatment of such keys on Object.
You may want to read bugzilla.mozilla.org/show_bug.cgi?id=594655 and bugzilla.mozilla.org/show_bug.cgi?id=611423. People are
running into performance issues due to lack of such special treatment today.
Now maybe these people are just doing dumb things they shouldn't be doing, but that doesn't make the performance differences observed on those tests "not measurable".
However because frameworks have to deal with all possible keys, we end up with a much, much more expensive data structure that has to be used just because numeric keys are being treated specially.
I agree this is an issue. I just think you're underestimating the performance drag of preserving numeric property order for vanilla Objects.
Boris, this is why I also took care to mention that for..in iteration on Arrays should remain unordered, so that developers doing relatively obscure things like crypto evoting in JavaScript (the use case in the first bug) still have access to a dense-array implementation.
Best of both worlds: Object does what you would expect, Array has optimizations for obscure use cases.
Last point below.
== Objections and counter-arguments
- Array for..in iteration order has always been inconsistent across browsers
Yes, this is true. I am proposing only that Object preserves insertion order, not Array.
No developers or sites rely on Array for..in iteration order, since it was never consistent.
If Array for..in iteration continues to be unordered, any developer that cares about the tiny performance difference can use an Array to store non-numeric property/value pairs.
On 3/10/11 9:18 PM, Charles Kendrick wrote:
Boris, this is why I also took care to mention that for..in iteration on Arrays should remain unordered
What does this have to do with the post you're replying to?
so that developers doing relatively obscure things like crypto evoting in JavaScript (the use case in the first bug) still have access to a dense-array implementation.
The point is that the bignum library there is using vanilla objects, not arrays. And they're using numeric property names.
If Array for..in iteration continues to be unordered, any developer that cares about the tiny performance difference can use an Array to store non-numeric property/value pairs.
- They're not doing that now, necessarily, and there's no indication that they'll start.
- A factor of 6 is not a "tiny performance difference".
On 03/10/2011 06:11 PM, Boris Zbarsky wrote:
You may want to read bugzilla.mozilla.org/show_bug.cgi?id=594655 and bugzilla.mozilla.org/show_bug.cgi?id=611423. People are running into performance issues due to lack of such special treatment today.
Further to these points, it is possible that Firefox might change its property storage system to address bugs like these such that property enumeration order changes. This would be bugzilla.mozilla.org/show_bug.cgi?id=586842. I am not saying anything about how likely or unlikely this is. I haven't really started to research the problem or brainstorm about potential solution space yet. But we do see problems with arrays with non-index, non-length properties, and it's certainly possible that a fix which permits extra, non-indexed properties to be added to arrays without notably de-optimizing property storage, and subsequent property access, may affect property enumeration order.
On Thu, Mar 10, 2011 at 9:00 PM, Charles Kendrick <charles at isomorphic.com>wrote:
People use Objects as classes, instances, "associative arrays" / Maps, etc. Numeric keys are a tiny minority and there would be no measurable performance gains for special treatment of such keys on Object.
An associative array is typically a hash map or perhaps a tree map. A typical implementation will iterate through such values either in an undetermined order or in order by the keys -- preserving the insertion order would be more expensive and would preclude most options for implementation.
If you care about order, you don't use a hash map, and a JS object literal seems more closely related to a hash map than anything else.
An alternative you didn't consider in your original post is using a single array, which is guaranteed to not change the order and never pick up additional properties:
selectControl.setOptions([ "storedValue1", "displayValue1", "storedValue2", "displayValue2", "storedValue3", "displayValue3" ])
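For what it's worth, a sketch of how an API might consume such a flat array (the setOptions/addOption names are assumed for illustration, not from a specific library):

// Hypothetical consumer of the flat [stored, display, stored, display, ...] array:
selectControl.setOptions = function (pairs) {
  for (var i = 0; i < pairs.length; i += 2) {
    this.addOption(pairs[i], pairs[i + 1]); // array order is the display order
  }
};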
Thanks for the link Brendan.
This strawman continues to treat numeric indices specially on Object.
Let's ignore, for the moment, the fact that the underlying implementation of Object and Array is very similar.
Has a use case been identified where treating numeric properties differently from other properties is a desirable behavior?
Is there any programming language or library for any programming language where this behavior has been intentionally implemented as a collection class?
In my limited experience both are "no".
Harmony is a clean break: seems like a great opportunity to implement the behavior for Object for..in iteration that is most useful.
Boris, compare:
- tens of thousands of web applications that need to define a sorted map plus perhaps billions of JSON messages per day
.. to ..
- a handful of crypto / computational use cases used by a tiny minority of sites
What should be optimized for?
Note that we don't really even have to choose. If you tell the guys implementing these crypto / bignum libraries that their code is going to run 6x faster in Firefox if they use an Array, they'll probably have switched by Tuesday.
It's a perfectly reasonable and acceptable way to close a bug to say that if you want the best performance when using lots of numeric indices, use an Array.
On 3/10/2011 6:50 PM, John Tamplin wrote:
If you care about order, you don't use a hash map, and a JS object literal seems more closely related to a hash map than anything else.
Object behaved like a LinkedHashMap for 15 years, and still does behave like a LinkedHashMap, even in Chrome, except for numeric keys.
It seems like a very broadly held perception that Objects behave this way (based on the record-setting 127 stars on Chrome's issue for this, if nothing else).
An alternative you didn't consider in your original post is using a single array, which is guaranteed to not change the order and never pick up additional properties:
selectControl.setOptions([ "storedValue1", "displayValue1", "storedValue2", "displayValue2", "storedValue3", "displayValue3" ])
I'm aware, but omitted it because I thought it was an even worse option.
It has all the allocation / GC drawbacks of the other approaches mentioned: two Strings per property vs just slots. It also retains the drawback that developers have to build a secondary index to avoid O(n) property change costs, and that this will be slower in practice, for real applications.
On top of this, and perhaps worst of all, it has the further disadvantage that it looks like a list of values. You can't look at the code and see that values are being mapped to one another.
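To make that lookup cost concrete, here is a sketch of the secondary index a framework would have to build over such a pairs array (illustrative code, not from any particular framework):

// Illustrative: map each key to its position in the flat pairs array.
function buildIndex(pairs) {
  var index = {};
  for (var i = 0; i < pairs.length; i += 2) {
    index[pairs[i]] = i;
  }
  return index;
}
// Lookup via pairs[index[key] + 1] is O(1), but removing a key still
// means an O(n) splice plus re-indexing every entry after that point.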
On Mar 10, 2011, at 6:50 PM, Charles Kendrick wrote:
Let's ignore, for the moment, the fact that the underlying implementation of Object and Array is very similar.
Not true in all modern optimizing VMs.
Has a use case been identified where treating numeric properties differently from other properties is a desirable behavior?
Yes. Users occasionally create arrays empty and insert elements in reverse-index or other order. Then for-in discloses the insertion order if the array is deoptimized to be like an object, or else enumerates in index order if the array manages to be optimized.
Implementations are allowed to leak their optimization strategies via for-in, by explicit language in ES1-5.
So users complain, and almost always want for-in over an array to use index order.
If I'm reading you right, you don't mind us specifying arrays enumerating indexed "own" properties first and in index order.
But it gets worse: several notable JS libraries, including IIRC jQuery (the #1 library in popularity right now), use indexed properties on Object instances -- frequently. I'll try to dig up the references.
Those VMs that optimize indexed properties on Object to be (a) fast; (b, and therefore) enumerated in index order, do better on such code. VMs that optimize only for certain "dense-enough arrays" lose.
This creates pressure to follow the performance leader and treat indexed properties the same, in terms of enumeration, on all objects including arrays.
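The pattern in question looks roughly like this (a sketch; jQuery's actual internals differ in detail):

// "Array-like" pattern: indexed properties stored on a plain Object.
var results = { 0: "first", 1: "second", length: 2 };
var found = [];
for (var i = 0; i < results.length; i++) {
  found.push(results[i]); // VMs that optimize indexed properties on any
}                         // object, not just Arrays, make this loop fast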
Is there any programming language or library for any programming language where this behavior has been intentionally implemented as a collection class?
Not really relevant based on the pressures in the ecosystem at hand.
In my limited experience both are "no".
Harmony is a clean break: seems like a great opportunity to implement the behavior for Object for..in iteration that is most useful.
Harmony is not a clean break. I don't know why you wrote that.
Harmony has some room to break compatibility, especially if the old forms are caught by early errors. I've spoken of one hand's worth of fingers to count these breaks (the "five fingers of fate"). The valuable and opposable thumb is for removing the global object from the scope chain. The index finger on my hand is paren-free relaxation of syntax combined with better for-in semantics. The middle finger may as well go for typeof null == "null" (which can't be caught by any early error in a practical implementation; testing and static analysis tools needed). And so on, but for only two more fingers (on my hand; others on TC39 may differ on particulars, and we all may have another hand behind our back with back-up candidates).
Harmony is not "anything goes", VB7 ("Visual Fred"), Cobol2000, or even Perl 6 vs. 5 (I'm not saying anything bad about Perl 6 here -- I like that it makes many and large changes based on lessons learned). Harmony must be mostly compatible with JS as it is today so that code can migrate forward.
On Mar 10, 2011, at 7:18 PM, Brendan Eich wrote:
Harmony is not a clean break. I don't know why you wrote that.
Perhaps Charles meant "a clean break in for-in semantics".
That's what I propose via my index-finger compatibility break:
Harmony has some room to break compatibility, especially if the old forms are caught by early errors. I've spoken of one hand's worth of fingers to count these breaks (the "five fingers of fate"). The valuable and opposable thumb is for removing the global object from the scope chain. The index finger on my hand is paren-free relaxation of syntax combined with better for-in semantics.
The idea (brendaneich.com/2010/11/paren-free, followups in brendaneich.com/2011/01/harmony-of-my-dreams) is to relax the grammar so that parentheses may be omitted from statement heads where the parenthesized form is an expression, but to remove parens around the for loop heads (a la Go, for the three-part for ;; head). This requires that the body or then-clause be braced unless it is a simple statement that starts with an unconditionally reserved identifier.
The benefit for code migration is that for (x in o) ...; code will not compile in such a Harmony. Migrators will have to rewrite and (if necessary) choose a custom iterator to get the desired order and values or keys: for x in keys(o) {...;}, e.g. where keys is imported from a standard enumerators module.
This is less likely to break code than a subtle, runtime-only, and therefore "unclean" break in the meaning of for (x in o).
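Spelled out, the migration would look roughly like this (paren-free syntax and the enumerators module are strawman proposals, not shipping JavaScript):

// Today -- would not compile under the proposed paren-free Harmony:
for (x in o) f(x);

// Proposed rewrite, choosing an explicit enumerator:
for x in keys(o) {  // keys imported from a standard enumerators module
  f(x);
}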
On 3/10/11 9:58 PM, Charles Kendrick wrote:
- tens of thousands of web applications that need to define a sorted map plus perhaps billions of JSON messages per day
.. to ..
- a handful of crypto / computational use cases used by a tiny minority of sites
What should be optimized for?
It depends on the relative slowdowns, possibly.
And to be clear, all I was pointing out is that the speedup from treating numbered properties specially is very noticeable and appears on real-life workloads. Where we go from there is a separate issue.
Note that we don't really even have to choose. If you tell the guys implementing these crypto / bignum libraries that their code is going to run 6x faster in Firefox if they use an Array, they'll probably have switched by Tuesday.
I told them in September. There's been no change yet. I think you overestimate how much people are willing to change their code to work around what they think are bugs....
On 3/10/2011 7:18 PM, Brendan Eich wrote:
On Mar 10, 2011, at 6:50 PM, Charles Kendrick wrote:
Has a use case been identified where treating numeric properties differently from other properties is a desirable behavior?
If I'm reading you right, you don't mind us specifying arrays enumerating indexed "own" properties first and in index order.
Yes, exactly. I understand the performance advantage here and I think Arrays should either do exactly what's in the strawman or even not have a standardized iteration order (for maximum optimization leeway).
But it gets worse: several notable JS libraries, including IIRC jQuery (the #1 library in popularity right now), use indexed properties on Object instances -- frequently. I'll try to dig up the references.
Those VMs that optimize indexed properties on Object to be (a) fast; (b, and therefore) enumerated in index order, do better on such code. VMs that optimize only for certain "dense-enough arrays" lose.
This creates pressure to follow the performance leader and treat indexed properties the same, in terms of enumeration, on all objects including arrays.
You're correct, JQuery returns most query results as an Object where each matching element is placed at a numeric index and it also does a lot of subsetting and traversal on such objects.
However, I doubt very much that the effect on JQuery of dense arrays could be shown outside of synthetic benchmarks and very very niche use cases. References would be great if you can find them.
However large the effect on JQuery, I would guess that benchmarking a JavaScript LinkedHashMap implementation against a native order-preserving Object would show something like a 25x advantage for the native approach.
Further, note that JQuery's use of Object is an implementation detail that may change in the future. Big chunks of JQuery are now shims for old browsers to duplicate HTML5/CSS3 native support, and libraries are emerging that are roughly JQuery-compatible but directly return an augmented native NodeList instead of an Object with indices:
http://chocolatechipmobile.wordpress.com/
JQuery may well adopt this just as it adopted Sizzle, so it would be a shame to optimize the language for such a short-lived advantage.
Harmony is not a clean break. I don't know why you wrote that.
Quite right Brendan, sorry about that, I understand that it's designed as a more minimal break. Thanks for giving me the fingers ;)
2011/3/11 Charles Kendrick <charles at isomorphic.com>:
All browsers that have ever had non-negligible market share have implemented order-preserving Objects - until Chrome 6.
Just to be clear: Chrome 5 and before had a for-in ordering that revealed internal optimization strategies. From Chrome 6 and forward the behaviour was made consistent. There was never a version of Chrome that consistently iterated numeric-keyed properties in insertion order.
On 11/03/2011 00:48, Charles Kendrick wrote:
I believe it is very very important that the ECMAScript standard specify that when a new Object is created, for..in iteration traverses properties in the order they are added, regardless of whether the properties are numeric or not. (...)

== Expressiveness and Performance argument
A very common use case where order preservation is desirable is providing the set of options for a drop-down list in JavaScript. Essentially all Ajax widget kits have such an API, and usage is generally:
selectControl.setOptions({ storedValue1 : "displayValue1", storedValue2 : "displayValue2", storedValue3 : "displayValue3" })
This is one (very valid, intuitive and common) use case. What if I want to implement some sort of dictionary and want keys to be ordered alphabetically? If the spec imposes one order, then people with other use cases have additional work. Since the spec has so far never said anything about property ordering, adding one specific order to the spec would create some bias.
The problem you're pointing at is that you want ordered key->value maps, and you want these to be objects (which are currently unordered key->value maps). It appears to me that ES Objects haven't been designed to be ordered so far. ES5.1 - 4.2: "An ECMAScript object is a collection of properties".
To have ordered key->value maps, possible solutions are:
- Change the spec
- Add an OrderedObject as Jeff Walden suggested (see the sketch after this list). This doesn't even need to be added to the spec, since proxies allow implementing it. It however doesn't solve the JSON use case, I agree.
- Play with value proxies (strawman:value_proxies). Maybe these could allow redefining JSON object initialization rules in order to specify an order (or return an OrderedObject). I'm not a big fan of this last idea, but it could work.
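For concreteness, a minimal ordered-map sketch in plain ES, without proxies (the OrderedObject name and its methods are made up for illustration):

function OrderedObject() {
  this._keys = [];    // remembers insertion order
  this._values = {};  // provides keyed lookup
}
OrderedObject.prototype.set = function (key, value) {
  if (!Object.prototype.hasOwnProperty.call(this._values, key)) {
    this._keys.push(key);
  }
  this._values[key] = value;
};
OrderedObject.prototype.get = function (key) {
  return this._values[key];
};
OrderedObject.prototype.forEach = function (fn) {
  for (var i = 0; i < this._keys.length; i++) {
    fn(this._keys[i], this._values[this._keys[i]]);
  }
};
// Note: removal would be O(n) on _keys -- exactly the cost measured in
// the benchmarks later in this thread. A proxy could hide the same
// machinery behind ordinary property syntax.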
I'd like to clarify the use case you're pointing at. Do you have something like:
/* some sort of meta-JS */
selectControl.setOptions = function (keyValMap) {
  foreach(key => val) { /* doesn't exist in JS; change depending on the
                           keyValMap implementation */
    this.addOption(key, val); // key and val are strings
  }
}
selectControl.setOptions(objectOrOneAlternateSolution)
The point I am trying to make is that having keys and values as strings is required by the DOM API in order to create the option elements, so when you mention "2 Strings per property", this is work that has to be done anyway. It might even be cheaper to have strings created from parsing the source than "extracted" as object property names.
Then, I am surprised when you point out an issue with linear search. Why would you want to search for properties if your use case is to use the implicit order of object initialization to make option elements? What is your exact use case? Regardless, I agree about the verbosity of the alternative solutions.
If I understand correctly, you're asking for objects to be indexed by property name and iterated in addition order. In my opinion, this is asking a lot if we also want performance.
I believe it is very very important that the ECMAScript standard specify that when a new Object is created, for..in iteration traverses properties in the order they are added, regardless of whether the properties are numeric or not.
Some users might prefer 'in the order of keys'. That is predictable, and allows for efficient implementation (both sparse and dense).
A very common use case where order preservation is desirable is providing the set of options for a drop-down list in JavaScript.
Essentially all Ajax widget kits have such an API, and usage is generally:

selectControl.setOptions({ storedValue1 : "displayValue1", storedValue2 : "displayValue2", storedValue3 : "displayValue3" })
Here are some examples of what alternatives might look like - all of them are far, far worse: ..
Most of these are just awkward ways of saying "this is the order I want" and "I also want hashmap access". So why not write that out explicitly, with an optional ordering parameter, making the enumeration explicit when the default ordering isn't suitable:
selectControl.setOptions({
  storedValue1 : "displayValue1",
  storedValue2 : "displayValue2",
  storedValue3 : "displayValue3"
}, ['storedValue1', 'storedValue2', 'storedValue3'])
It may seem convenient if keys happen to be enumerated in the order they are written, but if that extends to the order of insertion, things get complicated (if you delete an entry, then reinsert it, has it lost "its" position? do you want an "insert before"?).
Ordering by keys would avoid this problem: it wouldn't matter how or when the entries came to be in the object, just which keys there are in the object.
Also, there are typically other bits of information you cannot embed implicitly (keyboard shortcuts, separators, icons,..), so you will have to switch to an explicit representation anyway, even if your use case is almost partially covered by one popular, but not guaranteed, interpretation of the spec!-)
If you're worried about allocation, constant strings can be commoned up in the implementation. (*)
Claus
(*) Is there any overview of the optimizations typically done in most major Javascript implementations? I'm not so much asking about "what X does better than Y" but about "do I still need to worry about this, or will any decent implementation take care of it?" kinds of questions.
On Fri, Mar 11, 2011 at 10:07 AM, Claus Reinke <claus.reinke at talk21.com>wrote:
I believe it is very very important that the ECMAScript standard specify
that when a new Object is created, for..in iteration traverses properties in the order they are added, regardless of whether the properties are numeric or not.
Some users might prefer 'in the order of keys'. That is predictable, and allows for efficient implementation (both sparse and dense).
Are you suggesting changing the enumeration order which is currently implemented as "iteration order matches insertion order" under the careful set of conditions Allen enumerated earlier in this thread? (e.g. no enumerable props on prototype).
If so - the thought of this makes me pretty nervous. There is undoubtedly significant amounts of code in the wild on the web which depend on the current enumeration order, and changing this would increase the size of the harmony-uptake tax (code audits and refactors).
Someone -- Mark Miller? -- suggested an interesting option when this discussion came up last on this list (around Christmas 2010 IIRC). Basically -- enumerate named props in insertion order, and numeric props in numeric. This gets pretty close to what most developers seem to expect, while leaving the door wide open for fast implementation of array-like objects.
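Under that hybrid rule, enumeration would look roughly like this (an illustrative example; this matches roughly what Chrome 6+ shipped):

var o = { b: 1, 2: "two", a: 3, 1: "one" };
var keys = [];
for (var k in o) keys.push(k);
// Hybrid order: numeric keys first, in numeric order, then named keys
// in insertion order -- ["1", "2", "b", "a"]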
Most of these are just awkward ways of saying "this is the order I want" and "I also want hashmap access". So why not write that out explicitly, with an optional ordering parameter, making the enumeration explicit when the default ordering isn't suitable:
You know, most of the time when I see valid use-cases for an alternate enumeration order, I can't help but think to myself: this might have a better solution if ES had generators, something like what Mozilla prototyped in JavaScript 1.7, and you could make such a generator the enumeration hook for the object in question.
On 3/11/2011 7:07 AM, Claus Reinke wrote:
I believe it is very very important that the ECMAScript standard specify that when a new Object is created, for..in iteration traverses properties in the order they are added, regardless of whether the properties are numeric or not.
Some users might prefer 'in the order of keys'. That is predictable, and allows for efficient implementation (both sparse and dense).
This:
- breaks completely with the de-facto standard
- still requires the verbose, high-allocation-cost data formats previously explained in order to define an in-order map
- prevents the use of Object literals in JSON to convey ordered data
A SortedMap collection for JavaScript would be great, and useful for certain types of code. But it doesn't help with the problems I'm pointing out.
Most of these are just awkward ways of saying "this is the order I want" and "I also want hashmap access". So why not write that out explicitly, with an optional ordering parameter, making the enumeration explicit when the default ordering isn't suitable:
selectControl.setOptions({ storedValue1 : "displayValue1", storedValue2 : "displayValue2", storedValue3 : "displayValue3" },['storedValue1','storedValue2','storedValue3'])
Because this is spectacularly bad in all the ways I previously mentioned: even more redundancy than the prior worst option, even more allocation and GC load.
It may seem convenient if keys happen to be enumerated in the order they are written, but if that extends to the order of insertion, things get complicated (if you delete an entry, then reinsert it, has it lost "its" position? do you want an "insert before"?).
There are several clear behaviors to choose from here (eg Java's LinkedHashMap).
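For example, Java's LinkedHashMap (in its default insertion-order mode) treats a removed-then-reinserted key as new; in JS terms the semantics would be (sketch):

var m = { a: 1, b: 2, c: 3 };
delete m.b;
m.b = 2;
// LinkedHashMap-style answer: the deleted key lost "its" position, so
// insertion-order enumeration now yields a, c, b. Overwriting an
// existing key without deleting it would keep its position.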
On 3/11/2011 7:35 AM, Wes Garland wrote:
Someone -- Mark Miller? -- suggested an interesting option when this discussion came up last on this list (around Christmas 2010 IIRC). Basically -- enumerate named props in insertion order, and numeric props in numeric. This gets pretty close to what most developers seem to expect, while leaving the door wide open for fast implementation of array-like objects.
Just connecting the dots - I addressed this in my first email on the subject. While it superficially sounds like a good compromise, I actually think it's the worst possibility: it requires browser vendors to implement limited order preservation, preventing deeper optimizations like sorted keys. At the same time, it requires that applications and frameworks deal with lack of order for numeric keys, which are very common: in the use case of mapping stored to displayed values, stored values are very often numeric.
I also think that it's surprising and counter-intuitive that numeric keys are treated differently from non-numeric. The reality is that an implementation detail of Array is bleeding through to Object.
This whole discussion reminds me how much JavaScript needs proper collections. People use "Object" but they don't really want Object (where prototype properties leak into data, where String is the only key type, where the strings "1.0" and "1" cannot be represented in the same map, etc.); they want a HashMap, a LinkedHashMap, a Set, etc.
On 11/03/2011 17:13, John Lenz wrote:
This whole discussion reminds me how much JavaScript needs proper collections. People use "Object" but they don't really want Object (where prototype properties leak into data, where String is the only key type, where the strings "1.0" and "1" cannot be represented in the same map, etc.); they want a HashMap, a LinkedHashMap, a Set, etc.
Proxies will allow implementing all of that and more.
Back to the initial use case, the only thing proxies do not allow is capturing the property order of object literals.
On 3/11/2011 2:39 AM, David Bruant wrote:
On 11/03/2011 00:48, Charles Kendrick wrote:
== Expressiveness and Performance argument
A very common use case where order preservation is desirable is providing the set of options for a drop-down list in JavaScript. ...

This is one (very valid, intuitive and common) use case. What if I want to implement some sort of dictionary and want keys to be ordered alphabetically? If the spec imposes one order, then people with other use cases have additional work. Since the spec has so far never said anything about property ordering, adding one specific order to the spec would create some bias.
If the spec doesn't impose an order, everybody has extra work because there's no default strategy that can be relied upon.
I have no problem with multiple strategies being available, as either a set of collection classes similar to Java Collections, or as strategy hints to Objects.
However as far as the default strategy, the highest value thing to do seems to me to impose the de-facto standard of 15 years - insertion order - which is a very useful behavior and will avoid thousands of websites having to compensate for a change in de-facto standard behavior.
The point I am trying to make is that having keys and values as strings is required by the DOM API in order to create the option elements, so when you mention "2 Strings per property", this is work that has to be done anyway. It might even be cheaper to have strings created from parsing the source than "extracted" as object property names.
No - the difference is, if you eval a JSON Object, you either have 2 JavaScript Strings per property (or even more overhead than that in some scenarios), or you have a single JavaScript Object in which only the property values actually exist as Strings - the slots are implicit, only existing inside the underlying VM. This has less memory footprint, less allocation overhead, less GC overhead. It is far far faster to remove or replace keys with this structure as well.
Whether this is used to render a DOM as well, or for something else, can be treated as a separate issue. Already, you have this tremendous efficiency at the data structure level, and you've preserved JSON's ability to encode an ordered object.
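The two shapes being compared, concretely (illustrative values; the point is where the key strings live):

// Ordered-Object form: keys exist only as internal property slots.
var asObject = { "1": "One", "2": "Two", "3": "Three" };

// Pairs form, needed when order can't be trusted: every key is also a
// user-visible heap string, and O(1) lookup requires building a
// separate index structure.
var asPairs = [["1", "One"], ["2", "Two"], ["3", "Three"]];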
On Mar 10, 2011, at 5:44 PM, Charles Kendrick wrote:
This behavior was perfectly consistent across all browsers until Chrome 6. I think it's more appropriate to say that Chrome is not interoperable with thousands of sites than to define interoperable behavior based on a minority browser's very very recent break from a de-facto standard that stood for 15 years.
Note that it isn't just Chrome. IE9 also has similar behavior (at least in its late platform previews, I don't have the RC installed yet)
On IE9 preview 1.9.8023.6000 standards mode:
var a = {x:'x', 3:3, 2:2, 1:1};
a.a = 'a';
print(Object.keys(a));      // --> 1,2,3,x,a
print(JSON.stringify(a));   // --> {"1":1,"2":2,"3":3,"x":"x","a":"a"}
var k = '';
for (var e in a) k += e + " ";
print(k);                   // --> 1 2 3 x a
On Fri, Mar 11, 2011 at 11:31 AM, Charles Kendrick <charles at isomorphic.com>wrote:
However as far as the default strategy, the highest value thing to do seems to me to impose the de-facto standard of 15 years - insertion order - which is a very useful behavior and will avoid thousands of websites having to compensate for a change in de-facto standard behavior.
So I suppose you think C should have kept int at 16 bits since there was lots of Win16 code that assumed sizeof(int)==2 because it happened to work on their platform, or likewise sizeof(int)==sizeof(char*)? Things unspecified in the spec mean "unspecified" -- it doesn't mean "rely on whatever behavior the implementation you use exhibits and expect it to be portable".
Hello John, I'll assume you meant this as humor since the analogy has such obvious flaws.
Having a default strategy on Object of maintaining order obviously does not preclude other strategies, nor does it damage the JavaScript language itself, as locking int to 16 bits would obviously have damaged C by requiring various new types.
On Fri, Mar 11, 2011 at 2:49 PM, Charles Kendrick <charles at isomorphic.com>wrote:
Hello John, I'll assume you meant this as humor since the analogy has such obvious flaws.
Having a default strategy on Object of maintaining order obviously does not preclude other strategies, nor does it damage the JavaScript language itself, as locking int to 16 bits would obviously have damaged C by requiring various new types.
There is a non-zero cost of maintaining insertion order, and doing so introduces many edge cases that have been discussed. The most obvious implementation of object properties is a hash map, which does not support what you want.
Aside from the technical issues, the point remains that if you write code that depends on unspecified implementation details, you should not expect that code to be portable.
I think analogy with C is appropriate -- the sizes and implementation details of basic types were left unspecified, largely because specifying a particular size or representation would have made it inefficient to implement on some platforms. Sure, that meant that people had to define their own int16/int32/etc types where they cared and certainly some people wrote code assuming twos-complement or int/pointer equivalence and were surprised when the code didn't run on some other platform, but it also allowed the language to be efficiently implemented on lots of different platforms and to grow to platforms never imagined when it was first designed.
You seem to be rephrasing your case without addressing my counterpoints.
So again the problem with the analogy to C is that permanently changing the definition of C so that "int" is 16 bits would damage the language, my proposal would not.
Further, I have argued in detail that it improves the language - you're addressing solely the backcompat issue as if that were the whole of the argument: it is not.
And again, I find your notion that an Object is "obviously" a HashMap very suspect. It has never been in practice; there are very large benefits to having it remain order-preserving; thousands of developers assumed the behavior would be standardized because it was so obviously useful and so consistently implemented. So it is clearly not "obvious", then, that it should be a hash map.
On 3/10/2011 7:33 PM, Boris Zbarsky wrote:
On 3/10/11 9:58 PM, Charles Kendrick wrote:
- tens of thousands of web applications that need to define a sorted map plus perhaps billions of JSON messages per day
.. to ..
- a handful of crypto / computational use cases used by a tiny minority of sites
What should be optimized for?
It depends on the relative slowdowns, possibly.
Billions of JSON messages vs a handful of sites should be pretty clear cut, but I've collected some numbers anyway just to make it more obvious.
Below are two partial implementations of LinkedHashMap in JavaScript, with test code to populate with lots of keys and remove half the keys at random, then alert() the results. Both implementations add and maintain an index to provide O(1) lookups.
Each is compared to the speed of the equivalent use of Object assuming it supports ordering, which is just to use the JavaScript "delete" operator to remove keys. My numbers are from Firefox 3.6.15 on a WinXP and a MacOSX laptop I have here.
The first is a straightforward implementation where insert and removal use splice() to modify an Array. This is what most developers would probably implement. This is approximately 800-1000 times slower than a native Object.
The second leverages native speed better by using two Objects to store the forward and reverse pointers of a classic linked list. I'll stipulate that everyone on this list would arrive at this implementation, but in my opinion a vanishing small number of JavaScript programmers at large would figure it out.
Even so it's 8-10x slower than native.
So if you will stipulate that having an LinkedHashMap is at least as common a JavaScript use case as in-memory crypto for evoting, it's pretty clear where the most bang for the buck is, even before you consider all the JSON messages that could be simplified if order is preserved.
Note that we don't really even have to choose. If you tell the guys implementing these crypto / bignum libraries that their code is going to run 6x faster in Firefox if they use an Array, they'll probably have switched by Tuesday.
I told them in September. There's been no change yet. I think you overestimate how much people are willing to change their code to work around what they think are bugs....
To be fair, what you told them was that you considered it a bug in Firefox that their code was still slower in Firefox than in Chrome. So of course they would not change their code.
== first partial implementation: order maintained via an array
(function () {
  var sourceDataObj = {}, sourceDataArr = [], keysToRemove = [];

  for (var i = 0; i < 30000; i++) {
    sourceDataObj["key" + i] = "value" + i;
    sourceDataArr[sourceDataArr.length] = "key" + i;
    sourceDataArr[sourceDataArr.length] = "value" + i;
    if (Math.random() > 0.5) keysToRemove[keysToRemove.length] = "key" + i;
  }

  var orderedMap = {
    init : function (data) {
      this.data = data;
      this.index = {};
      for (var i = 0; i < data.length; i++) {
        this.index[data[i]] = data[i + 1];
        i++;
      }
    },
    get : function (key) {
      return this.index[key];
    },
    remove : function (key) {
      var arrIndex = this.data.indexOf(key);
      this.data.splice(arrIndex, 2);
      delete this.index[key];
    }
  };

  var start = new Date().getTime();
  orderedMap.init(sourceDataArr);
  for (var i = 0; i < keysToRemove.length; i++) orderedMap.remove(keysToRemove[i]);
  alert("orderedMap: " + (new Date().getTime() - start));

  start = new Date().getTime();
  for (var i = 0; i < keysToRemove.length; i++) delete sourceDataObj[keysToRemove[i]];
  alert("object: " + (new Date().getTime() - start));
}());
== second partial implementation: linked list via dual Objects
(function () {
  var sourceDataObj = {}, sourceDataArr = [], keysToRemove = [];

  for (var i = 0; i < 900000; i++) {
    sourceDataObj["key" + i] = "value" + i;
    sourceDataArr[sourceDataArr.length] = "key" + i;
    sourceDataArr[sourceDataArr.length] = "value" + i;
    if (Math.random() > 0.5) keysToRemove[keysToRemove.length] = "key" + i;
  }

  var orderedMap = {
    init : function (sourceData) {
      this.data = {};
      this.nextKeys = {};
      this.priorKeys = {};
      for (var i = 0; i < sourceData.length; i++) {
        var key = sourceData[i], value = sourceData[i + 1];
        this.put(key, value);
        i++;
      }
    },
    get : function (key) {
      return this.data[key];
    },
    put : function (key, value) {
      this.data[key] = value;
      if (this.firstKey == null) this.firstKey = key;
      if (this.nextKeys[key]) {
        // entry exists: not implemented
      } else {
        // new insertion
        if (this.lastKey == null) {
          // empty list
          this.lastKey = key;
        } else {
          // new last key
          this.nextKeys[this.lastKey] = key;
          this.priorKeys[key] = this.lastKey;
          this.lastKey = key;
        }
      }
    },
    remove : function (key) {
      delete this.data[key];
      var keyBefore = this.priorKeys[key],
          keyAfter = this.nextKeys[key];
      this.nextKeys[keyBefore] = keyAfter;
      this.priorKeys[keyAfter] = keyBefore;
      if (this.firstKey == key) this.firstKey = keyAfter;
      if (this.lastKey == key) this.lastKey = keyBefore;
      delete this.nextKeys[key];
      delete this.priorKeys[key];
    },
    getKeys : function () {
      var nextKey = this.firstKey,
          keys = [this.firstKey];
      while ((nextKey = this.nextKeys[nextKey]) != null) {
        keys[keys.length] = nextKey;
      }
      return keys;
    }
  };

  var start = new Date().getTime();
  orderedMap.init(sourceDataArr);
  for (var i = 0; i < keysToRemove.length; i++) orderedMap.remove(keysToRemove[i]);
  alert("orderedMap: " + (new Date().getTime() - start));

  start = new Date().getTime();
  for (var i = 0; i < keysToRemove.length; i++) delete sourceDataObj[keysToRemove[i]];
  alert("object: " + (new Date().getTime() - start));
}());
On Fri, Mar 11, 2011 at 3:14 PM, Charles Kendrick <charles at isomorphic.com>wrote:
On 3/10/2011 7:33 PM, Boris Zbarsky wrote:
On 3/10/11 9:58 PM, Charles Kendrick wrote:
- tens of thousands of web applications that need to define a sorted map plus perhaps billions of JSON messages per day
.. to ..
- a handful of crypto / computational use cases used by a tiny minority of sites
What should be optimized for?
It depends on the relative slowdowns, possibly.
Billions of JSON messages vs a handful of sites should be pretty clear cut, but I've collected some numbers anyway just to make it more obvious.
You've asserted this JSON-advantage claim several times now so you should probably know that keys in JSON are specifically specified as unordered. Nothing TC39 does can alter this fact -- it won't (and can't) be changed. Even if ECMAScript were to go this route, nothing says that JSON implementations will (or must) respect this ordering. Surprises would lurk around every corner.
[snipped the rest]
Not so - order-preserving implementations are backwards compatible with non-order-preserving implementations. Just rev the spec, and like any other versioned spec, developers can use the new behavior when they know the application environment uses only the new version.
Yes Allen, hence the urgency. If IE9 final ships that way, the "goose is cooked":
- we will have a new de facto standard iteration order for Object that does not match any known use case - it is purely an implementation detail leaking through
- the majority of real-world applications will be slowed down because developers will need to re-implement the very commonly needed LinkedHashMap behavior in JavaScript
- developers will waste a bunch of time doing this when the language could have provided it
- developers will waste a bunch of time upgrading old sites and applications
- JavaScript code will forever be more verbose than it needs to be wherever ordered maps are needed
- JSON messages will forever be more verbose than they need to be, and result in larger JavaScript allocation and GC loads when eval()d/parsed
Against all this, we have various misunderstandings (thinking the iteration order of Array is being affected, etc), but finally only:
- for certain unusual use cases, implementers of code that use lots of integer indexes will need to use Arrays for best performance - a perfectly reasonable optimization technique to require, in my opinion
- there may, in the immediate term only, for some unspecified use cases which may well be very synthetic, be an advantage for JQuery, which may well vanish as JQuery evolves
On Fri, Mar 11, 2011 at 3:48 PM, Charles Kendrick <charles at isomorphic.com>wrote:
Not so - order-preserving implementations are backwards compatible with non-order-preserving implementations. Just rev the spec, and like any other versioned spec, developers can use the new behavior when they know the application environment uses only the new version.
The JSON spec has no version number, intentionally -- it would have to be supersetted entirely. You could make the argument that perhaps it's about time (I CAN HAZ DATES?) but that's a much bigger challenge, and your claimed advantages to JSON handling would still be moot until ECMAScript adopted your JSON replacement.
Probably the language which most commonly handles JSON is JavaScript itself. So a major chunk of the benefit, possibly even the majority of the benefit, would be immediate: eval(), application config files expressed in JSON, etc.
And as you alluded to (also I CAN HAZ CUSTIM CLAZZES?) JSON will probably eventually have versions.
On 11/03/2011 21:49, Charles Kendrick wrote:
Yes Allen, hence the urgency. If IE9 final ships that way, the "goose is cooked":
Let's face it right now: IE9 will ship that way. They're in the RC phase; it's completely unrealistic to consider that they would change the object implementation.
From the spec point of view, there is no urgency, since Harmony won't ship for a couple of years, I think.
- we will have a new de facto standard iteration order for Object that does not match any known use case - it is purely an implementation detail leaking through
As said by someone else, the iteration order has never been standardized. It was dangerous for developers to base their code on an implementation detail, even though this one was at some point consistent across browsers. Standards aren't only for implementors. For the record also, a for..in loop looks into the prototype. If you have an object and shadow a prototype property, how is the order affected? For two objects o1 and o2 which share the same prototype, how are the prototype properties iterated? What happens when you play with prototype properties and object properties afterward?
Fixing for..in loops is a dead-end in my opinion.
It doesn't stop the discussion for own properties.
the majority of real-world applications will be slowed down because developers will need to re-implement the very commonly needed LinkedHashMap behavior in JavaScript
developers will waste a bunch of time doing this when the language could have provided it
I agree on 2 and 3.
- developers will waste a bunch of time upgrading old sites and applications
They shouldn't have relied on non-standard behavior in the first place. They saved time by not taking the time to learn the standard; that comes at a later cost.
JavaScript code will forever be more verbose than it needs to be wherever ordered maps are needed
JSON messages will forever be more verbose than they need to be, and result in larger JavaScript allocation and GC loads when eval()d/parsed
Apparently, JSON has been standardized as unordered. Maybe the right decision, if the order really matters to you, is to use another interchange format. Interchange formats are created with a purpose in mind, with use cases. Maintaining an order comes with an in-memory cost and a cost for the serialization implementation. JSON was certainly designed not to have this cost. Is Douglas Crockford around to confirm?
If your use case doesn't fit the interchange format's intended use cases, change interchange formats (XML fits your need even though it's a bit more verbose). Or create one?
The idea of using value proxies (strawman:value_proxies) to override object initialization syntax sounds like a good compromise: it keeps objects unordered as they are now while still addressing backward compatibility. It will require a bit of work, but it sounds realistic.
On 3/11/2011 1:33 PM, David Bruant wrote:
On 11/03/2011 21:49, Charles Kendrick wrote:
Yes Allen, hence the urgency. If IE9 final ships that way, the "goose is cooked":

Let's face it right now: IE9 will ship that way. They're in the RC phase; it's completely unrealistic to consider that they would change the object implementation. From the spec point of view, there is no urgency, since Harmony won't ship for a couple of years, I think.
I disagree - if there is a clear consensus that ECMAScript will standardize on a particular iteration order, there's a strong chance IE9 will update to reflect this (so, on reflection, I should not have said the goose is cooked).
They are aggressively embracing standards across the board now, after all.
- we will have a new de facto standard iteration order for Object that does not match any known use case - it is purely an implementation detail leaking through

As said by someone else, the iteration order has never been standardized. It was dangerous for developers to base their code on an implementation detail, even though this one was at some point consistent across browsers. Standards aren't only for implementors.
Your perspective is common in a group like this - very spec and standard focused. Isn't it fun to bash those developers? Everyone's doing it... I hope you realize it's irrelevant, though?
These developers took a calculated risk at a time when standards were so vague and partial that they had to take similar risks everywhere. Most of them stand ready to accept the consequences and change their code.
It's just that it's a tremendous waste of time for them to do so. That's the point.
2011/3/11 David Bruant <bruant at enseirb-matmeca.fr>:
On 11/03/2011 21:49, Charles Kendrick wrote:

Yes Allen, hence the urgency. If IE9 final ships that way, the "goose is cooked":

Let's face it right now: IE9 will ship that way. They're in the RC phase; it's completely unrealistic to consider that they would change the object implementation. From the spec point of view, there is no urgency, since Harmony won't ship for a couple of years, I think.

- we will have a new de facto standard iteration order for Object that does not match any known use case - it is purely an implementation detail leaking through

As said by someone else, the iteration order has never been standardized. It was dangerous for developers to base their code on an implementation detail, even though this one was at some point consistent across browsers. Standards aren't only for implementors. For the record also, a for..in loop looks into the prototype. If you have an object and shadow a prototype property, how is the order affected? For two objects o1 and o2 which share the same prototype, how are the prototype properties iterated? What happens when you play with prototype properties and object properties afterward?
Fixing for..in loops is a dead-end in my opinion.
It doesn't stop the discussion for own properties.
the majority of real-world applications will be slowed down because developers will need to re-implement the very commonly needed LinkedHashMap behavior in JavaScript
developers will waste a bunch of time doing this when the language could have provided it

I agree on 2 and 3.
- developers will waste a bunch of time upgrading old sites and applications

They shouldn't have relied on non-standard behavior in the first place. They saved time by not taking the time to learn the standard; that comes at a later cost.
JavaScript code will forever be more verbose than it needs to be wherever ordered maps are needed
JSON messages will forever be more verbose than they need to be, and result in larger JavaScript allocation and GC loads when eval()d/parsed

Apparently, JSON has been standardized as unordered. Maybe the right decision, if the order really matters to you, is to use another interchange format. Interchange formats are created with a purpose in mind, with use cases. Maintaining an order comes with an in-memory cost and a cost for the serialization implementation. JSON was certainly designed not to have this cost. Is Douglas Crockford around to confirm?
From www.ietf.org/rfc/rfc4627.txt :
An object is an unordered collection of zero or more name/value pairs, where a name is a string and a value is a string, number, boolean, null, object, or array.
On 11/03/2011 23:07, Charles Kendrick wrote:
On 3/11/2011 1:33 PM, David Bruant wrote:
On 11/03/2011 21:49, Charles Kendrick wrote:
Yes Allen, hence the urgency. If IE9 final ships that way, the "goose is cooked":

Let's face it right now: IE9 will ship that way. They're in the RC phase; it's completely unrealistic to consider that they would change the object implementation. From the spec point of view, there is no urgency, since Harmony won't ship for a couple of years, I think.
I disagree - if there is a clear consensus that ECMAScript will standardize on a particular iteration order, there's a strong chance IE9 will update to reflect this (so, on reflection, I should not have said the goose is cooked).
They are aggressively embracing standards across the board now, after all.
Have you reported an issue to Microsoft Connect (or their bug reporting platform, I do not remember the exact name)? For the record, IE9 will be officially released on March 14th. I still believe they won't change the implementation even if the entire TC-39 committee were knocking at the IE team's door in the next hour.
- we will have a new de facto standard iteration order for Object that does not match any known use case - it is purely an implementation detail leaking through

As said by someone else, the iteration order has never been standardized. It was dangerous for developers to base their code on an implementation detail, even though this one was at some point consistent across browsers. Standards aren't only for implementors.
Your perspective is common in a group like this - very spec and standard focused. Isn't it fun to bash those developers? Everyone's doing it... I hope you realize it's irrelevant, though?
I can't talk for everyone on the list, but I came to this list as a web developer. I am still a student, I am still learning. I wouldn't define myself as "very spec and standard focused". As I said in a previous e-mail, I would certainly have made the mistake myself. My intention is not to bash anyone. I read and try to understand standards because I understand that a standard is a contract between the implementor and the programmer. In an ideal world, implementors respect the spec (and they have quite well for ECMAScript) and programmers can understand the language from the spec. In the real world, a spec is a difficult document to read. This is why people write documentation with tutorials and examples. Have you checked your favorite documentation to make sure it warns that for..in loops enumerate in an implementation-dependent order and that this could cause bugs? If it's a wiki, have you edited it? If web developers don't want to take the time to understand the language, they're exposing themselves to bugs in the longer run.
These developers took a calculated risk at a time when standards were so vague and partial that they had to take similar risks everywhere.
The DOM is a mess. DOM-related standards and actual implementations were a huge mismatch. Different story for ECMAScript. As far as I know, ECMAScript 3 has been followed in an interoperable manner in web browsers, IE6 included. And I quote ECMAScript 2, released in August 1998 (www.ecma-international.org/publications/standards/Ecma-262-arch.htm), about the for-in loop: "The mechanics of enumerating the properties (...) is implementation dependent." It has been specified as implementation-dependent for almost thirteen years. "These developers" didn't take a "calculated risk". They saw it worked with the implementations at the time and hoped it would be so in the future. I do this too when in a rush. It doesn't prevent me from checking when the rush stops. Thirteen years leaves some decent time to check.
Checking MDN: * developer.mozilla.org/en/JavaScript/Reference/Statements/for...in#Description : "A for...in loop iterates over the properties of an object in an arbitrary order (see the delete operator for more on why one cannot depend on the seeming orderliness of iteration, at least in a cross-browser setting) ..."
- delete page (developer.mozilla.org/en/JavaScript/Reference/Operators/Special_Operators/delete_Operator#Cross-browser_issues): "Although ECMAScript makes iteration order of objects implementation-dependent, it may appear that all major browsers support an iteration order based on the earliest added property coming first (at least for properties not on the prototype). However, in the case of Internet Explorer, when one uses delete on a property, some confusing behavior results, preventing other browsers from using simple objects like object literals as ordered associative arrays. In Explorer, while the property value is indeed set to undefined, if one later adds back a property with the same name, the property will be iterated in its old position--not at the end of the iteration sequence as one might expect after having deleted the property and then added it back.
So if you want to simulate an ordered associative array in a cross-browser environment, you are forced to either use two separate arrays (one for the keys and the other for the values), or build an array of single-property objects, etc."
This part seems to have been added on January 12th, 2010, so it seems to concern IE6-8.
Back to the idea of a standard being a contract between implementors and developers, implementors have respected their part of the contract quite well with regard to ECMAScript (once again, different story with the DOM). Developers should try to worry about the "contract" too.
ECMAScript is a programming language. It was initially designed to help non-programmers write programs (semicolon insertion, silent failure, scripting language...). But it requires some involvement anyway if you want your code to be robust over time. If you write code that is supposed to last one browser generation, you can rely on implementation details. If you want to write robust code, you have to make sure you understand the spec and are not relying on implementation specificities.
If reminding people of this means bashing to you, then I am sorry, I am bashing developers. I may be wrong, but I think it's relevant to point out that it isn't the spec's fault if developers didn't follow it while implementations did.
On 3/11/2011 3:13 PM, David Bruant wrote:
Have you reported an issue to Microsoft Connect (or their bug reporting platform, I do not remember the exact name)?
I will report it the second that I can link to, say, an email from Brendan Eich replying to 2 other heavies saying yeah, we're agreed ES4 & Harmony should specify in-order iteration as the default strategy.
(I am not impugning your waistline Brendan).
I still believe they won't do a change to the implementation even if the entire TC-39 committee was knocking at the IE team door in the next hour.
Great idea David, I'll catch a flight if you will ;)
"These developers" didn't take a "calculated risk". They saw it worked with the implementations at the time and hoped it would be so in the future.
That is precisely the calculated risk they took. Many were aware that technically, the spec left it undefined, but again, saw the feature as so obviously valuable that there would never be a reason to go against a behavior all browsers agreed on.
And again, I agree with them - I think the language should be standardized in exactly the way they assumed it would.
If you could go back in time and explain to those developers that we now have browsers that are 10,000 times faster on hardware that's 10 times faster, but we're going to sacrifice in-order iteration in order to get a little bit more, I think they'd be stunned.
But it requires some involvment anyway if you want your code to be robust over time. If you write code that is supposed to last one browser generation, you can rely on implementation details. If you want to write robust code, you have to make sure you understand the spec and are not relying on implementation specificities.
On robustness, just to set perspective, today I have the luxury of the extra speed and memory to make my own LinkedHashMap in JavaScript, as I showed - at the time some of these decisions were made, there was not enough "breathing room" to do this. The applications would not have hit performance targets.
Think, in particular, of my second implementation in the context of the pre-IE7 garbage collector. Sure, it rips through a million keys in a couple of seconds on a modern Firefox browser, but it also trebles the GC load relative to an Object: that is "game over" on older IE.
If reminding people of this means bashing to you, then I am sorry, I am bashing developers. I may be wrong, but I think it's relevant to point out that it isn't the spec's fault if developers didn't follow it while implementations did.
Not only do I agree it is not the spec's fault, I'm not interested in assigning blame at all, which was really my point. Whoever we point to as "at fault", the language gets worse and a bunch of developers waste their time.
On 2011-03-11, at 18:40, Charles Kendrick wrote:
"These developers" didn't take a "calculated risk". They saw it worked with the implementations at the time and hoped it would be so in the future.
That is precisely the calculated risk they took.
FWIW, OpenLaszlo does not take that risk, because we want to work across many platforms. When we care about iteration order, we don't use Object. What we do use depends on the application: Usually when we care about iteration order, random access is not an issue, so we just use a "plist" (an array of alternating keys and values).
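For illustration, a minimal sketch of the plist pattern described above (hypothetical data; the point is that array order, not object enumeration order, carries the ordering):

    // keys and values alternate; iteration follows array order by definition
    var plist = ["first", 1, "second", 2, "third", 3];
    for (var i = 0; i < plist.length; i += 2) {
        var key = plist[i], value = plist[i + 1];
        // visit key/value pairs in insertion order
    }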
On 03/11/2011 02:07 PM, Charles Kendrick wrote:
Your perspective is common in a group like this - very spec and standard focused. Isn't it fun to bash those developers? Everyone's doing it.. I hope you realize it's irrelevant though?
Insinuating bad faith ("Isn't it fun" and "Everyone's doing it") may not be a successful strategy for swaying others to your point of view. Just saying.
And I tend to agree with David Bruant about the makeup of the list being lightly-involved developers, true standardistas, implementers, spec writers, and more: a fairly mixed bunch with lots of different motivations and goals, to which few generalized sentiments can fairly be attributed.
On 3/11/2011 3:58 PM, Jeff Walden wrote:
On 03/11/2011 02:07 PM, Charles Kendrick wrote:
Your perspective is common in a group like this - very spec and standard focused. Isn't it fun to bash those developers? Everyone's doing it.. I hope you realize it's irrelevant though?
Insinuating bad faith ("Isn't it fun" and "Everyone's doing it") may not be a successful strategy for swaying others to your point of view. Just saying.
My apologies. It is frustrating to open a discussion with a proposal of what is best, and have it be redirected repeatedly into a discussion of who is to blame.
And I tend to agree with David Bruant about the makeup of the list being lightly-involved developers, true standardistas, implementers, spec writers, and more: a fairly mixed bunch with lots of different motivations and goals, to which few generalized sentiments can fairly be attributed.
You're certainly correct that it's a mixed bunch of roles, but that's within the very narrow subset of developers who know about the standards process and care enough to be here.
However I would never attribute a general sentiment to any group, that's almost always wrong. I said a certain perspective would be more common. It's been my (repeated) personal experience that you get very different responses on a list devoted to standards than from (say) an enterprise development shop. Easily verified.
I would just like to give my 2 cents.
I would like some sort of iteration ordering, but not on Objects. The reasons have been stated before: ordered iteration is less optimal for performance than random access (or semi-random). However, it has its uses; perhaps it would be more suitable to put this in proxies, as people have suggested, which in some ways is true. Lexicographical and declaration orderings are quite different but quite common, and even then there could be a more obscure ordering that is left out. This topic, I feel, is a complex one that has implementation implications if we provide it on all objects, as well as the added complexity of runtime mutation of properties needing consideration (it would be odd for object literals to be treated differently from other objects). There is also the possible reliance of people on object indexes which have no real ordering:
var i = 0;
for (var k in obj) {
    switch (i) {
        // stuff based upon i instead of k to determine your actions,
        // even though relying on k as a context when producing the object
    }
    i++;
}
This is a confusing but possible example of independence between properties and position that would be awkward. If we do go with ordering, this is a possible example of misuse (it is a simple case, but it would be throwing around junk strings of k without really wanting to know what k actually says).
For now I would stick with arrays for ordering as ordering multiple objects together implies a relationship which should be encapsulated (because they are related somehow rather than by implicit knowledge of how ordering is going to line them up) in another object rather than symbol table. And I hope we get proxy support everywhere so we can do interesting for ... in loops soon.
I am going to side with voting no on this. The idea of ordered enumeration is a great one, but the reasons to put it on the enumeration of objects are less than stellar. Most of the reasons appear to be due to legacy systems or possible memory savings, but I am not convinced you would save memory/CPU, due to the need to track/sort this order of properties whenever a for..in loop is performed (assuming we can do it lazily). In reality I see it as a native-code implementation of what is being done with libraries today when they do need to keep track of multiple fields at the same time; but a reliance on this would mean that instead of only doing it as needed, the native implementation would perform this work every single time, slowing down the setting of properties as well as all for..in loops.
On Mar 11, 2011, at 12:11 PM, Charles Kendrick wrote:
And again, I find your notion that an Object is "obviously" a HashMap very suspect.
John's simply mistaken :-).
Take it from me, JS objects are not hashmaps and any would-be implementor who tries that and tests on real code is quickly disabused of the notion. It's not going to change, for named properties or any kind of property name.
The fact that many optimizing VMs bend the rules for array elements, if not for all indexed properties, suggests but of course does not prove that insertion order is a de-facto requirement for non-indexed "own" properties.
More on this when I've caught up on the thread, but I wanted to back your contention, not merely that it's not obvious that JS objects are hashmaps, but that they are not any such simple thing. True fact!
On Mar 11, 2011, at 12:49 PM, Charles Kendrick wrote:
Yes Allen, hence the urgency. If IE9 final ships that way, the "goose is cooked":
I hear tell of something happening next Monday. Goose, well-done, stuffed, I think.
- we will have a new de facto standard iteration order for Object that does not match any known use case - it is purely an implementation detail leaking through
IE9 seems to match the strawman:enumeration strawman.
- the majority of real-world applications will be slowed down because developers will need to re-implement the very commonly needed LinkedHashMap behavior in JavaScript
Where is this LinkedHashMap thing from? Oh, Java.
Again, IE9 is late to the party in breaking insertion order for indexed properties.
- developers will waste a bunch of time doing this when the language could have provided it
Can you cite some open source JS libraries or apps that would need to change?
I believe it is very very important that the ECMAScript standard specify that when a new Object is created, for..in iteration traverses properties in the order they are added, regardless of whether the properties are numeric or not.
Some users might prefer 'in the order of keys'. That is predictable, and allows for efficient implementation (both sparse and dense).
Are you suggesting changing the enumeration order which is currently implemented as "iteration order matches insertion order" under the careful set of conditions Allen enumerated earlier in this thread? (e.g. no enumerable props on prototype).
Normally, I don't like to object to features that group A finds useful and group B is happy to implement, just because I wouldn't use them myself. Let me explain why I jumped in anyway:
Current situation: the standard not only doesn't specify any ordering, it explicitly specifies that no ordering is to be relied on. So, I don't rely on any ordering, and those who do are mostly happy because most implementers happen to support an ordering.
Unfortunately, that isn't the happy situation it seems to be - the first implementers may have broken rank already, others have said that they just haven't got round to investigating the potential gains.
Proposed situation: the standard specifies the current insertion order. As has been pointed out, current practice has several exceptions, so one would have to (a) specify a simple insertion order and force implementers to go against the direction they like to go or (b) specify the current least common denominator and add to the complexity of the specification.
From the discussion here, I'm not even convinced that everyone asking for an insertion ordering is thinking about the same thing. If all participants were happy with http://wiki.ecmascript.org/doku.php?id=strawman:enumeration (is it just me, or does this page have a character problem with tabs in the table?), then we wouldn't have this thread, right?
As a user, I still don't have an immediate problem if I ignore that ordering, but as a toolsmith, I now have to support this ordering; when reasoning about someone else's code, I have to think about this ordering and whether the author relied on it; and since supporting the spec will become another little bit more complicated, there will be fewer tools attempting to do so, a shortcoming which will bite me as a user, too.
So, ultimately, I find it acceptable that the ordering is not specified, but I doubt that this is a stable situation for coders or implementers. If the ordering is to be specified, then yes, I would prefer a simple ordering that doesn't impede implementers, instead of an ordering that looks simple ("just as written"), but becomes complicated when written down in full, with all "ifs" and "buts".
'In the order of keys' might be such a simple ordering, and has the additional advantage that 'order' could be an overridable operation, just like a sort parameter: alphabetical for string keys in Object, numerical for number keys in array, by insertion record for some new type of object.
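For concreteness, a sketch of how one can already opt into 'in the order of keys' today, independent of engine enumeration order (assuming plain string sort order is the order wanted; pass a comparator to sort for anything else):

    var keys = Object.keys(o).sort();
    for (var i = 0; i < keys.length; i++) {
        var k = keys[i];
        // visit o[k] in key order, whatever the engine's for-in order is
    }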
If so - the thought of this makes me pretty nervous. There is undoubtedly significant amounts of code in the wild on the web which depend on the current enumeration order, and changing this would increase the size of the harmony-uptake tax (code audits and refactors).
Specification complexity makes me even more nervous than code that relies on others to clean up behind it.
It might not be immediately obvious, given that Ecmascript's spec is so complicated already, but once the spec gets under a certain barrier of complication (both in what it specifies and how it specifies it), tools get much easier to implement. Some automatic transformations become provably correct, and even if that goal is still far off, I would expect tools that highlight problematic code automatically (leaving confirmation/fix to coders) to be viable, iff the objects that rely on insertion ordering are known (by comments/annotation/subclassing).
In theory, I would hope any code that goes against the spec to be accompanied by assertions/regression tests that fail in useful ways when browsers change order. Then one could write tools that flag enumerations that might be the cause of trouble. In practice, that is probably too much to hope.
In practice, one might want to start flagging enumerations that rely on ordering: whenever you find such, wrap the object in an identity function, just so that the code is easy to find when or if it needs fixing:
for (var i in obj) { .. } ==> for (var i in inorder(obj)) { .. }
// NOTE: caller relies on insertion ordering of obj keys!
// (see test #42 and documentation)
function inorder (obj) { return obj; }
It isn't so much a tax, and not for Harmony, but more of a debt, right now, taken in the hope of a future spec waiving the payback.
That debt might be manageable. Whereas the tax of further complicating the spec or the language described by it would not.
Claus
Below are two partial implementations of LinkedHashMap in JavaScript, with test code to populate with lots of keys and remove half the keys at random, then alert() the results. Both implementations add and maintain an index to provide O(1) lookups.
Each is compared to the speed of the equivalent use of Object assuming it supports ordering, which is just to use the JavaScript "delete" operator to remove keys.
The first is a straightforward implementation where insert and removal use splice() to modify an Array. This is what most developers would probably implement.
I'd hope not: indexOf on a large Array?
This is approximately 800-1000 times slower than a native Object.
The second leverages native speed better by using two Objects to store the forward and reverse pointers of a classic linked list.
I'll stipulate that everyone on this list would arrive at this implementation, but in my opinion a vanishingly small number of JavaScript programmers at large would figure it out. Even so, it's 8-10x slower than native.
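For readers who haven't seen the pattern, a minimal sketch of the two-Objects-as-linked-list approach described above (a hypothetical helper, not the code from this thread; it assumes keys don't collide with Object.prototype property names in the link lookups):

    // values holds the data; next/prev map each key to its neighbours,
    // giving O(1) insert/remove and iteration in insertion order.
    function LinkedMap() {
        this.values = {};
        this.next = {};
        this.prev = {};
        this.first = this.last = null;
    }
    LinkedMap.prototype.put = function (key, value) {
        if (!Object.prototype.hasOwnProperty.call(this.values, key)) {
            if (this.last === null) this.first = key;
            else { this.next[this.last] = key; this.prev[key] = this.last; }
            this.last = key;
        }
        this.values[key] = value;
    };
    LinkedMap.prototype.remove = function (key) {
        if (!Object.prototype.hasOwnProperty.call(this.values, key)) return;
        var p = this.prev[key], n = this.next[key];
        if (p === undefined) this.first = (n === undefined ? null : n);
        else this.next[p] = n;
        if (n === undefined) this.last = (p === undefined ? null : p);
        else this.prev[n] = p;
        delete this.values[key]; delete this.next[key]; delete this.prev[key];
    };
    LinkedMap.prototype.each = function (fn) {
        for (var k = this.first; k != null; k = this.next[k]) fn(k, this.values[k]);
    };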
I notice that you don't count initialization for the native Object variant. Apart from fixing that, would the following variant help (run, but not tested;-)? It keeps the previous/next key in the key's internal value, to reduce overhead.
Claus
// another partial implementation
(function () {

    var sourceDataArr = [];
    var keysToRemove = [];
    for (var i = 0; i < 30000; i++) {
        sourceDataArr[sourceDataArr.length] = "key" + i;
        sourceDataArr[sourceDataArr.length] = "value" + i;
        if (Math.random(1) > 0.5) {
            keysToRemove[keysToRemove.length] = "key" + i;
        }
    }

    var orderedMap = {
        init : function (data) {
            this.index = {};
            for (var i = 0; i < data.length; i++) {
                // [prev, value, next]
                this.index[data[i]] = [data[i-2], data[i+1], data[i+2]];
                i++;
            }
            this.firstKey = data[0];
            this.lastKey = data[data.length - 2];
        },
        get : function (key) {
            return this.index[key][1];
        },
        remove : function (key) {
            var which = this.index[key];
            if (which[0]) // is there a previous?
                this.index[which[0]][2] = which[2];
            else // key was first
                this.firstKey = which[2];
            if (which[2]) // is there a next?
                this.index[which[2]][0] = which[0];
            else // key was last
                this.lastKey = which[0];
            delete this.index[key];
        }
    };

    var start = new Date().getTime();
    var sourceDataObj = {};
    for (var i = 0; i < sourceDataArr.length; i++) {
        var key = sourceDataArr[i], value = sourceDataArr[i + 1];
        sourceDataObj[key] = value;
        i++;
    }
    for (var i = 0; i < keysToRemove.length; i++) {
        delete sourceDataObj[keysToRemove[i]];
    }
    alert("object: " + (new Date().getTime() - start));

    var start = new Date().getTime();
    orderedMap.init(sourceDataArr);
    for (var i = 0; i < keysToRemove.length; i++) {
        orderedMap.remove(keysToRemove[i]);
    }
    alert("orderedMap: " + (new Date().getTime() - start));

}());
On 12/03/2011 09:06, Brendan Eich wrote:
On Mar 11, 2011, at 12:49 PM, Charles Kendrick wrote:
Yes Allen, hence the urgency. If IE9 final ships that way, the "goose is cooked":
I hear tell of something happening next Monday. Goose, well-done, stuffed, I think.
- we will have a new de facto standard iteration order for Object that does not match any known use case - it is purely an implementation detail leaking through
IE9 seems to match the strawman:enumeration strawman.
- the majority of real-world applications will be slowed down because developers will need to re-implement the very commonly needed LinkedHashMap behavior in JavaScript
Where is this LinkedHashMap thing from? Oh, Java.
Again, IE9 is late to the party in breaking insertion order for indexed properties.
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included).
In ECMAScript 5 (+ # notation), if I want all index properties of an object o in ascending order, I can do something along the lines of:

Object.getPropertyNames(o)
    .filter(#(e){ e.match(/^\d+$/) })                 // remove non-number strings
    .sort(#(a,b){ Number(a) > Number(b) ? 1 : -1 });  // sort in ascending numeric order

My point is that I do not need the ES engine to provide this. I can get it myself. However, only the engine can know the overall insertion order (if I have inserted the "2" property before the "1", for instance).
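For readers without the # strawman syntax, a plain ES5 approximation of the same idea (Object.getOwnPropertyNames only covers own properties, and the regex below matches canonical array-index-like strings):

    Object.getOwnPropertyNames(o)
        .filter(function (e) { return /^(0|[1-9]\d*)$/.test(e); })  // keep index-like keys
        .sort(function (a, b) { return Number(a) - Number(b); });   // ascending numeric order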
What is the rationale or use case behind having index properties first for objects and then the rest of the properties?
On Mar 12, 2011, at 9:54 AM, David Bruant wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included).
This is an issue in theory. Beware a priori reasoning about usability issues.
In practice both users and (especially) implementors Do Not Want indexed properties enumerated in insertion order. The proof is the use of for-in over arrays, still common enough, also advised against (but that is like talking back to the tide).
What is the rationale or use case behind having index properties first for objects and then the rest of the properties?
The "rationale" (such as it is) is that JS conflates lists and dicts in objects, but users mostly think about one or the other. When combining indexed and named properties, many users still want for-in to work sensibly and that means the list properties first in index order, the dict properties after in insertion order.
On Mar 12, 2011, at 10:41 AM, Brendan Eich wrote:
What is the rationale or use case behind having index properties first for objects and then the rest of the properties?
The "rationale" (such as it is) is that JS conflates lists and dicts in objects, but users mostly think about one or the other. When combining indexed and named properties, many users still want for-in to work sensibly and that means the list properties first in index order, the dict properties after in insertion order.
In theory it would be equally reasonable to enumerate the index properties (in order) after all the non-index properties. In practice, implementations that have gone down this road have placed the index properties first.
On Mar 12, 2011, at 10:41 AM, Brendan Eich <brendan at mozilla.com> wrote:
On Mar 12, 2011, at 9:54 AM, David Bruant wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included).
This is an issue in theory. Beware a priori reasoning about usability issues.
In practice both users and (especially) implementors Do Not Want indexed properties enumerated in insertion order. The proof is the use of for-in over arrays, still common enough, also advised against (but that is like talking back to the tide).
I don't understand why this is still being discussed as a single behavior across Array and Object.
If we define the iteration order as:
- Object: in-order including indices
- Array: indices first
Then:
- There's no information loss going from Object literal to live Object
- Array has the for..in behavior people expect
- Object has a useful behavior (similar to Java LinkedHashMap) instead of a surprising behavior (treating indices specially)
This is aside from the performance and backcompat benefits covered previously.
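A side-by-side sketch of what this proposal would mean (hypothetical behavior under the proposal, not what any current engine guarantees):

    var obj = { b: 1, 2: "two", a: 3 };
    for (var k in obj) { /* "b", "2", "a" - pure insertion order on Object */ }

    var arr = [];
    arr[1] = "x"; arr.name = "y"; arr[0] = "z";
    for (var k in arr) { /* "0", "1", "name" - indices first on Array */ }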
You mentioned JS conflates Arrays and Objects - so - let's stop doing that :)
Bradley, the proposal is to define the iteration order for Object only, not all objects (eg Array).
Also, if you were choosing between:

- The strawman: preserve insertion order for non-index properties only, on both Object and Array
- My proposal: preserve insertion order for all properties on Object only, leave other objects undefined

Then what would you choose? Because there seems to be existing momentum to define an order of iteration, so it may be that what you seem to prefer (leave it totally undefined) is not a possibility.
On 12/03/2011 19:41, Brendan Eich wrote:
On Mar 12, 2011, at 9:54 AM, David Bruant wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included). This is an issue in theory. Beware a priori reasoning about usability issues.
I fully agree. I don't have myself a decent use case for that. My point was just that if we standardize the order as in the strawman, we will have lost forever the possibility to retrieve overall property order...
In practice both users and (especially) implementors Do Not Want indexed properties enumerated in insertion order. ... but if especially implementors aren't willing to implement keeping overall property order, then I understand that losing such a potential can be considered at the spec level.
The proof is the use of for-in over arrays, still common enough, also advised against (but that is like talking back to the tide).
What is the rationale or use case behind having index properties first for objects and then the rest of the properties?

The "rationale" (such as it is) is that JS conflates lists and dicts in objects, but users mostly think about one or the other. When combining indexed and named properties, many users still want for-in to work sensibly, and that means the list properties first in index order, the dict properties after in insertion order.
I understand the practice of users. However, this practice started to be forged at a time when the only cross-browser way to iterate over own enumerable properties was for..in + hasOwnProperty. This is a different story now (ES5) that we have all the object introspection methods (except in Opera, according to Kangax's compat-table) and the Array extras.
I would also like to point out that the for..in enumeration order is used in different places (for instance the Object.defineProperties method, which also means Object.create (second argument)). Losing the overall order in for..in loops also means losing it for these methods. It also means losing it for proxies: if I use Object.defineProperties(myProxy, props), then even with proxies I won't be able to perfectly track property insertion order; I'll have some dependency on the engine's for..in order. So I was actually wrong when I said earlier in the thread that proxies could be used to implement OrderedObject.
If users mostly think of JS objects as lists or dicts, would it make sense to provide methods such as Object.numericProperties(o) and Object.dictionaryProperties(o) (sorry, not very inspired names)?
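A rough ES5 sketch of what such helpers could look like (the names come from the suggestion above, not a real API; "numeric" is approximated here as a canonical unsigned-integer string, close to the spec's array-index notion):

    function numericProperties(o) {
        return Object.keys(o).filter(function (k) { return /^(0|[1-9]\d*)$/.test(k); });
    }
    function dictionaryProperties(o) {
        return Object.keys(o).filter(function (k) { return !/^(0|[1-9]\d*)$/.test(k); });
    }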
On Mar 12, 2011, at 11:58 AM, David Bruant wrote:
On 12/03/2011 19:41, Brendan Eich wrote:
On Mar 12, 2011, at 9:54 AM, David Bruant wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included). This is an issue in theory. Beware a priori reasoning about usability issues. I fully agree. I don't have myself a decent use case for that. My point was just that if we standardize the order as in the strawman, we will have lost forever the possibility to retrieve overall property order...
No you haven't. There is nothing stopping somebody in the future from proposing: Object.keyInCreationOrder(obj)
Whether you will be able to convince anyone to accept and implement such a proposal is a different matter.
I also would like to remind that the for..in enumeration order is used in different places (for instance the Object.defineProperties methods, which also means Object.create (second argument))...
That's a good point. I'm not sure it is a significant issue but it is one that should be considered.
On 12/03/2011 20:58, David Bruant wrote:
On 12/03/2011 19:41, Brendan Eich wrote:
On Mar 12, 2011, at 9:54 AM, David Bruant wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included). This is an issue in theory. Beware a priori reasoning about usability issues. I fully agree. I don't have myself a decent use case for that. (...)
I have thought of a use case: a music album in which one title is a number. I lose the opportunity to store the album as a dictionary indexed on titles and sorted in the order I inserted the keys. I would be forced to use an array. This is where the notion of dictionary finds some limitation.
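A tiny illustration of the point (hypothetical tracklist; "1901" is one of the real numeric titles listed below):

    var tracks = { "Lisztomania": 1, "1901": 2, "Fences": 3 };
    for (var title in tracks) { /* ... */ }
    // insertion-order engines: "Lisztomania", "1901", "Fences"
    // index-first engines:     "1901", "Lisztomania", "Fences" - the numeric title jumps ahead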
David, bad faith indeed :-)
PS: And yes, there are such albums: Moby - 18 - 18; Moby - Ambient - 80; Moby - Play - 7 (Moby's good at this game); Phoenix - Wolfgang Amadeus Phoenix - 1901; Coldplay - Viva la Vida or Death and All His Friends - 42; Lily Allen - It's Not Me, It's You - 22; ...
On 12/03/2011 21:24, Allen Wirfs-Brock wrote:
On Mar 12, 2011, at 11:58 AM, David Bruant wrote:
On 12/03/2011 19:41, Brendan Eich wrote:
On Mar 12, 2011, at 9:54 AM, David Bruant wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included). This is an issue in theory. Beware a priori reasoning about usability issues. I fully agree. I don't have myself a decent use case for that. My point was just that if we standardize the order as in the strawman, we will have lost forever the possibility to retrieve overall property order... No you haven't. There is nothing stopping somebody in the future from proposing: Object.keyInCreationOrder(obj)
I spoke too quickly indeed. This wouldn't, however, solve the problem of a proxy not being able to reliably implement OrderedObject, because of the dependency Object.defineProperties has on the internal iteration order.
On Mar 12, 2011, at 11:27 AM, Charles Kendrick wrote:
On Mar 12, 2011, at 10:41 AM, Brendan Eich <brendan at mozilla.com> wrote:
On Mar 12, 2011, at 9:54 AM, David Bruant wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included).
This is an issue in theory. Beware a priori reasoning about usability issues.
In practice both users and (especially) implementors Do Not Want indexed properties enumerated in insertion order. The proof is the use of for-in over arrays, still common enough, also advised against (but that is like talking back to the tide).
I don't understand why this is still being discussed as a single behavior across Array and Object.
It's an attempt to further unify, or if you prefer, conflate, arrays and objects.
It may be a mistake. The enumeration strawman is young, and we can revise it. In particular I think David's point about a record catalog wanting insertion order even when property names happen to look like indexes is a good one.
If we define the iteration order as:
- Object: in-order including indices
- Array: indices first
Then:
- There's no information loss going from Object literal to live Object
- Array has the for..in behavior people expect
These are good points, especially in light of David Bruant's observation about enumeration order mattering more in the new ES5 meta-object APIs.
- Object has a useful behavior (similar to Java LinkedHashMap) instead of a surprising behavior (treating indices specially)
The argument from Java leaves many cold. Best to focus on backward compatibility and consistency in one dimension (not with Array; rather, consistent insertion order, no matter the property's name).
This is aside from the performance and backcompat benefits covered previously.
The performance benefits are secondary, but you're ignoring the benefits of the strawman for those who (for whatever reason) use objects as if they were (dense, optimizable) arrays. This happens, you've seen some of the references. Perhaps these people should rewrite their code, but now we are in a perverse game: Chrome and possibly other browsers (I haven't tested IE9) optimize their code better than other browsers.
The backward compatibility point seems very strong to me, and I gather Chrome is feeling some heat still. We'll see how IE9 fares.
You mentioned JS conflates Arrays and Objects - so - let's stop doing that :)
If only that genie could be put back in the bottle.
It would be helpful if V8 and Chakra people spoke to the question of preserving indexed property insertion order on non-Array objects.
On 03/12/2011 12:02 AM, Brendan Eich wrote:
Take it from me, JS objects are not hashmaps and any would-be implementor who tries that and tests on real code is quickly disabused of the notion. It's not going to change, for named properties or any kind of property name.
This is true.
It is also the view of people who are significantly closer to implementations than most web developers are. HashMap is still probably the better abstraction for most people's purposes. It's not best for all, certainly, as some people either reverse-engineer the enumeration order or find one of the places that happens to document it. (MDN documents the behavior as an implementation extension in at least one place, if memory serves. The dime-a-dozen DHTML-espousing site from which I originally learned JS didn't document it.) Yet a substantial number of people never learn of the property-ordering behaviors in web browsers. So while HashMap is far from what web-quality implementations do, it is generally (there are certainly exceptions) not far from how web developers use objects.
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included). .. Music album in which one title is a number. I lose the opportunity to store the album as a dictionary indexed on titles and sorted the order I have inserted the keys in. I would be forced to use an array. This is where the notion of dictionary finds some limitation.
Please note that this use case highlights the hijacking of numeric Strings as indices, not the lack of overall property addition order including indices.
A spec workaround would be to stop converting numeric keys to Strings, ie, 1 and '1' would be different keys. Then number keys could behave as Array indices, while String keys would behave as other properties. This would avoid the gaps in the String keys highlighted by your use case, but you would still not get a full record of insertion order. Doing that might make insertion ordering slightly more palatable, though.
Btw, if you really need to organize your music now, and don't feel like using a proper LinkedHashMap, you could prefix all your keys with ':' or something similarly non-numeric;-) That would avoid the auto-conversion/index ordering, at the price of messing up your access code.
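A sketch of that prefixing workaround (the ':' is arbitrary; any reliably non-numeric prefix works, at the cost of wrapping every access):

    var album = {};
    function put(title, track) { album[":" + title] = track; }
    function get(title) { return album[":" + title]; }
    put("1901", 1);        // stored as ":1901" - a plain named property,
    put("Lisztomania", 2); // so enumeration keeps insertion order
    for (var k in album) { var title = k.slice(1); /* ... */ }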
Claus
On 13/03/2011 10:57, Claus Reinke wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included). .. Music album in which one title is a number. I lose the opportunity to store the album as a dictionary indexed on titles and sorted the order I have inserted the keys in. I would be forced to use an array. This is where the notion of dictionary finds some limitation.
Please note that this use case highlights the hijacking of numeric Strings as indices, not the lack of overall property addition order including indices.
A spec workaround would be to stop converting numeric keys to Strings, ie, 1 and '1' would be different keys. Then number keys could behave as Array indices, while String keys would behave as other properties. This would avoid the gaps in the String keys highlighted by your use case, but you would still not get a full record of insertion order. Doing that might make insertion ordering slightly more palatable, though.
Interesting idea, but I think it would be too big of a break from ES5, ES in general, and JS in reality.
o = {};
o['1'] = 16;
Object.getOwnPropertyNames(o); // ['1']
o[1] = 12;
Object.getOwnPropertyNames(o); // ['1'] and not ['1', 1]
You would also break the invariant o[1] === o['1'], which may be used a lot when people retrieve numbers from <input> fields. In my opinion, property-name string conversion seems to be too deeply anchored in the language and its usages to be questioned now.
On Sun, Mar 13, 2011 at 5:57 AM, Claus Reinke <claus.reinke at talk21.com> wrote:
Please note that this use case highlights the hijacking of numeric Strings as indices, not the lack of overall property addition order including indices.
A spec workaround would be to stop converting numeric keys to Strings, ie, 1 and '1' would be different keys. Then number keys could behave as Array indices, while String keys would behave as other properties. This would avoid the gaps in the String keys highlighted by your use case, but you would still not get a full record of insertion order. Doing that might make insertion ordering slightly more palatable, though.
Btw, if you really need to organize your music now, and don't feel like using a proper LinkedHashMap, you could prefix all your keys with ':' or something similarly non-numeric;-) That would avoid the auto-conversion/index ordering, at the price of messing up your access code.
If you are trying to use an object to store arbitrary values as a hash map, you already have to do something like this -- otherwise you run into problems trying to store values like "prototype", "__proto__", "watch", etc. (and the list of dangerous values varies by browser).
If you "know" your data can't conflict, then of course you can use it directly, but then you are likely to have subtle bugs when your assumption turns out to be wrong.
A couple of quick notes:

- I just updated strawman:enumeration to cite this thread by linking to the head message from Charles. Good feedback here; we on TC39 are processing it and we'll talk about it at the meeting in two weeks.
- Dave Herman just wrote up strawman:dicts, which he proposed to me last week in reaction to the same old chestnut of an issue that you raise here: the pollution of objects-as-dictionaries by prototype-delegated property names, and, depending on the implementation, a few magic built-ins.

Comments welcome on dicts, I'm sure.
On Mar 13, 2011, at 4:57 AM, Claus Reinke wrote:
The little "issue" I see in returning 1) index properties in ascending order 2) all other properties in addition order is that there is a bit of information lost in the process: overall property addition order (index properties included). .. Music album in which one title is a number. I lose the opportunity to store the album as a dictionary indexed on titles and sorted the order I have inserted the keys in. I would be forced to use an array. This is where the notion of dictionary finds some limitation.
Please note that this use case highlights the hijacking of numeric Strings as indices, not the lack of overall property addition order including indices.
That is one point of view. Unfortunately, the horses have left the barn and are on different continents, or planets, now.
It's likely some developers take the point of view that non-array, or even any-object, enumeration should enumerate indexed properties in index order, the rest in insertion order. While other developers, and I can believe with the weight of history on their side more developers, expect insertion order as a rule at least for non-arrays.
I wrote recently in a private reply on this topic:
"... for-in is too "DWIM" and fuzzy, and often developers don't know exactly what they mean. Determining "most used intersection semantics" (which might be one mode; there could be other important but less-used modes) is hard.
That's why in paren-free I proposed not making subtle runtime changes to the meaning of for (x in o) in Harmony, rather banning that syntax. It won't pay to inflict too much runtime semantic innovation on the same old syntax, especially where the old syntax and semantics are misunderstood and even variable across implementations.
Instead, the idea is to use standard iterators to say what you mean, as in Python. The default for x in o (paren-free) still should do enumeration on objects, so we need to pin down enumeration. As David noted, it is used internally in ES5 in a number of places, even though underspecified horribly."
A spec workaround would be to stop converting numeric keys to Strings, ie, 1 and '1' would be different keys.
This is how JS worked as I implemented it in 1995 at Netscape, and on until some point in 1997 or 1998 (in any event, before the creation of cvs.mozilla.org from Netscape's internal CVS repo): indexes that fit in a certain tagged-int representation were reflected as number-type (in the typeof sense) keys.
Obviously this was too implementation-dependent, but also better for efficiency and often enough, for users who wanted to avoid string-type indexes.
Again, I think trying to change for-in's runtime semantics too much, without breaking its syntax is a mistake. Even indexed-first-in-index-order may be a bridge too far, as Charles and others argue and as the heated discussion in the V8 issue indicate.
With paren-free for-in, users can select from standard iterators to visit keys, values, items (key-value pairs), indexes (of numeric type), etc.
The question of the default meaning of for x in o {...} in a paren-free Harmony remains, and as noted above (David Bruant's point), the spec uses enumeration internally. So we need to pin something down.
My inclination right now, frankly, is to avoid compatibility hassles and runtime-testing migration taxes by codifying what most engines across the current in-the-field and latest-release (and bleeding-edge) browsers do, and not make indexed-first-in-index-order universal for all objects.
Who needs the migration hassle? If paren-free syntax becomes Harmonious, there will be plenty of sugar to make up for default-enumeration being the same old salt.
I do agree with your point that "proper coding" means prefixing string keys to avoid indexes being enumerated in other than insertion order, but again: reality with JS is messy; programmers don't all code "properly"; many eyes will glaze over at your "LinkedHashMap" Java talk. ;-)
Brendan, Bradley and others - there's no need to search for a relatively uncommon use case for ordered maps indexed by keys - a far more common use case is enumerated data stored numerically, eg, marital status:
{ 2: "Married", 1: "Single", 5: "Divorced" }
Likewise maps from numeric primary key values to display values for the records.
So this use case comes up just about every time JavaScript is manipulating data coming from a SQL database - extremely common.
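To make the stakes concrete, a sketch of the dropdown-population pattern implied above (illustrative; index-first is the behavior at issue):

    var maritalStatus = { 2: "Married", 1: "Single", 5: "Divorced" };
    var labels = [];
    for (var code in maritalStatus) labels.push(maritalStatus[code]);
    // insertion-order engines: ["Married", "Single", "Divorced"] - the authored order
    // index-first engines:     ["Single", "Married", "Divorced"] - sorted by numeric key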
On Mar 12, 2011, at 3:47 PM, Brendan Eich <brendan at mozilla.com> wrote:
- Object has a useful behavior (similar to Java LinkedHashMap) instead of a surprising behavior (treating indices specially)
The argument from Java leaves many cold.
Argument from Java??? You wound me, sir! I would never argue from the viewpoint of such a verbose language and far prefer your JavaScript.
To rephrase my argument so that it does not appear to be rooted in Java: take any library of collection classes from any language: C, C++, Python, perl, Ruby, Forth, whatever: do you ever find a collection class that is written to preserve order of string keys but not numeric keys?
I don't know of one.
So we're looking at standardizing a behavior for Object that no one has ever felt the need to implement as a reusable component.
I would argue this means we're standardizing a very rare use case.
This is aside from the performance
Perhaps these people should rewrite their code,
To be fair, we're talking about a pretty tiny rewrite, in many cases a one-line change. I made the change to jQuery in about 5 lines, with no apparent functional difference in a medium-size app.

If someone can point out a benchmark involving jQuery where Chrome appears to win due to dense Arrays, I can try it out, verify a performance difference if it's real, and submit the patch. After all, it applies to current Firefox regardless of what is decided here.
but now we are in a perverse game: Chrome and possibly other browsers (I haven't tested IE9) optimize their code better than other browsers.
Glad to see the phrase "perverse game" here. I feel that the right behavior for the language is in danger of being sacrificed for an advantage in synthetic benchmarks that don't reflect the real-world use cases we should really be optimizing for.
On Mar 13, 2011, at 3:58 PM, Charles Kendrick wrote:
Brendan, Bradley and others - there's no need to search for a relatively uncommon use case for ordered maps indexed by keys - a far more common use case is enumerated data stored numerically, eg, marital status:
{ 2: "Married", 1: "Single", 5: "Divorced" }
It's still a synthetic example, and real web sites/apps would be more compelling and helpful to forge consensus.
Likewise maps from numeric primary key values to display values for the records.
So this use case comes up just about every time JavaScript is manipulating data coming from a SQL database - extremely common.
Too assert-y, sorry. I'm on your side at this point (see my reply to Claus, just sent). But evidence trumps assertions or suppositions, however plausible.
Argument from Java??? You wound me, sir! I would never argue from the viewpoint of such a verbose language and far prefer your JavaScript.
No offense. In my reply to Claus I said developers' eyes glaze over at LinkedHashMap. I'm not putting anyone down, on any side of this. JS's objects often "DWIM" and this is a strength as well as a weakness. It's hard to say it is purely a bug, IM(truly Humble, I hope, due to feeling developer pain these almost-16 years)O.
So we're looking at standardizing a behavior for Object that no one has ever felt the need to implement as a reusable component.
I would argue this means we're standardizing a very rare use case.
It may be rare; it's hard to measure frequency of use (static or, more importantly, dynamic use). This indexed-first-in-index-order idea has come from implementations, I think. The implementors may be reacting to user demands too, but the optimization wins carry weight and may be dominant.
This could indeed be the wrong way to evolve the spec.
On Fri, 11 Mar 2011 16:48:04 +0100, Charles Kendrick <charles at isomorphic.com> wrote:

Just connecting the dots - I addressed this in my first email on the subject. While it superficially sounds like a good compromise, I actually think it's the worst possibility: it requires browser vendors to implement limited order preservation, preventing deeper optimizations like sorted keys.
Implementors already have to do that. In V8, some objects have their properties backed by a hashmap. This just means that it needs an extra field in that hashmap (a 50% increase in size) to store the insertion index, and then extract and sort the keys by insertion index before iterating over them.
That also means that some ideas that might work well in one implementation, like:

    function hasKey(object) { for (var _ in object) return true; return false; }

(which I believe works quickly in SpiderMonkey) will work much worse in other implementations - by at least a factor of 10.
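A rough harness for measuring that claim (illustrative only; absolute numbers vary by engine and machine, and the 10x figure is Lasse's estimate):

    function hasKey(object) { for (var _ in object) return true; return false; }

    var big = {};
    for (var i = 0; i < 100000; i++) big["k" + i] = i;

    var start = new Date().getTime();
    for (var r = 0; r < 10000; r++) hasKey(big);
    alert("hasKey x 10000: " + (new Date().getTime() - start) + "ms");
    // An engine that enumerates lazily can return after the first key; one that
    // must extract and sort all keys first pays the full cost on every call.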
So yes, (some) objects are hashmaps ... plus some more.
/L 'firmly in the "they are not ordered and never were, damnit!" camp'
On Mar 14, 2011, at 5:09 AM, Lasse Reichstein wrote:
On Fri, 11 Mar 2011 16:48:04 +0100, Charles Kendrick <charles at isomorphic.com> wrote:
Just connecting the dots - I addressed this in my first email on the subject. While it superficially sounds like a good compromise, I actually think it's the worst possibility: it requires browser vendors to implement limited order preservation, preventing deeper optimizations like sorted keys.
Implementors already have to do that. In V8, some objects have their properties backed by a hashmap. This just means that it needs an extra field in that hashmap (a 50% increase in size) to store the insertion index, and then extract and sort the keys by insertion index before iterating over them.
How do you get a 50% figure? I've studied V8 and JavaScriptCore as well as implemented lots of SpiderMonkey. Properties (however shared) need id, attributes, and value or offset. That's at least three words on practical architectures. What am I missing?
That also means that some ideas that might work well in one implementation, like: function hasKey(object) { for(var _ in object) return true; return false; } (which I believe works quickly in SpiderMonkey) will work much worse in other implementations - by at least a factor of 10.
Can you show a micro-benchmark with some data?
So yes, (some) objects are hashmaps ... plus some more.
/L 'firmly in the "they are not ordered and never were, damnit!" camp'
But you just wrote a whole message about how objects have to expend space to preserve insertion order for enumeration! Saying you wish something weren't the case != saying it's not the case. :-P
Web developers find and exploit many de-facto standards. Enumeration order being insertion order for non-arrays at least, if not for all objects (arrays tend to be populated in index order), is old as the hills and web content depends on it, as far as I can tell. I'll do some experiments to try to get data on this.
On Mon, Mar 14, 2011 at 10:21 AM, Brendan Eich <brendan at mozilla.com> wrote:
Web developers find and exploit many de-facto standards. Enumeration order being insertion order for non-arrays at least, if not for all objects (arrays tend to be populated in index order), is old as the hills and web content depends on it, as far as I can tell. I'll do some experiments to try to get data on this.
Aside from the JSON example of populating a dropdown list given (which I will agree is a real if contrived use case), there has been a lot of talk of "thousands of web developers" depending on preserving insertion order, but not one concrete example -- do you have one?
Aside from the JSON example of populating a dropdown list given (which I will agree is a real if contrived use case), there has been a lot of talk of "thousands of web developers" depending on preserving insertion order, but not one concrete example -- do you have one?
Two examples I've seen recently in projects, both relying primarily on the for-in iteration order of an object:

- exporting objects (like JSON, etc.) to log files (server-side JavaScript), needing a reliable order for the keys printed to the log file output, like the "datetime" field first, etc. A variation on this is using JSON.stringify(obj) and wanting the JSON output to have a reliable order, also for log files.
- using an object literal as a UI/form "configuration" where each field of the object represents a form element in a form-builder UI. If the iteration order of the object differs across engines/browsers, the UI ends up being displayed in different orders.
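For what it's worth, the log-file case can get a deterministic order today without relying on engine enumeration, via ES5's array-form replacer for JSON.stringify (a sketch; the key list fixes both the set of keys and their output order, and it applies at every nesting level, so it suits flat log records):

    function stableStringify(record) {
        return JSON.stringify(record, Object.keys(record).sort());
    }
    stableStringify({ msg: "login", datetime: "2011-03-14" });
    // => '{"datetime":"2011-03-14","msg":"login"}' - alphabetical, engine-independent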
A spec workaround would be to stop converting numeric keys to Strings, ie, 1 and '1' would be different keys. Then number keys could behave as Array indices, while String keys would behave as other properties.
Interesting idea, but I think it would be to big of a break from ES5, ES in general and JS in reality.
o = {};
o['1'] = 16;
Object.getOwnPropertyNames(o); // ['1']
o[1] = 12;
Object.getOwnPropertyNames(o); // ['1'] and not ['1', 1]
Well, 1 wouldn't be a "name", it would be an "index", right? So there'd be an Object.getOwnPropertyIndices(o), returning [1]. And there'd be the Array interface with further index operations.
You would also break the invariant o[1] === o['1'] which may be used a lot when people retrieve numbers from <input> fields. In my opinion, property name string conversion seems to be too deeply anchored in the language and usages to be questioned now.
Breaking that invariant would be the core of the change. It wouldn't be a minor change (one would need to make decisions and work out the consequences), and I mainly raised the idea to provide a different viewpoint.
Often in language design, thinking through a non-standard idea can help to elucidate other aspects of the problem, even if the idea itself might turn out to be unimplementable. And occasionally, an unlikely idea turns out to lead to a consistent and desirable changeset.
Some aspects I'd like to highlight:

1. Currently, every Object has "numeric" indices (via conversion), so Array is really just a mixin that provides more operations for working with those indices (using indices without Array is arguments-is-not-an-Array all over again).

2. If the enumeration strawman gets accepted (and in current practice anyway), those numeric indices get stolen from the property names (without the enumeration spec, the theft isn't noticeable according to ES5, at least not so directly?).

3. Separating numeric indices from property names would acknowledge 1 while lessening the impact of the enumeration strawman, solving 2; it comes with its own consequences that would need careful checking, but not separating indices from names will not make 1 or 2 go away.

4. Separating indices from names would also open the possibility of making indices available only in Arrays, so Array would become a proper sub-"class" rather than a mixin, solving 1.
Claus
Le 14/03/2011 17:02, John Tamplin a écrit :
On Mon, Mar 14, 2011 at 10:21 AM, Brendan Eich <brendan at mozilla.com <mailto:brendan at mozilla.com>> wrote:
Web developers find and exploit many de-facto standards. Enumeration order being insertion order for non-arrays at least, if not for all objects (arrays tend to be populated in index order), is old as the hills and web content depends on it, as far as I can tell. I'll do some experiments to try to get data on this.
Aside from the JSON example of populating a dropdown list given (which I will agree is a real if contrived use case), there has been a lot of talk of "thousands of web developers" depending on preserving insertion order, but not one concrete example -- do you have one?
I gave one: esdiscuss/2011-March/013036
It is theoretical (sorry, not concrete like "there is code using it right now"), but the point is that for objects (actual objects, not arrays) used as "dictionaries", numbers can be used just like alphabetic keys. When users write objects as object literals in code, they might appreciate it if the JS engine kept the order they wrote the keys in (though there is no proof of this, since it has never been the case).
The order the user provides the keys in is the only bit of information (that I can think of) that the JS engine loses. But once again, if users would have appreciated this feature, they have been forced (by spec and implementations) to find other ways. So, unless we can reach all web devs to ask "have you ever been disappointed by the implementation's for-in loop order?", you cannot really have facts on whether they would use the feature. Apparently, for the case of non-numeric properties, they seem satisfied with implementations, which iterate them in insertion order.
For more concrete numerical results, let's wait for Brendan to do his experiments (or help him out if there is any way to do so?)
I think that for-in enumeration of in-memory objects (created as such) and of JSON.parse-created objects are two different concerns. Of course, they should certainly be solved consistently, but they are different in my opinion.
I am a bit worried about JSON. JSON as an interchange format is defined as unordered. "An object is an unordered collection of zero or more name/value pairs". If people expect JSON as an interchange format to be ordered, they are assuming something about the JSON parse method which isn't itself standardized either.
One conceptual issue I see with ordering JSON.parse-created objects in the ES spec is that the ECMA spec would explicitly say "we consider that objects are ordered, and we decide an order for JSON.parse-created objects" (deciding an order where there is none in the JSON format). Once again, it's conceptual. I would be in favor of adding an order, but it has to be said. All of that said, it's a bit weird to say that an interchange format (something with a beginning, an end and a direction) has some "unordered" part. The JSON grammar itself seems to contain an implicit order:

    JSONMemberList :
        JSONMember
        JSONMemberList , JSONMember
So I don't know. Does it really make sense to define JSON objects as an unordered collection in the interchange format (in-memory objects are a different story) or is it just an ECMAScript legacy?
I don't know about insertion order being important, but certainly it's natural to want to express order with object literals (or equivalently, JSON). If we can't rely on enumeration order matching key order in the literal, then the programmer has to express that ordering twice: once implicitly in the literal and once explicitly through whatever data structure is chosen to represent that order. There is some seemingly unnecessary repetition going on there, from the programmer's point of view.
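A sketch of that repetition (addOption is a hypothetical helper, not from this thread):

    var bugStatus = { 2: "Fixed", 1: "Verified", 5: "Duplicate" };
    // Without a guaranteed enumeration order, the literal's ordering
    // must be restated in a second, parallel structure:
    var bugStatusOrder = [2, 1, 5];
    for (var i = 0; i < bugStatusOrder.length; i++) {
        addOption(bugStatusOrder[i], bugStatus[bugStatusOrder[i]]);
    }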
On Mar 14, 2011, at 11:02 AM, John Tamplin wrote:
On Mon, Mar 14, 2011 at 10:21 AM, Brendan Eich <brendan at mozilla.com> wrote: Web developers find and exploit many de-facto standards. Enumeration order being insertion order for non-arrays at least, if not for all objects (arrays tend to be populated in index order), is old as the hills and web content depends on it, as far as I can tell. I'll do some experiments to try to get data on this.
Aside from the JSON example of populating a dropdown list given (which I will agree is a real if contrived use case), there has been a lot of talk of "thousands of web developers" depending on preserving insertion order, but not one concrete example -- do you have one?
We haven't tried changing objects other than "dense arrays" away from insertion order, but other browser implementors have. Opera did, back when Lars T. Hansen was working there, and broke a Wolfenstein port, I believe. Of course they restored compatibility :-).
People in the know at other vendors, please weigh in.
We did try to stop suppressing properties deleted after the for-in loop began, per discussions last May in TC39. That bounced off the web:
bugzilla.mozilla.org/show_bug.cgi?id=569735, bugzilla.mozilla.org/show_bug.cgi?id=595963 (see duplicates)
On 3/12/2011 2:08 AM, Claus Reinke wrote:
I notice that you don't count initialization for the native Object variant. Apart from fixing that,
This is not a flaw. The initialization phase for the orderedMap creates an index. This is not needed for Object because the Object is the index.
would the following variant help (run, but not tested;-)? It keeps the previous/next key in the key's internal value, to reduce overhead.
The primary difference is that your code creates one Array per property, while my code uses 3 Objects to store unlimited properties via lots of slots.
This makes your code about 4x slower on IE6, where Object/Array allocation has painfully high penalties.
On other browsers, the results are less clear: straight timings are not directly comparable since at the end of the test, my code has no garbage, whereas yours has orphaned 15,000 arrays that the garbage collector needs to clean up.
To sum up, by using different implementations in different browsers, it may be possible to implement an order-preserving map in JavaScript and be within 5-6x of native speed.
Which in turn means, a common use case is penalized badly if Object loses order for numeric properties, whereas only relatively uncommon use cases (eg crypto in JavaScript) are penalized by preserving order for numeric properties.
However, critically, these uncommon use cases can regain full speed by using Arrays for objects with lots of numeric properties, but there is no known way to get back to full speed for the common use case of needing an ordered map.
Finally, this is just one part of the overall argument. This effect combines with the backcompat issue, loss of information going from Object literals to live Objects, less compact code and higher allocation/GC costs, etc.
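For reference, a minimal sketch of the kind of pure-JavaScript order-preserving map being discussed (an illustration only, not the benchmark code from this thread; it uses two plain Objects, omitting the prev links that the "3 Objects" version would need to support deletion):

    function OrderedMap() {
        this.values = {};   // key -> value
        this.nextKey = {};  // key -> the key inserted after it
        this.firstKey = null;
        this.lastKey = null;
    }
    OrderedMap.prototype.put = function (key, value) {
        if (!this.values.hasOwnProperty(key)) {
            if (this.lastKey === null) this.firstKey = key;
            else this.nextKey[this.lastKey] = key;
            this.lastKey = key;
        }
        this.values[key] = value;
    };
    OrderedMap.prototype.forEach = function (fn) {
        // Walk the links rather than using for-in, so numeric keys keep
        // insertion order even in engines that reorder them:
        for (var k = this.firstKey; k != null; k = this.nextKey[k]) {
            fn(k, this.values[k]);
        }
    };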
On 3/13/2011 2:07 PM, Brendan Eich wrote:
On Mar 13, 2011, at 3:58 PM, Charles Kendrick wrote:
Brendan, Bradley and others - there's no need to search for a relatively uncommon use case for ordered maps indexed by keys - a far more common use case is enumerated data stored numerically, eg, marital status:
{ 2: "Married", 1: "Single", 5: "Divorced" }
It's still a synthetic example, and real web sites/apps would be more compelling and helpful to forge consensus.
There's two separable and mutually reinforcing lines of argument here and the appropriate evidence is different for each:
- backcompat issue
Here the primary evidence of developer pain is the Google Code issue with the record-setting 127 stars and 116 comments
http://code.google.com/p/v8/issues/detail?id=164
And these two duplicates:
75 stars, 22 comments: code.google.com/p/chromium/issues/detail?id=37404
32 stars, 17 comments: code.google.com/p/chromium/issues/detail?id=20144
I have not read everything in detail but there is at least one instance of someone pointing out that part of Facebook was broken by this in Dec 2010:
http://code.google.com/p/v8/issues/detail?id=164#c81
If you slog through the rest of the comments on this and the other issues closed as duplicates, you find a bunch of people complaining about broken apps/sites, saying that they are advising people not to use Chrome, etc., but most are posting with generic email addresses and do not name the specific sites.
People are asking me on and off-list whether I can provide specific sites, but, I have no special access to know what these sites are. I have around 10-12 definite reports from our own customers getting bitten by assuming order was preserved and thinking it was a bug in the framework, where we've had to give them the bad news. These are all banks / insurers / defense apps, behind the firewall.
My intuition is that most of the posters in the Google Chrome issue are likewise writing behind-the-firewall business applications. This in no way diminishes the importance of the applications that all these developers are posting about: if they are indeed mostly behind-the-firewall applications, then they deal with money and business processes and other high-value stuff.
- usability / functional issue
By this I mean:
- the performance value of having Object be a native-speed order-preserving map implementation: 4-6x faster than the fastest JavaScript implementation
- the loss of information going from an Object literal to a live Object if order is not preserved
- more verbose code / data definitions in various use cases
- the "gotcha" factor of Object appearing to preserve order but then dropping order for numeric keys, a reasonable behavior for Array but a surprise on Object
.. etc.
Here I don't think it makes sense to ask for an Alexa site or similar - we're talking about how common the use case is, not how popular the site is, and in this case I think the use case comes up most often in sites that would never appear in the Alexa 500, because they are business applications.
I have presented two main arguments about how common the use case is:
- if you look through the collection classes available in the class libraries of various languages, you will very often find an order-preserving map, but you won't find an implementation of a map that preserves order for string keys but not numeric ones. To me, this strongly implies that the order-preserving map is a commonly desired behavior, whereas a map that specifically drops order for numeric keys does not correspond to any common use case
- order-preserving numerically-keyed maps are very common in any application that involves relational storage. Since this crowd didn't receive this as an obvious statement in the way I expected, let me provide some further information:
If you consider enumerated fields (Order Status, Bug Status, Sex, Package Type, Employment Status, etc - they are ever-present) across all object-relational mapping systems in any programming language (Rails/Grails, Hibernate, Django, etc), storing such fields as numeric values is either the default or an available option.
Likewise, if you consider relations between objects (Account->Manager), typically an integer id is assigned to the record and stored in the related record.
For either type of information, when it is being delivered to the UI to be displayed, one very natural representation is a map from unique integer id to display value, eg Bug Status:
{ 2: "Fixed", 1: "Verified", 5: "Duplicate", ... }
.. or search results (eg a combobox):
{ 43443: "Mark Mitchell", 43113: "Mark Pierce", ... }
This latter use case is basically what was broken on Facebook.
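For concreteness, a sketch of how that use case breaks (the names are made up; the engine behaviors are as described in this thread):

    var users = { 43443: "Mark Mitchell", 43113: "Mark Pierce" };
    var names = [];
    for (var id in users) names.push(users[id]);
    // insertion-order engines:                    ["Mark Mitchell", "Mark Pierce"]
    // engines that enumerate numeric keys in
    // ascending order (e.g. V8 at the time):      ["Mark Pierce", "Mark Mitchell"]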
Hopefully this is enough to make it clear that the use case is very common. Someone could, of course, go off scouring various web sites to try to find status fields stored as integers or related records keyed by numeric ids, but again, I would assert that this use case is most common in behind-the-firewall apps, so information about the Alexa 500 would be neither here nor there.
I hope that with the above context, it can be taken as obvious that "ordered data with numeric keys" is a common use case.
On Mar 14, 2011, at 9:42 AM, David Bruant wrote:
Le 14/03/2011 17:02, John Tamplin a écrit :
On Mon, Mar 14, 2011 at 10:21 AM, Brendan Eich <brendan at mozilla.com> wrote: Web developers find and exploit many de-facto standards. Enumeration order being insertion order for non-arrays at least, if not for all objects (arrays tend to be populated in index order), is old as the hills and web content depends on it, as far as I can tell. I'll do some experiments to try to get data on this.
Aside from the JSON example of populating a dropdown list given (which I will agree is a real if contrived use case), there has been a lot of talk of "thousands of web developers" depending on preserving insertion order, but not one concrete example -- do you have one? I gave one: esdiscuss/2011-March/013036 It is theoretical (sorry, not concrete like "there is code using it right now"), but the point is that for objects (actual objects, not arrays) used as "dictionaries", numbers can be used just like alphabetic keys. When users write objects as object literals in code, they might appreciate it if the JS engine kept the order they wrote the keys in (though there is no proof of this, since it has never been the case).
The order the user provides the keys in is the only bit of information (that I can think of) that the JS engine loses. But once again, if users would have appreciated this feature, they have been forced (by spec and implementations) to find other ways. So, unless we can reach all web devs to ask "have you ever been disappointed by the implementation's for-in loop order?", you cannot really have facts on whether they would use the feature. Apparently, for the case of non-numeric properties, they seem satisfied with implementations, which iterate them in insertion order.
For more concrete numerical results, let's wait for Brendan to do his experiments (or help him out if there is any way to do so?)
I think that for-in enumeration of in-memory objects (created as such) and of JSON.parse-created objects are two different concerns. Of course, they should certainly be solved consistently, but they are different in my opinion.
I am a bit worried about JSON. JSON as an interchange format is defined as unordered. "An object is an unordered collection of zero or more name/value pairs". If people expect JSON as an interchange format to be ordered, they are assuming something about the JSON parse method which isn't itself standardized either.
If you inspect any JSON document that conforms to RFC 4627 (for example, {"y": 1, "z": 2, "x": 3}), you will see that each object clearly has some specific ordering of its name/value pairs. It must; it is inherent in such a textual representation. "y" is first, then "z", and finally "x". So the language in 4627 that says "An object is an unordered collection of zero or more name/value pairs" clearly cannot be saying that the name/value pairs do not occur in some order. It must mean something else. Crock can probably tell us exactly what he intended those words to mean. Until he chimes in, I'll make an educated guess: it means that the order of occurrence of name/value pairs should not be assigned any semantic meaning. That is, {"y": 1, "z": 2, "x": 3} and {"x": 3, "z": 2, "y": 1} and the 4 other possible permutations of these name/value pairs must all be considered semantically equivalent.
If you are defining a JSON-based schema for some application, and that schema assigns semantic significance to name/value pair orderings, you are doing something outside the bounds of RFC 4627. You may well get away with doing so, but there is no guarantee that when somebody tries to read your JSON document using the FORTRAN JSON library, the ordering information you thought was implicit will actually be visible to them.
BTW, there is a very straightforward way to encode an ordered set of name/value pairs using 4627 JSON: ["y",1,"z",2,"x",3]. It's the exact same length as the "object" encoding. If you want to easily decode it into an object (assuming insertion ordering), you might encode it as ["ordered-n/v","y",1,"z",2,"x",3] and then process the document using JSON.parse with a reviver function like:
    function reviver(k, v) {
        if (!Array.isArray(v)) return v;
        if (v[0] !== "ordered-n/v") return v;
        var o = {};
        var len = v.length;
        for (var i = 1; i < len;) o[v[i++]] = v[i++];
        return o;
    }
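For illustration, a round trip with this reviver (assuming insertion-ordered enumeration for the resulting object):

    var doc = '["ordered-n/v","y",1,"z",2,"x",3]';
    var o = JSON.parse(doc, reviver);
    // o is { y: 1, z: 2, x: 3 }, with properties created in the order y, z, x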
One conceptual issue I see with ordering JSON.parse-created objects in the ES spec is that the ECMA spec would explicitly say "we consider that objects are ordered, and we decide an order for JSON.parse-created objects" (deciding an order where there is none in the JSON format). Once again, it's conceptual. I would be in favor of adding an order, but it has to be said. All of that said, it's a bit weird to say that an interchange format (something with a beginning, an end and a direction) has some "unordered" part. The JSON grammar itself seems to contain an implicit order:

    JSONMemberList :
        JSONMember
        JSONMemberList , JSONMember
So I don't know. Does it really make sense to define JSON objects as an unordered collection in the interchange format (in-memory objects are a different story) or is it just an ECMAScript legacy?
In talking about JSON you need to be careful not to conflate the JSON interchange format defined by RFC 4627 with the serialization process required to convert that format to/from the object model of your favorite programming language. These are two separate things. While 4627 says name/value pairs are unordered, that doesn't prevent a JSON serialization library from specifying such an ordering in its definition of the mapping between 4627 and language "objects" (or whatever language elements are used in the mapping). ES5 does this for JSON.stringify when it says that the own properties are processed in the same order as produced by Object.keys. It also specifies this for JSON.parse in step 3, when it says to process the JSON document as if it were an ECMAScript Program. This means that any property ordering that is implicit in processing ECMAScript object literals will also be applied to the generated objects corresponding to JSONObjects. (Where this falls short is where ES5, for reasons already discussed, doesn't require a specific enumeration order.)
The reason the ES5 JSON spec talks about property ordering, even though 4627 does not, is because we would really like JSON.stringify when run
---------- Forwarded message ---------- From: Bradley Meck <bradley.meck at gmail.com>
Date: Mon, Mar 14, 2011 at 11:41 PM Subject: Re: iteration order for Object To: charles.kendrick at isomorphic.com
Very nice examples. While I do think programming that requires this is a bit odd, it seems to be quite relied upon. I would like to see the couple of comments peppered in by the developer team in the Chrome bugs about performance issues. Otherwise, it seems we have what appears to be an expectation of this behavior, with examples of breaking changes (though many comments are missing such examples). The fact that it has become an expectation for so many leads me to agree that this should be standardized, as long as there are no lasting considerations. My concern about memory/performance effects on things outside of for...in loops remains, but I am not close enough to engine internals to be sure this is a non-issue. Overall, it seems this needs standardization. Keep it as backward compatible as possible, since it is a standard that has been reverse-engineered from implementations (and w3schools, it seems).
I believe it is very very important that the ECMAScript standard specify that when a new Object is created, for..in iteration traverses properties in the order they are added, regardless of whether the properties are numeric or not.
The primary reason is that I think it makes ECMAScript a much more expressive and usable language, and results in better performance in real-world applications.
Secondarily, the fact that the Chrome browser has deviated from this de-facto standard is creating a small crisis for site owners and web application developers, and it will get much much worse if any other browser vendors follow suit.
This seems to have been most recently discussed in 2009 with inconclusive results.
I have summarized the argument for this feature below - this argument has swayed others who were initially opposed.
I'm hoping we can get quick consensus that specifically Object iteration order should be preserved, without getting too bogged down in the details of specifying exactly what happens for Arrays, prototype chains, etc. Major stakeholder agreement on this one aspect should be enough to prevent any other vendors from shipping browsers that break sites, and get the Chrome bug for this re-instated.
== Expressiveness and Performance argument
A very common use case where order preservation is desirable is providing the set of options for a drop-down list in JavaScript. Essentially all Ajax widget kits have such an API, and usage is generally:
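(The snippet below is a sketch of the usual shape of such an API; selectControl and setOptions are hypothetical names, not from any particular widget kit:)

    selectControl.setOptions({
        1: "Single",
        2: "Married",
        5: "Divorced"
    });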
Here are some examples of what alternatives might look like - all of them are far, far worse:
#1 Parallel arrays:
- this is awkward and unnatural, and doesn't correspond to how a list of options is specified in HTML
- involves double the number of allocations / GC-tracked objects (6 strings and 2 Arrays vs one Object and 3 Strings - slots don't GC)
- replacing a key/value pair requires a linear (O(n)) search unless secondary indexing approaches are used, which requires yet more allocation both to build and maintain the index, as well as a level of sophistication not typical for a scripting language user

#2 Array of Objects:
- verbose and redundant code - reiterates "value" and "text" once per entry
- much worse Object allocation than #1 (which was already bad): one Object + 2 Strings per property
- same linear search / extra allocation / developer sophistication issues as #1

#3 Array of Arrays:
- verbose, finger-spraining punctuation density
- much worse Object allocation than #1 (which was already bad): one Array + 2 Strings per property
- same linear search / extra allocation / developer sophistication issues as #1

(Sketches of all three shapes follow below.)
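Sketches of the three alternatives, using the same hypothetical selectControl.setOptions API as above (the exact signatures are illustrative):

    // #1 Parallel arrays: values and display texts kept in sync by position
    selectControl.setOptions([1, 2, 5], ["Single", "Married", "Divorced"]);

    // #2 Array of Objects: "value" and "text" repeated for every entry
    selectControl.setOptions([
        { value: 1, text: "Single" },
        { value: 2, text: "Married" },
        { value: 5, text: "Divorced" }
    ]);

    // #3 Array of Arrays: dense punctuation, one Array allocated per entry
    selectControl.setOptions([
        [1, "Single"],
        [2, "Married"],
        [5, "Divorced"]
    ]);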
In a nutshell, dropping order preservation results in:
- less expressive code
- more bytes on the wire (both in code-as-such and JSON)
- degraded application performance via increased allocations and the overhead of implementing order-preserving behavior in JavaScript
== Historical behavior argument
All browsers that have ever had non-negligible market share have implemented order-preserving Objects - until Chrome 6.
Like many universally consistent, obviously beneficial behaviors, many developers relied on it assuming eventual standardization.
Thousands of sites and applications are broken by Chrome's decision to drop the order-preserving behavior. There is a bug against Chrome's V8 engine (currently marked "WorkingAsIntended"): http://code.google.com/p/v8/issues/detail?id=164
People can "star" issues to be notified of changes in status or discussion. This issue has by far more stars than the most-starred Feature Request (E4X support), more than double the stars of the runner-up, and more stars than roughly the top 20 confirmed bugs combined.
And this does not consider all the stars on other versions of this issue that were closed as duplicates.
Various arguments have gone back and forth on whether Chrome should fix this bug without waiting for standardization, but not a single person has indicated that they would prefer that Object does not preserve order.
In a nutshell, there is overwhelming support for adding this behavior to the standard, and still time to avoid all the wasted effort of changing all these sites and applications. Very few non-order-preserving browsers exist in the wild, and the behavior is limited to browsers that are updated very frequently or even automatically.
== Objections and counter-arguments
Yes, it is true that Arrays have never reliably iterated in insertion order. I am proposing only that Object preserves insertion order, not Array.
No developers or sites rely on Array for..in iteration order, since it was never consistent.
If Array for..in iteration continues to be unordered, any developer that cares about the tiny performance difference can use an Array to store non-numeric property/value pairs.
As for the objection that dropping order preservation enables optimization: it can't be a very large optimization, since Safari and Firefox continue to challenge Chrome's performance while maintaining in-order iteration. And if it's only a small optimization, then obviously it's completely dwarfed by the application-level penalties incurred re-implementing this key behavior in JavaScript.
This is an abysmal compromise, with the worst traits of each alternative. It requires browser vendors to implement order preservation, such that we don't get the minor optimization that's possible from not preserving order at all. At the same time, it requires that applications and frameworks deal with lack of order for numeric keys, which are very common: in the use case of mapping stored to displayed values, stored values are very often numeric.
It's also just bad design. It is surprising and counter-intuitive that numeric keys are treated differently from non-numeric ones. The reality is that an implementation detail of Array is bleeding through to Object.
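A short illustration of that "gotcha" (the behaviors shown are those reported for index-reordering engines in this thread):

    var o = {};
    o.b = 1;
    o[2] = 2;
    o.a = 3;
    var keys = [];
    for (var k in o) keys.push(k);
    // pure insertion-order engines:     ["b", "2", "a"]
    // engines that hoist numeric keys:  ["2", "b", "a"]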