Array.prototype.concat result length (ES5.1)

# Andrew Oakley (14 years ago)

I'm not sure the Array.prototype.concat algorithm (§ 15.4.4.4) is quite right.

As far as I can see the length property of the result (A) is not explicitly set anywhere in the algorithm so we are relying on the [[DefineOwnProperty]] method of A to set the length for us when we add indexed properties to it.

If we pass "sparse" arrays to the function then I think the results should look like this:

[,,,,].concat([]) => [] (length 0) [,x,,].concat([]) => [,x] (length 2)

This is not what browsers do and is a significant change from ECMAScript 3 (which set the length of A at the end of the algorithm)

Is it reasonable to add another step at the end of the algorithm to make it match browser behaviour better:

Call the [[DefineOwnProperty]] internal method of A with arguments "length", Property Descriptor {[[Value]]: n}, and false.

The other complication I've spotted (I think it crops up elsewhere too) is what to do if n becomes larger than the maximum array index. My reading of ECMAScript 3 says that we should throw a RangeError but nobody seems to do that.

Opera and Firefox seem to store n as a 32bit unsigned number - the length wraps and they start putting properties at the beginning of the array again. I'm not sure what Chrome is doing (I can't find my values in the returned array), IE says "out of memory" and I got bored waiting for Safari.

# Yusuke Suzuki (14 years ago)

I also reported it to es5-discuss mail.mozilla.org/pipermail/es5-discuss/2010-December/003851.html, and got " This is a bug in the spec."

I think [[Put]]("length", {[[Value]]: n}, true) step should be inserted to the end of this(like, pop, push method), is it correct?

# Allen Wirfs-Brock (14 years ago)

(don't miss the second part of my response)

On Jul 14, 2011, at 5:21 AM, Andrew Oakley wrote:

I'm not sure the Array.prototype.concat algorithm (§ 15.4.4.4) is quite right.

As far as I can see the length property of the result (A) is not explicitly set anywhere in the algorithm so we are relying on the [[DefineOwnProperty]] method of A to set the length for us when we add indexed properties to it.

If we pass "sparse" arrays to the function then I think the results should look like this:

[,,,,].concat([]) => [] (length 0) [,x,,].concat([]) => [,x] (length 2)

This is not what browsers do and is a significant change from ECMAScript 3 (which set the length of A at the end of the algorithm)

Is it reasonable to add another step at the end of the algorithm to make it match browser behaviour better:

Call the [[DefineOwnProperty]] internal method of A with arguments "length", Property Descriptor {[[Value]]: n}, and false.

Yes, this is a bug see mail.mozilla.org/pipermail/es5-discuss/2010-December/003852.html Now ecmascript#129

I guess it is time to fire up the ES5.1 Errata...

BTW, a [[Put]] for length at the end will work fine.

The other complication I've spotted (I think it crops up elsewhere too) is what to do if n becomes larger than the maximum array index. My reading of ECMAScript 3 says that we should throw a RangeError but nobody seems to do that.

Opera and Firefox seem to store n as a 32bit unsigned number - the length wraps and they start putting properties at the beginning of the array again. I'm not sure what Chrome is doing (I can't find my values in the returned array), IE says "out of memory" and I got bored waiting for Safari.

This appears to be a (possible) specification bug that first occurs in ES3 and which was not corrected in ES5.

I appears in concat and push. But no other algorithms. The problem occurs when when a length value is incremented and then used as an array index without first applying ToUint32 to the value. It only occurs concat and push because all the other algorithms are bounded by a length value that has had ToUint32 applied to it.

It is probably a bug, because array index based operations generally warp around to 0 at 2^32. I assume that the Firefox implementation was the initial one and it reflects that behavior. However, all array objects (ad generic array like objects) are permitted to have integer property names >= 2^32. These properties are not reflected in the value of the 'length' property (when set by generic array methods for array-like objects). So, there is a slim chance that the specified behavior was intentional in allowing concat and push to leak past the 2^32 element boundary.

We have to possible courses of action:

  1. consider the specification to be correct as written.
  2. consider this to be a spec. but that we correct in the errata and in future editions.

This is an edge case that probably never occurs in the real world so the choice probably has no real impact. #1 is most consistent with the rest of the ES spec. However, the 2^32 length wrap semantics of arrays is kind of bogus and it might be nice to try to eliminate it in the future. #2 would be a step away from that.

this now bug: ecmascript#131

# Brendan Eich (14 years ago)

On Jul 14, 2011, at 10:04 AM, Allen Wirfs-Brock wrote:

This appears to be a (possible) specification bug that first occurs in ES3 and which was not corrected in ES5.

I appears in concat and push. But no other algorithms. The problem occurs when when a length value is incremented and then used as an array index without first applying ToUint32 to the value. It only occurs concat and push because all the other algorithms are bounded by a length value that has had ToUint32 applied to it.

FYI, SpiderMonkey (Firefox) results:

js> a=[] [] js> a.length=0xffffffff

4294967295 js> b=a.concat([1]) [] js> b.length

0

js> a=[] [] js> a.length=0xffffffff

4294967295 js> a.push(1)

typein:3: RangeError: invalid array length

It is probably a bug, because array index based operations generally warp around to 0 at 2^32. I assume that the Firefox implementation was the initial one and it reflects that behavior.

SpiderMonkey in Firefox 3 era does the same as above.

However, all array objects (and generic array like objects) are permitted to have integer property names >= 2^32. These properties are not reflected in the value of the 'length' property (when set by generic array methods for array-like objects). So, there is a slim chance that the specified behavior was intentional in allowing concat and push to leak past the 2^32 element boundary.

I doubt anything was intentional here, but I wasn't in on this part of ES3 in detail.

Jeff Walden had to do a bunch of work in SpiderMonkey to avoid wrongly wrapping uint32 index computations that must overflow to double or (safely constrained) uint64 before being stored as property names.

I don't think length should ever wrap around for a property name that marches past 0xffffffff. That's just asking for trouble. It's not as bad as the equivalent bug in unsafe languages (size overflow resulting in buffer under-allocation and indexing overruns), but it's in that direction, and a separate bug could combine to make a nasty exploit.

We have to possible courses of action:

  1. consider the specification to be correct as written.
  2. consider this to be a spec. but that we correct in the errata and in future editions.

This is an edge case that probably never occurs in the real world so the choice probably has no real impact. #1 is most consistent with the rest of the ES spec. However, the 2^32 length wrap semantics of arrays is kind of bogus and it might be nice to try to eliminate it in the future. #2 would be a step away from that.

this now bug: ecmascript#131

Would appreciate Jeff's thoughts. I've learned my lesson many times on wraparound being wrong, so I'm in favor of RangeError. We do that for push but not concat in SpiderMonkey, so AFAIK we could just fix concat, as you note no one would care except our testsuites, and we'd be safer and more consistent at the margin.

# Allen Wirfs-Brock (14 years ago)

On Jul 14, 2011, at 10:39 AM, Brendan Eich wrote:

On Jul 14, 2011, at 10:04 AM, Allen Wirfs-Brock wrote:

This appears to be a (possible) specification bug that first occurs in ES3 and which was not corrected in ES5.

I appears in concat and push. But no other algorithms. The problem occurs when when a length value is incremented and then used as an array index without first applying ToUint32 to the value. It only occurs concat and push because all the other algorithms are bounded by a length value that has had ToUint32 applied to it.

FYI, SpiderMonkey (Firefox) results:

js> a=[] [] js> a.length=0xffffffff 4294967295 js> b=a.concat([1]) [] js> b.length 0

js> a=[] [] js> a.length=0xffffffff 4294967295 js> a.push(1) typein:3: RangeError: invalid array length

The range error comes from the special [[DefineOwnProperty]] internal method of array objects. According to the spec. the above push test should create a property with key "4294967295" (not an array indexed property) and then throw (when the length is set). So, a should sill have the added property if examined after the exception. If a is not an array instance it should get the property and length should get set to 4294967296.

The reason, that the concat test doesn't throw is probably because of the missing [[Put]] of length that was the original topic of this thread. ...

We have to possible courses of action:

  1. consider the specification to be correct as written.
  2. consider this to be a spec. but that we correct in the errata and in future editions.

This is an edge case that probably never occurs in the real world so the choice probably has no real impact. #1 is most consistent with the rest of the ES spec. However, the 2^32 length wrap semantics of arrays is kind of bogus and it might be nice to try to eliminate it in the future. #2 would be a step away from that.

this now bug: ecmascript#131

Would appreciate Jeff's thoughts. I've learned my lesson many times on wraparound being wrong, so I'm in favor of RangeError. We do that for push but not concat in SpiderMonkey, so AFAIK we could just fix concat, as you note no one would care except our testsuites, and we'd be safer and more consistent at the margin.

But as noted above, the range error is only specified for actual array instances and not for generic array-like objects.

What about the possibility of simply eliminating the range error and the Uint32 restriction

# Brendan Eich (14 years ago)

On Jul 14, 2011, at 12:35 PM, Allen Wirfs-Brock wrote:

What about the possibility of simply eliminating the range error and the Uint32 restriction on the association between array indexed properties and the "length" property. Instead replace it with a ToInteger constraint. This is essentially how string operations are defined. Implementations could still optimize for lengths <2^32 and or any other size they deemed appropriate.

I would prefer that. Jeff may agree.

The uint32 business from ES1 days never paid off beyond allowing certain storage optimizations, and even then many (most?) engines do not optimize uint32 values. It was a flop on the optimization front, but it sure requires extra code in Array's implementation.

ToInteger matches string and makes full use of the integral domain in the number type.

The edge-case nature of this change suggests we could get away with it, even though it's an incompatible change. But we need to be careful that we're not turning errors into working code that could be exploited somehow. Again there is no memory safety issue, rather "index safety".

# Mark S. Miller (14 years ago)

Would we just be postponing all those same edge cases from 232 to 253?

# Brendan Eich (14 years ago)

On Jul 14, 2011, at 2:14 PM, Mark S. Miller wrote:

Would we just be postponing all those same edge cases from 232 to 253?

No, because you lose precision adding 1 to an index to get length after that. No bignums.

# Jeff Walden (14 years ago)

On 07/14/2011 10:04 AM, Allen Wirfs-Brock wrote:

It is probably a bug, because array index based operations generally warp around to 0 at 2^32.

Freudian slip? :-D

Easiest fix is to just add the length-set to concat. For a quick ES5 erratum that seems best to me.

Removing all the RangeError stuff, and making array indexes just non-negative integers, would be nice for ES6 or similar. I suspect changing that won't break anyone worth caring about, although I do know some people have taken the time to care about this in the past (mostly in a spec-nut way :-) ):

hexmen.com/blog/2006/12/push-and-pop

Without having thought too hard about exactly what would be involved, I suspect the amount of stuff you'd need to adjust, and the complexity of checking for sane behavior in all cases (including some of the 2**52 upper-bounding edge cases, depending on what new semantics you might want for "array indexes" or whatever), would make it unwise to try for ES5 errata. But I could well be wrong about this, so I wouldn't necessarily write off that possibility.

# Allen Wirfs-Brock (14 years ago)

On Jul 18, 2011, at 10:51 AM, Jeff Walden wrote:

Easiest fix is to just add the length-set to concat. For a quick ES5 erratum that seems best to me.

Yes, this needs to be in the errata.

Removing all the RangeError stuff, and making array indexes just non-negative integers, would be nice for ES6 or similar. I suspect changing that won't break anyone worth caring about, although I do know some people have taken the time to care about this in the past (mostly in a spec-nut way :-) ):

hexmen.com/blog/2006/12/push-and-pop

Without having thought too hard about exactly what would be involved, I suspect the amount of stuff you'd need to adjust, and the complexity of checking for sane behavior in all cases (including some of the 2**52 upper-bounding edge cases, depending on what new semantics you might want for "array indexes" or whatever), would make it unwise to try for ES5 errata. But I could well be wrong about this, so I wouldn't necessarily write off that possibility.

We won't do this as an errata. It is a change that would have to be treated as a change in a future edition rather than an error correction of the current edition.

# liorean (14 years ago)

On 07/14/2011 10:04 AM, Allen Wirfs-Brock wrote:

It is probably a bug, because array index based operations generally warp around to 0 at 2^32.

On 18 July 2011 19:51, Jeff Walden <jwalden+es at mit.edu> wrote:

Removing all the RangeError stuff, and making array indexes just non-negative integers,  would be nice for ES6 or similar.  I suspect changing that won't break anyone worth caring about, although I do know some people have taken the time to care about this in the past (mostly in a spec-nut way :-) ):

hexmen.com/blog/2006/12/push-and-pop

Hmm. That link has the following to say:

Steps 3 – 6 of push are a little ambiguous, and the specification probably should have stated that repeated increments to n must be done using 32-bit unsigned integer arithmetic – it’s kind-of implicit as n is assigned the result of the internal ToUint32 operator. Using 32-bit arithmetic leads to some strange edge-cases: when n overflows from 232-1 to 0, push will have set a property called 4294967295. This is strange, as 429496795 is not an array index (as discussed above), but at least it means the property-value will still be available after the inevitable array-truncation (when length is set to some small value in step 8.)

That is actually contrary to my reading of ECMA-262 3ed. intentions. I assume 5 ed. has the same intentions, but I have not checked it. The way I read those intentions is as you can see in the error I reported in my comment to: blogs.msdn.com/b/jscript/archive/2008/03/25/performance-optimization-of-arrays-part-i.aspx (Opera fixed a different but related bug in the same algorithms in Futhark to follow the correct handling (as I read the spec). Carakan today I don't know about.)

For those not wanting to go searching for my comment and sifting through that text: The algorithm in question uses ToUInt32 to convert the value, but the storage is not as uint32 but as pretty much everywhere in ECMAScript, Number, thus follows normal double arithmetics. This is particularly of note as it means that it should not wrap around - the purpose of the following parts of 15.4.5.1 [[Put]] (P, V):

  1. Compute ToUint32(V).

  2. If Result(12) is not equal to ToNumber(V), throw a RangeError exception.

  3. For every integer k that is less than the value of the length property of A but not less than Result(12), if A itself has a property (not an inherited property) named ToString(k), then delete that property.

  4. Set the value of property P of A to Result(12).


is to, at step 13, catch the specific event of trying to exceed uint32 size with the length property and throw an error instead of wrapping around and as a result of step 14 of above algorithm destroy properties on the array. Wrapping around is an error, because ToUInt32 gives not a uint32 but a Number which can fit into a unit32 as result. IIRC the 3ed. spec never uses any other number format than Number, it only performs the operations to fit an input Number type into the limitations of those other number types, into an output of Number type.

# Allen Wirfs-Brock (14 years ago)

Yes, liorean's analysis below seems correct. If the final length is a value >= 2^32 then attempting to set it on an array object will throw a RangeError. However, this special behavior is only for array objects. If push is used generically with non-array objects then "n" can reach a greater value and the final length will be set appropriately. Where a wrap could occur would be if the push or another similar array method was subsequently applied to such a non object with a huge length property value. These methods generally apply ToUnit32 to the length value when they initially retrieve it. So operations upon a non-array object with a huge length will start at some warped position.

This is the same in both ES3 and ES5. I'd argue that the initial ToUint32 in these algorithms is really a bug that wasn't caught long ago. For real arrays its is unnecessary as the special internal method treatment guarantee's it is already a uint32 and is unnecessary. For non-arrays there are the array methods don't generally clamp either indices or length values to 32-bits on stores. So the initial to uint32 at the beginning of these algorithms is unnecessary for real arrays and is potentially corrupting for non arrays. It probably really should be doing a ToInteger (or perhaps the non-existent ToUInteger) to guarantee it is working with an integer length.

I filed a change request for the ES.next draft to eliminate the uint32 length restriction on arrays. It is ecmascript#145, ecmascript#146 is a bug against the initial length conversion for non-array object in the array methods.