When expecting a positive integer...

# Domenic Denicola (12 years ago)

I notice that operations accepting positive integers do various somewhat-inconsistent things as of the latest ES6 spec:

Various typed array things do ToPositiveInteger, a new operation as of ES6:

  • ArrayBuffer(length) and TypedArray(length)
  • Constructing a typed array, and TypedArray.prototype.set, coerce both the passed object's length property and the numeric byte-offset argument
  • Most interestingly, TypedArray.prototype.@@elementGet(index) and @@elementSet(index, value) do a ToPositiveInteger on index. So typedArray[-123] === typedArray[0], and typedArray[-321] = 5 gives typedArray[0] === 5.
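
For illustration, here is a rough, non-normative sketch of what that conversion amounts to, as I read the draft (the steps are paraphrased, not quoted from the spec):

  // Rough sketch of the draft's ToPositiveInteger conversion (non-normative):
  function toPositiveInteger(value) {
    var number = Number(value);        // ToNumber
    if (number !== number) return 0;   // NaN becomes +0
    if (number <= 0) return 0;         // negatives (and -0) clamp to +0
    return Math.floor(number);         // drop any fractional part
  }

  toPositiveInteger(-123);  // 0 -- hence typedArray[-123] aliasing typedArray[0]
  toPositiveInteger(2.9);   // 2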

Whereas arrays use ToUint32, but sometimes throw RangeErrors:

  • Indices generally "work" if ToString(ToUint32(index)) === index; otherwise they're treated as ordinary properties.
  • When getting a length from any array-like, you just do ToUint32 on it.
  • But when setting a length, you validate so that if ToUint32(length) !== length, a RangeError is thrown.
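
To illustrate with an ordinary Array:

  var a = [1, 2, 3];
  a[-1] = "x";    // ToString(ToUint32(-1)) !== "-1", so this is an ordinary property
  a.length;       // still 3
  a.length = -1;  // throws RangeError, because ToUint32(-1) !== -1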

Finally, one lone place throws RangeErrors after conversion:

  • String.prototype.repeat(count) throws if ToInteger(count) is negative or infinite.
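
For example, per the draft as described above:

  "ab".repeat(3);         // "ababab"
  "ab".repeat(-1);        // RangeError: ToInteger(-1) is negative
  "ab".repeat(Infinity);  // RangeError: the count is infinite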

I was curious if there were any interesting rationales driving these decisions. (And I fully appreciate there might not be; it could just be compatibility constraints.) Or, to put it another way, I was curious what would be considered "idiomatic" JS, e.g. in the same way that the spec sets a precedent of ToString-ing all string inputs to its functions.

# Dmitry Lomov (12 years ago)

Regarding TypedArrays:

  • ToUint32 is definitely not sufficient for typed array indices because typed arrays are allowed to be longer than 2^32.
  • However, the specified "clamping" of negative indices feels counter-intuitive to me: under the current spec, "arr[-32] = 42" should assign 42 to arr[0]. This is not what current implementations of typed arrays do, and these semantics seem confusing and dangerous.

Also, the systematic usage of ToPositiveInteger throughout the typed array/array buffer spec causes some counter-intuitive (or, at least, legacy-incompatible) behavior. For example, under the current spec "new Float32Array(-10)" silently allocates an array of length 0. Current typed array implementations throw in that case, and I think a case could be made that throwing here is better than allowing sloppy code.
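
Concretely, the divergence looks like this (illustrative; exact error types vary by engine):

  new Float32Array(-10);
  // Draft ES6: ToPositiveInteger(-10) is 0, so a zero-length array is silently created.
  // Current implementations: throw instead.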

If people agree this is generally a thing to be avoided, I am happy to collect a systematic list of these issues and suggest fixes - but maybe I am missing something and these choices have some deeper motivation?

Kind regards, Dmitry

# Brendan Eich (12 years ago)

Dmitry Lomov wrote:

If people agree this is generally a thing to be avoided, I am happy to collect a systematic list of these issues and suggest fixes - but maybe I am missing something and these choices have some deeper motivation?

No, please collect and file at bugs.ecmascript.org -- these are indeed errors in the draft. We need to throw on negative length. We must not spec clamping negative indexes to 0 at runtime. Other deviations from Khronos and from implementations need to be considered carefully in light of performance and safety (which are not always at odds).

Thanks to you and Domenic for flagging.

# Brendan Eich (12 years ago)

Brendan Eich wrote:

We need to throw on negative length

Sorry, that was unclear: I was agreeing that implementations, as you observed, throw from their constructors on negative length and this needs to be spec'ed.

# Kevin Gadd (12 years ago)

I previously had a discussion with someone about Typed Array sizes in particular - at present it seems like no existing implementation of Typed Arrays will allow you to allocate one larger than 2GB, regardless of the actual numeric types being used. But when I did a quick scan of the Safari, Chrome and SpiderMonkey implementations, I found some uses of ToInt32 and equivalent operations instead of ToUint32 - which would imply being limited to a maximum index that fits into the positive range of a signed 32-bit integer.

Being able to allocate a 4GB typed array on a 64-bit machine, if not one even bigger than that, would certainly be welcome.
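
An illustrative probe of that limit (exact ceilings and error behavior vary by engine and platform):

  // Ask for a 4 GB typed array; at the time of writing, engines refuse this.
  try {
    var big = new Uint8Array(Math.pow(2, 32));
    console.log("allocated " + big.length + " elements");
  } catch (e) {
    console.log("allocation refused: " + e);
  }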

# Mark S. Miller (12 years ago)

Consider instead the [Nat operation](code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/ses/startSES.js#412):

  var MAX_NAT = Math.pow(2, 53);
  function Nat(allegedNum) {
    if (typeof allegedNum !== 'number') {
      throw new RangeError('not a number');
    }
    if (allegedNum !== allegedNum) { throw new RangeError('NaN not natural'); }
    if (allegedNum < 0)            { throw new RangeError('negative'); }
    if (allegedNum % 1 !== 0)      { throw new RangeError('not integral'); }
    if (allegedNum > MAX_NAT)      { throw new RangeError('too big'); }
    return allegedNum;
  }

This returns allegedNum only if it is a non-negative integer within the range of consecutively representable non-negative integers. For what uses of ToPositiveInteger would Nat not be better? Btw, given its specified behavior, ToPositiveInteger is badly misnamed: +0 is not a positive integer, it is a non-negative integer; +Infinity is not an integer; -Infinity is neither positive nor an integer.
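
A quick illustration of how Nat behaves at its edges (these follow directly from the definition above):

  Nat(5);                // 5
  Nat(0);                // 0 -- zero is accepted
  Nat('5');              // RangeError: not a number
  Nat(-1);               // RangeError: negative
  Nat(1.5);              // RangeError: not integral
  Nat(Math.pow(2, 54));  // RangeError: too big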

# Dmitry Lomov (12 years ago)

Current spec places no limits (other than being a number) on typed array/array buffer length.

# Allen Wirfs-Brock (12 years ago)

On Apr 8, 2013, at 11:05 PM, Domenic Denicola wrote:

I notice that operations accepting positive integers do various somewhat-inconsistent things as of the latest ES6 spec:

Various typed array things do ToPositiveInteger, a new operation as of ES6:

  • ArrayBuffer(length) and TypedArray(length)
  • Constructing a typed array, and TypedArray.prototype.set, coerce both the passed object's length property and the numeric byte-offset argument
  • Most interestingly, TypedArray.prototype.@@elementGet(index) and @@elementSet(index, value) do a ToPositiveInteger on index. So typedArray[-123] === typedArray[0], and typedArray[-321] = 5 gives typedArray[0] === 5.

Whereas arrays use ToUint32, but sometimes throw RangeErrors:

  • Indices generally "work" if ToString(ToUint32(index)) === index; otherwise they're treated as ordinary properties.
  • When getting a length from any array-like, you just do ToUint32 on it.
  • But when setting a length, you validate so that if ToUint32(length) !== length, a RangeError is thrown.

Finally, one lone place throws RangeErrors after conversion:

  • String.prototype.repeat(count) throws if ToInteger(count) is negative or infinite.

I was curious if there were any interesting rationales driving these decisions. (And I fully appreciate there might not be; it could just be compatibility constraints.) Or, to put it another way, I was curious what would be considered "idiomatic" JS, e.g. in the same way that the spec sets a precedent of ToString-ing all string inputs to its functions.

Remember this is just a working draft of the specification, and ToPositiveInteger is just a first cut at how to deal with array-like entities that are allowed to be larger than 2^32-2 in length. I try to remember to include a margin note tagging things like this as open issues, but in this case I guess I didn't. Things like this are the starting points for design discussions like this one, so thanks for starting it.

Since we are entering somewhat uncharted territory WRT arrays, it isn't clear what precedents best apply and what would be most idiomatic. I decided to clamp negative indices to 0 based upon the precedent that Array does a Uint32 transformation of all numeric indices (i.e., it maps all values into a bounded, non-negative range) and never throws on "out of bounds" accesses. ToPositiveInteger seemed like a reasonable first cut at something analogous when dealing with unbounded lengths.

Throwing is another possibility, but one for which there is very little precedent in situations like this. WebIDL says (via its parameter conversion rules) that this is what should happen for negative Typed Array indices, but browsers don't seem to conform to that.

A better precedent is probably the handling of string indexing, which is specified to always return undefined for negative string indices such as "abcd"[-2]. This also appears to be what at least some browsers currently do for Typed Arrays.
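
For instance:

  "abcd"[-2];            // undefined -- "-2" is not a string index
  "abcd"[2];             // "c"
  new Int8Array(4)[-2];  // undefined in at least some current browsers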

# Kevin Gadd (12 years ago)

Fancy behavior for out of range indices on Typed Arrays seems like it could be more trouble than it's worth. Ideally, you want something that can be cheaply implemented on native targets like x86, if not implemented for free because it's something the runtime has to do anyway. Returning undefined seems like it would definitely imply a performance penalty, or at least make work a lot harder for type inference/analysis engines, because now you have to prove that all indices are in range.

Clamping is reasonable, but not necessarily what people might expect. Throwing at least does not suffer from the type information problem that undefined does, though I'm sure it still poses issues for JITs - same 'prove all indices are in range' problem to eliminate the bounds check. I could see wrapping being an acceptable choice as well, since that's sort of 'native' semantics.
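
To make the options concrete, here is what each candidate semantics would give for one out-of-range read (illustrative only):

  var ta = new Int8Array(4);  // valid indices are 0..3
  // ta[-1] under each proposal:
  //   clamp:     reads ta[0]
  //   wrap:      reads ta[3] (modular indexing)
  //   throw:     RangeError
  //   undefined: the read yields undefined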

Firefox appears to return undefined for out-of-range typed array elements; can someone comment on whether SpiderMonkey is easily able to achieve this and whether it hurts the JIT and/or gathering of type information? Would some other behavior be faster? Chrome appears to do this too currently.

People really want typed arrays to be fast, so consider that the context for my comments here. :)

# Brendan Eich (12 years ago)

Kevin Gadd wrote:

Fancy behavior for out of range indices on Typed Arrays seems like it could be more trouble than it's worth. Ideally, you want something that can be cheaply implemented on native targets like x86, if not implemented for free because it's something the runtime has to do anyway. Returning undefined seems like it would definitely imply a performance penalty, or at least make work a lot harder for type inference/analysis engines, because now you have to prove that all indices are in range.

We already handle this in TI in SpiderMonkey, and in asm.js AOT compilation in OdinMonkey. True, we didn't want to change Typed Arrays incompatibly, so we made the best of it, but IIRC it wasn't a big deal.

Mainly, we have to face the problem of backward compatibility. If we could make all int arrays return 0 not undefined, ditto for float arrays viz. NaN, that would be even better for performance.
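
That is, hypothetically, under such a change (not what any engine ships today):

  new Int8Array(4)[100];     // would be 0 rather than undefined
  new Float32Array(4)[100];  // would be NaN rather than undefined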

Clamping is reasonable

No, it makes all negative indexes alias 0. This is bad for SFI-enforcing compilers (Emscripten, Mandreel, PNaCl).

, but not necessarily what people might expect. Throwing at least does not suffer from the type information problem that undefined does,

Right, and (Allen points out this may have been missed by the Khronos editors) typed arrays based on WebIDL index getter/setters require throwing. No browser impl that I can test does this, though.

though I'm sure it still poses issues for JITs - same 'prove all indices are in range' problem to eliminate the bounds check. I could see wrapping being an acceptable choice as well, since that's sort of 'native' semantics.

No, again -- unsafe.

Firefox appears to return undefined for out-of-range typed array elements; can someone comment on whether SpiderMonkey is easily able to achieve this and whether it hurts the JIT and/or gathering of type information? Would some other behavior be faster? Chrome appears to do this too currently.

Safari too. Differences abound: using Math.pow(2,32) or above as an index makes an expando in V8 and JSC. JSC allows named expandos (a.foo = 42). Proto-indexed properties shine through for out of bounds indexes.
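
A sketch of the kinds of differences being described (observed, engine-specific behavior, not anything spec'ed):

  var ta = new Int8Array(4);
  ta[Math.pow(2, 32)] = 1;   // V8 and JSC: creates an ordinary expando property,
                             // not an element write
  ta.foo = 42;               // JSC: named expandos are allowed
  Int8Array.prototype[7] = 9;
  ta[7];                     // 9 in engines where proto-indexed properties shine
                             // through for out-of-bounds indexes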

Working to nail these down to be as efficient and safe as possible...

People really want typed arrays to be fast, so consider that the context for my comments here. :)

Definitely.