Observability of NaN distinctions — is this a concern?

# Kevin Reid (11 years ago)

I noticed Object.is being discussed recently, and this reminded me of a concern in the definition of equality predicates: that there is more than one NaN value.

I see that the current draft (March 8, 2013) section 8.1.5 discusses this, but it says that “to ECMAScript code, all NaN values are indistinguishable from each other.”

Depending on what you mean by “ECMAScript code”, this may be false given the Typed Arrays extension, which allows direct access to the bit-patterns of float values (the Typed Arrays spec permits, but does not require, replacing a NaN value with any other NaN value on read or write).

In some browsers, namely current Safari and current Chrome (stable, not beta), there are at least two distinct observable patterns (apparently one for the NaN literal and propagation from operations on it, and one for operations on numbers that are undefined).

Is this considered a problem?

Scruffy test case:

<script>
function convert(f) {
  var buf = new ArrayBuffer(8);
  var a = new Float64Array(buf);
  var b = new Uint8Array(buf);
  a[0] = f;
  var s = "";
  for (var i = 7; i >= 0; i--) {
    // Pad each byte to two hex digits; the parentheses make slice apply to the padded string.
    s += ('0' + b[i].toString(16)).slice(-2);
  }
  return f + ' ' + s;
}
document.write(convert(0/0) + "<br>");
document.write(convert(Infinity*0) + "<br>");
document.write(convert(Infinity-Infinity) + "<br>");
document.write(convert(Math.pow(-1, 0.2)) + "<br>");
document.write(convert(Math.sqrt(-1)) + "<br>");
document.write(convert(Math.log(-1)) + "<br>");
document.write(convert(NaN) + "<br>");
document.write(convert(NaN*0) + "<br>");
document.write(convert(Math.sqrt(NaN)) + "<br>");
</script>

# Jeff Walden (11 years ago)

Negation on at least some x86-ish systems also produces another kind of NaN, because the trivial negation implementation is a sign-bit flip.

This strikes me as similar to the endianness concerns of typed arrays, except probably far less harmful in practice. I don't see what can reasonably be done about it, without effectively mandating attempting NaN-substitution whenever the value to set might be NaN. But maybe someone smarter has ideas.
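The sign-bit observation can be probed from script with a sketch like the following. The helper name `bitsOf` is made up for this illustration, and the result is engine- and platform-dependent, so no particular output is claimed.

```javascript
// Return the 64-bit pattern of a Number as a 16-digit hex string.
// `bitsOf` is a hypothetical helper, not part of any spec or library.
function bitsOf(value) {
  var view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value);               // DataView defaults to big-endian
  var hi = view.getUint32(0).toString(16); // sign, exponent, top of mantissa
  var lo = view.getUint32(4).toString(16); // rest of the mantissa
  return ("00000000" + hi).slice(-8) + ("00000000" + lo).slice(-8);
}

var positive = bitsOf(NaN);
var negated = bitsOf(-NaN);

// Whether these two strings differ (for example in the leading sign bit)
// depends on the engine and on whether it canonicalizes NaN on store.
console.log(positive, negated, positive === negated ? "same" : "different");
```

If the two strings differ only in the leading hex digit, the engine is storing a sign-flipped NaN without substituting a canonical one.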

# Allen Wirfs-Brock (11 years ago)

On Mar 20, 2013, at 11:08 AM, Jeff Walden wrote:

Negation on at least some x86-ish systems also produces another kind of NaN, because the trivial negation implementation is a sign-bit flip.

This strikes me as similar to the endianness concerns of typed arrays, except probably far less harmful in practice. I don't see what can reasonably be done about it, without effectively mandating attempting NaN-substitution whenever the value to set might be NaN. But maybe someone smarter has ideas.

It's simple: all isNaN testing and all places that perform "pointer equivalent" tests must treat all NaN values as equivalent. An implementation must either guarantee normalization of NaNs or explicitly check for non-identical NaN bit patterns.
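At the language level, the requirement described here amounts to an equality predicate that never looks at NaN payloads. A minimal sketch, written by hand for illustration (the function name `sameValue` is borrowed from the spec's abstract operation, but this body is not the spec text):

```javascript
// Hand-written SameValue-style comparison for Numbers and other primitives.
function sameValue(x, y) {
  if (x !== x && y !== y) return true;            // both NaN, whatever their bits
  if (x === 0 && y === 0) return 1 / x === 1 / y; // tell +0 from -0
  return x === y;
}

console.log(sameValue(NaN, 0 / 0));           // true
console.log(sameValue(NaN, Math.sqrt(-1)));   // true
console.log(sameValue(0, -0));                // false
```

The `x !== x` test is true only for NaN, so every NaN falls into the same equivalence class regardless of how it was produced.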

# David Bruant (11 years ago)

Le 20/03/2013 19:08, Jeff Walden a écrit :

I don't see what can reasonably be done about it, without effectively mandating attempting NaN-substitution whenever the value to set might be NaN.

+1. I think Safari needs to be fixed. Stuffing info in NaNs sounds superfluous. JS has objects, arrays, Typed Arrays, strings: more than enough legitimate mechanisms to represent information without having to add a hacky one. That would be sending the wrong signal to developers, in my opinion.

# Kevin Reid (11 years ago)

On Wed, Mar 20, 2013 at 12:10 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 20, 2013, at 10:09 AM, Kevin Reid wrote:

Depending on what you mean by “ECMAScript code”, this may be false given the Typed Arrays extension, which allows direct access to the bit-patterns of float values (the Typed Arrays spec permits, but does not require, replacing a NaN value with any other NaN value on read or write).

This is not how it is specified in the ES6 spec. See 15.13.5.1.3 steps 7.b & 8.b and 15.13.5.1.4 steps 7.a & 8.a. Normalization of NaN values is required on retrieval and permitted on stores from/to ArrayBuffers.

I see. I was reading the Khronos version and hadn't realized it was included in ES6.

The ES spec. requirement (which isn't new to ES6) still applies. If they expose observably different NaN values to any ECMAScript code they aren't conforming to the spec.

Then it seems to me that the wording of the spec, while not self-contradictory, makes it unnecessarily unobvious how to correctly implement it. Consider these two cases (which I think are exhaustive):

  1. The implementation uses exactly one bit pattern for JS values which are NaN. In this case, normalization is required on reads and is a no-op for writes.

  2. The implementation represents JS values which are NaN using arbitrary NaN bit patterns (and SameValue considers them all equal). In this case, normalization is unnecessary for reads and necessary for writes (else, as my example code shows, the difference is observable which contradicts 8.1.5).

Thus, normalization on write is either a no-op or necessary, so should be mandatory, and normalization on read is unobservable in either case, so need not be mandatory.

# Allen Wirfs-Brock (11 years ago)

On Mar 20, 2013, at 12:34 PM, Kevin Reid wrote:

On Wed, Mar 20, 2013 at 12:10 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 20, 2013, at 10:09 AM, Kevin Reid wrote:

Depending on what you mean by “ECMAScript code”, this may be false given the Typed Arrays extension, which allows direct access to the bit-patterns of float values (the Typed Arrays spec permits, but does not require, replacing a NaN value with any other NaN value on read or write).

This is not how it is specified in the ES6 spec. See 15.13.5.1.3 steps 7.b & 8.b and 15.13.5.1.4 steps 7.a & 8.a. Normalization of NaN values is required on retrieval and permitted on stores from/to ArrayBuffers.

I see. I was reading the Khronos version and hadn't realized it was included in ES6.

The ES spec. requirement (which isn't new to ES6) still applies. If they expose observably different NaN values to any ECMAScript code they aren't conforming to the spec.

Then it seems to me that the wording of the spec, while not self-contradictory, makes it unnecessarily unobvious how to correctly implement it. Consider these two cases (which I think are exhaustive):

  1. The implementation uses exactly one bit pattern for JS values which are NaN. In this case, normalization is required on reads and is a no-op for writes.

  2. The implementation represents JS values which are NaN using arbitrary NaN bit patterns (and SameValue considers them all equal). In this case, normalization is unnecessary for reads and necessary for writes (else, as my example code shows, the difference is observable which contradicts 8.1.5).

Thus, normalization on write is either a no-op or necessary, so should be mandatory, and normalization on read is unobservable in either case, so need not be mandatory.

If you're specifically talking about reading/writing TypedArray elements (really ArrayBuffers) you have to take into account the possibility that you can have different types overlaying the same buffer storage. Hence a NaN bit pattern might be written as 2 Uint32 values and then retrieved as a Float64 value. In that case, there is no Float64 write to perform the normalization so it must be done on all reads. Such normalization is especially important if object pointers are represented using NaN-boxing.
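The overlay scenario can be reproduced with a short sketch. The payload `0xdeadbeef` and the variable names are arbitrary choices for this illustration; what the final log line prints depends on the engine.

```javascript
var buf = new ArrayBuffer(8);
var words = new Uint32Array(buf);   // integer view over the same storage
var floats = new Float64Array(buf); // float view over the same storage

// Typed array views use host byte order, so find where the high word lives.
var littleEndian = new Uint8Array(new Uint16Array([1]).buffer)[0] === 1;
var hiIndex = littleEndian ? 1 : 0;
var loIndex = littleEndian ? 0 : 1;

// Build a NaN bit pattern without ever performing a Float64 write.
words[hiIndex] = 0x7ff80000; // exponent all ones, quiet bit set
words[loIndex] = 0xdeadbeef; // arbitrary payload

var readBack = floats[0];
console.log(readBack !== readBack); // true: any such pattern reads as NaN

// Store the value back through the float view and look at the raw words.
// Whether the payload survives is implementation-dependent.
floats[0] = readBack;
console.log(words[hiIndex].toString(16), words[loIndex].toString(16));
```

Because the pattern entered the buffer through integer writes, only the Float64 read gives the engine a chance to collapse it to its internal NaN.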

# Brandon Benvie (11 years ago)

On 3/20/2013 12:56 PM, Allen Wirfs-Brock wrote:

If you're specifically talking about reading/writing TypedArray elements (really ArrayBuffers) you have to take into account the possibility that you can have different types overlaying the same buffer storage. Hence a NaN bit pattern might be written as 2 Uint32 values and then retrieved as a Float64 value. In that case, there is no Float64 write to perform the normalization so it must be done on all reads. Such normalization is especially important if object pointers are represented using NaN-boxing.

Allen

Exactly, you can't control it on the write end. Writing two uint32's is precisely how you do NaN-tagging using typed arrays (since you can't write an uint64). The way to normalize is to coerce to a single canonical NaN on read, if anything.

(an example of NaN-tagging using typed arrays: gist.github.com/Benvie/5021724)

# Kevin Reid (11 years ago)

On Wed, Mar 20, 2013 at 12:56 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

If you're specifically talking about reading/writing TypedArray elements (really ArrayBuffers) you have to take into account the possibility that you can have different types overlaying the same buffer storage.

Yes, that was my original example.

Hence a NaN bit pattern might be written as 2 Uint32 values and then retrieved as a Float64 value. In that case, there is no Float64 write to perform the normalization so it must be done on all reads. Such normalization is especially important if object pointers are represented using NaN-boxing.

That normalization on read is my case 1 above — it is necessary for that implementation. A conformant implementation could use a different strategy which does not normalize on Float64 read, and this would be unobservable, so the spec should not bother to specify it.

However, lack of normalization on Float64 write is potentially observable (if the implementation does not normalize all NaNs from all sources). Therefore, I argue, the spec should specify that normalization happens on write; and it happens that an implementation can omit that as an explicit step, with no observable difference, if and only if its representation of NaN in JS values (from all possible sources, not just typed arrays) is normalized.

# Allen Wirfs-Brock (11 years ago)

On Mar 20, 2013, at 1:42 PM, Kevin Reid wrote:

That normalization on read is my case 1 above — it is necessary for that implementation. A conformant implementation could use a different strategy which does not normalize on Float64 read, and this would be unobservable, so the spec should not bother to specify it.

However, lack of normalization on Float64 write is potentially observable (if the implementation does not normalize all NaNs from all sources). Therefore, I argue, the spec should specify that normalization happens on write; and it happens that an implementation can omit that as an explicit step, with no observable difference, if and only if its representation of NaN in JS values (from all possible sources, not just typed arrays) is normalized.

The buffer contents may have come from an external source or the buffer may be accessible for writes by an agent that is not part of the ES implementation. The only things that the ES implementation has absolute control over are its own reads from a buffer and the values it propagates from those reads.

# Kevin Reid (11 years ago)

On Wed, Mar 20, 2013 at 1:57 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 20, 2013, at 1:42 PM, Kevin Reid wrote:

That normalization on read is my case 1 above — it is necessary _for that implementation_. A conformant implementation could use a different strategy which does not normalize on Float64 read, and this would be unobservable, so the spec should not bother to specify it.

However, lack of normalization on Float64 write is potentially observable (if the implementation does not normalize all NaNs from all sources). Therefore, I argue, the spec should specify that normalization happens on write; and it happens that an implementation can omit that as an explicit step, with no observable difference, if and only if its representation of NaN in JS values (from all possible sources, not just typed arrays) is normalized.

The buffer contents may have come from an external source or the buffer may be accessible for writes by an agent that is not part of the ES implementation. The only things that the ES implementation has absolute control over are its own reads from a buffer and the values it propagates from those reads.

I don't think we're disagreeing about any facts or principles (everything in your paragraph above is true), but you're thinking about implementation strategies and I'm thinking about observable behavior.

This is the important point: normalization on write or observably-equivalent behavior is implicitly mandatory because otherwise 8.1.5 may fail to hold (standard ES code can use standard ES tools to distinguish NaNs, as demonstrated by my test results — the behavior I found does not contradict the spec, to my knowledge). Therefore, the spec should not claim that it is optional.

Incidentally, I observe that normalization on read is not necessary except as an implementation strategy. It may well be that all implementations will find it expedient, but there is no need for the spec to require it, since (as 8.1.5 specifically acknowledges) an implementation may choose to let the NaN bits vary, as long as all operations on them (which includes SetValueInBuffer by my above argument) treat them identically.

# Kenneth Russell (11 years ago)

On Wed, Mar 20, 2013 at 2:24 PM, Kevin Reid <kpreid at google.com> wrote:

On Wed, Mar 20, 2013 at 1:57 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 20, 2013, at 1:42 PM, Kevin Reid wrote:

That normalization on read is my case 1 above — it is necessary for that implementation. A conformant implementation could use a different strategy which does not normalize on Float64 read, and this would be unobservable, so the spec should not bother to specify it.

However, lack of normalization on Float64 write is potentially observable (if the implementation does not normalize all NaNs from all sources). Therefore, I argue, the spec should specify that normalization happens on write; and it happens that an implementation can omit that as an explicit step, with no observable difference, if and only if its representation of NaN in JS values (from all possible sources, not just typed arrays) is normalized.

The buffer contents may have come from an external source or the buffer may be accessible for writes by an agent that is not part of the ES implementation. The only things that the ES implementation has absolute control over are its own reads from a buffer and the values it propagates from those reads.

I don't think we're disagreeing about any facts or principles (everything in your paragraph above is true), but you're thinking about implementation strategies and I'm thinking about observable behavior.

This is the important point: normalization on write or observably-equivalent behavior is implicitly mandatory because otherwise 8.1.5 may fail to hold (standard ES code can use standard ES tools to distinguish NaNs, as demonstrated by my test results — the behavior I found does not contradict the spec, to my knowledge). Therefore, the spec should not claim that it is optional.

The typed array specification in its original form deliberately avoided specifying normalization of NaNs upon writes to Float32Array and Float64Array. Doing so has no practical value and only imposes a performance hit, which is unacceptable for applications trying to reach the highest possible performance.

I hope that the ES6 integration of typed arrays will not require normalization of NaNs on write, even if other specification changes need to be made to avoid requiring it.

# Allen Wirfs-Brock (11 years ago)

On Mar 20, 2013, at 2:38 PM, Kenneth Russell wrote:

The typed array specification in its original form deliberately avoided specifying normalization of NaNs upon writes to Float32Array and Float64Array. Doing so has no practical value and only imposes a performance hit, which is unacceptable for applications trying to reach the highest possible performance.

I hope that the ES6 integration of typed arrays will not require normalization of NaNs on write, even if other specification changes need to be made to avoid requiring it.

Here is the exact language that is in the current ES6 draft for storing a Number into a buffer as a Float64 (rawValue is the value that gets stored into the buffer):

Set rawValue to the 8 bytes that are the IEEE-868-2005 binary64 format encoding of value. If isBigEndian is true, the bytes are arranged in big endian order. Otherwise, the bytes are arranged in little endian order. If value is NaN, rawValue is may be set to any implementation choosen non-signaling NaN encoding.

# Brendan Eich (11 years ago)

Kenneth Russell wrote:

I hope that the ES6 integration of typed arrays will not require normalization of NaNs on write, even if other specification changes need to be made to avoid requiring it.

What other specification changes?

JITs use nan-boxing (wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations). If a typed array user could forge a nan-boxed value, they could pwn the JITting VM.

For interop, JS requires cross-browser (VM) NaN canonicalization to avoid observably different results on different browsers.

Ergo, ES6 must specify normative handling of NaNs.

# David Herman (11 years ago)

On Mar 22, 2013, at 7:47 PM, Brendan Eich <brendan at mozilla.com> wrote:

Kenneth Russell wrote:

I hope that the ES6 integration of typed arrays will not require normalization of NaNs on write, even if other specification changes need to be made to avoid requiring it.

What other specification changes?

Ken means that we should change the part of the specification that states the invariant about multiple representations of NaN being unobservable, because he wants typed arrays to be allowed to break that invariant.

I disagree with Ken.

JITs use nan-boxing (wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations). If a typed array user could forge a nan-boxed value, they could pwn the JITting VM.

For interop, JS requires cross-browser (VM) NaN canonicalization to avoid observably different results on different browsers.

IMO this latter point is the most important: regardless of NaN-boxing, allowing different engines to non-deterministically produce one of several different bit patterns when writing a NaN into a typed array is under-specification and asking for portability bugs.

Not only that, but I believe any time a web API designer wants to break a long-standing invariant of JS, particularly one that is stated explicitly in ECMAScript, the onus is on them to argue that they have a right to break that invariant.

Ergo, ES6 must specify normative handling of NaNs.

Agreed.

# David Herman (11 years ago)

On Mar 20, 2013, at 1:57 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 20, 2013, at 1:42 PM, Kevin Reid wrote:

That normalization on read is my case 1 above — it is necessary for that implementation. A conformant implementation could use a different strategy which does not normalize on Float64 read, and this would be unobservable, so the spec should not bother to specify it.

However, lack of normalization on Float64 write is potentially observable (if the implementation does not normalize all NaNs from all sources). Therefore, I argue, the spec should specify that normalization happens on write; and it happens that an implementation can omit that as an explicit step, with no observable difference, if and only if its representation of NaN in JS values (from all possible sources, not just typed arrays) is normalized.

The buffer contents may have come from an external source or the buffer may be accessible for writes by an agent that is not part of the ES implementation. The only things that the ES implementation has absolute control over are its own reads from a buffer and the values it propagates from those reads.

I think you misunderstand what we mean here by write normalization. The goal here is not to ensure that specific bit patterns never appear in an ArrayBuffer. All possible bit patterns are and should be legal. Rather, the goal is to fully and deterministically specify that when the ES engine writes a NaN value into a Float32Array or Float64Array, a single canonical bit pattern is written into the ArrayBuffer. That is what ensures that it's impossible to distinguish different NaNs.

(As for reads, the only thing the spec needs to say is that all possible NaN patterns are read as the JS NaN value. The particular bit representation of the JS NaN value itself is an implementation-specific thing and doesn't need any particular speccing.)
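Write normalization in this sense could be modelled in script roughly as follows. This is only a user-land sketch: the function name `setFloat64Canonical` is hypothetical, and the choice of `0x7ff8000000000000` as the canonical pattern is an assumption made for the example, since the thread does not name one.

```javascript
// Assumed canonical quiet-NaN pattern for this sketch: 0x7ff8000000000000.
var CANONICAL_NAN_HI = 0x7ff80000;
var CANONICAL_NAN_LO = 0x00000000;

// Store `value` as a Float64 at `byteOffset`, but write one fixed bit
// pattern whenever the value is NaN, whatever its internal representation.
function setFloat64Canonical(view, byteOffset, value, littleEndian) {
  if (value !== value) {
    if (littleEndian) {
      view.setUint32(byteOffset, CANONICAL_NAN_LO, true);
      view.setUint32(byteOffset + 4, CANONICAL_NAN_HI, true);
    } else {
      view.setUint32(byteOffset, CANONICAL_NAN_HI, false);
      view.setUint32(byteOffset + 4, CANONICAL_NAN_LO, false);
    }
  } else {
    view.setFloat64(byteOffset, value, littleEndian);
  }
}

var view = new DataView(new ArrayBuffer(8));
setFloat64Canonical(view, 0, Math.sqrt(-1), false);
console.log(view.getUint32(0).toString(16), view.getUint32(4).toString(16));
```

Other bit patterns remain legal buffer contents; the helper only fixes what a NaN store produces, which is the distinction drawn above.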

# David Herman (11 years ago)

On Mar 20, 2013, at 2:47 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

Here is the exact language that is in the current ES6 draft for storing a Number into a buffer as a Float64 (rawValue is the value that gets stored into the buffer):

Set rawValue to the 8 bytes that are the IEEE-868-2005 binary64 format encoding of value. If isBigEndian is true, the bytes are arranged in big endian order. Otherwise, the bytes are arranged in little endian order. If value is NaN, rawValue is may be set to any implementation choosen non-signaling NaN encoding.

This is what Ken is saying he prefers (for performance), but it violates the stated language invariant of section 8.1.5.

Notice that it's exactly the language "rawValue is may (sic) be set to any implementation choosen (sic) non-signaling NaN encoding" that is non-deterministic. It doesn't actually say what bit pattern is produced, and the program may subsequently observe which bit pattern it was. That's what breaks the invariant.

# Kenneth Russell (11 years ago)

On Fri, Mar 22, 2013 at 7:47 PM, Brendan Eich <brendan at mozilla.com> wrote:

Kenneth Russell wrote:

I hope that the ES6 integration of typed arrays will not require normalization of NaNs on write, even if other specification changes need to be made to avoid requiring it.

What other specification changes?

JITs use nan-boxing (wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations). If a typed array user could forge a nan-boxed value, they could pwn the JITting VM.

As has been pointed out before, such JavaScript implementations must already normalize NaN values read from Float32Array and Float64Array. This is sufficient to protect against forging of NaN-boxed values.

For interop, JS requires cross-browser (VM) NaN canonicalization to avoid observably different results on different browsers.

As long as NaN values loaded from Float32Array and Float64Array obey ES6's semantics, such as returning true from isNaN(), then it should not matter whether the ES6 implementation happens to support multiple bit patterns for NaN. NaNs are an error condition and corner case in floating-point algorithms, and it is important not to negatively impact the performance of the common case in order to achieve absolutely precise semantics for this error case.

# Kenneth Russell (11 years ago)

On Fri, Mar 22, 2013 at 10:34 PM, David Herman <dherman at mozilla.com> wrote:

On Mar 22, 2013, at 7:47 PM, Brendan Eich <brendan at mozilla.com> wrote:

Kenneth Russell wrote:

I hope that the ES6 integration of typed arrays will not require normalization of NaNs on write, even if other specification changes need to be made to avoid requiring it.

What other specification changes?

Ken means that we should change the part of the specification that states the invariant about multiple representations of NaN being unobservable, because he wants typed arrays to be allowed to break that invariant.

To be clear, my intent is to eliminate possible performance bottlenecks that would prevent ES6 from efficiently implementing numerical algorithms.

This was a design goal of the typed array specification from day one, and one of the reasons the specification has achieved a measure of success. It would be short-sighted to discard this property without careful consideration while integrating typed arrays into the ES6 specification.

I disagree with Ken.

JITs use nan-boxing (wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations). If a typed array user could forge a nan-boxed value, they could pwn the JITting VM.

For interop, JS requires cross-browser (VM) NaN canonicalization to avoid observably different results on different browsers.

IMO this latter point is the most important: regardless of NaN-boxing, allowing different engines to non-deterministically produce one of several different bit patterns when writing a NaN into a typed array is under-specification and asking for portability bugs.

Realistically, no developer is going to store a NaN into a Float32Array or Float64Array and expect that a certain set of bytes were produced, verifiable using another view like Uint8Array. This concern was raised by hixie in the context of WebSockets some time ago, and I deliberately resisted making changes to the typed array specification in order to avoid adding a mandatory test-and-branch, or conditional move, in the setter of Float32Array and Float64Array. In the interim time, this issue has never been raised as a portability concern by real world developers.

Not only that, but I believe any time a web API designer wants to break a long-standing invariant of JS, particularly one that is stated explicitly in ECMAScript, the onus is on them to argue that they have a right to break that invariant.

This issue shouldn't be decided by someone arguing over the "right" to break an ES6 invariant. It should be decided by creating a benchmark of storing large amounts of data into a Float32Array or Float64Array, and running the benchmark on two versions of the same JIT, one with NaN canonicalization on write, and one without. If the performance loss from NaN canonicalization on write is large, then in my opinion, the ES6 spec should be changed to not require it.
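A rough user-land approximation of such a benchmark is sketched below. It is not the experiment proposed above, which needs two builds of the same JIT; it only adds a test-and-branch in script around each store. All names, sizes, and round counts are arbitrary assumptions for the illustration.

```javascript
// Plain stores: whatever the engine does on a Float64Array write.
function fillPlain(target, source) {
  for (var i = 0; i < target.length; i++) {
    target[i] = source[i % source.length];
  }
}

// Stores with an explicit NaN check before each write.
function fillCanonical(target, source) {
  for (var i = 0; i < target.length; i++) {
    var v = source[i % source.length];
    target[i] = v !== v ? NaN : v;
  }
}

// Wall-clock time in milliseconds for `rounds` calls of `fn`.
function timeIt(fn, target, source, rounds) {
  var start = Date.now();
  for (var r = 0; r < rounds; r++) fn(target, source);
  return Date.now() - start;
}

var source = [1.5, -2.25, 0, 1e300, 0 / 0]; // mostly ordinary values, one NaN
var target = new Float64Array(1 << 16);
console.log("plain:", timeIt(fillPlain, target, source, 20), "ms");
console.log("canonical:", timeIt(fillCanonical, target, source, 20), "ms");
```

Timings from a loop like this are noisy and reflect the cost of the script-level branch rather than an in-engine conditional move, so they are at best a coarse indication.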

# Brendan Eich (11 years ago)

Kenneth Russell wrote:

For interop, JS requires cross-browser (VM) NaN canonicalization to avoid observably different results on different browsers.

As long as NaN values loaded from Float32Array and Float64Array obey ES6's semantics, such as returning true from isNaN(), then it should not matter whether the ES6 implementation happens to support multiple bit patterns for NaN.

That's at issue. By reading via an aliasing byte or wider integral-typed array view, different results could be observed on two VMs for the exact same code.

NaNs are an error condition and corner case in floating-point algorithms, and it is important not to negatively impact the performance of the common case in order to achieve absolutely precise semantics for this error case.

I want "as close to the metal as safety and interop allow" performance. I'm not sure we are actually in conflict here, given how aggressive JITs work.

Have you measured the performance impact in any optimizing VM?

# Kenneth Russell (11 years ago)

On Mon, Mar 25, 2013 at 2:40 PM, Brendan Eich <brendan at mozilla.com> wrote:

Kenneth Russell wrote:

For interop, JS requires cross-browser (VM) NaN canonicalization to avoid observably different results on different browsers.

As long as NaN values loaded from Float32Array and Float64Array obey ES6's semantics, such as returning true from isNaN(), then it should not matter whether the ES6 implementation happens to support multiple bit patterns for NaN.

That's at issue. By reading via an aliasing byte or wider integral-typed array view, different results could be observed on two VMs for the exact same code.

NaNs are an error condition and corner case in floating-point algorithms, and it is important not to negatively impact the performance of the common case in order to achieve absolutely precise semantics for this error case.

I want "as close to the metal as safety and interop allow" performance. I'm not sure we are actually in conflict here, given how aggressive JITs work.

Have you measured the performance impact in any optimizing VM?

No. I won't have time to do so myself, but would be interested in working with anybody who does to measure the performance difference.

# Allen Wirfs-Brock (11 years ago)

On Mar 25, 2013, at 2:40 PM, Brendan Eich wrote:

Kenneth Russell wrote:

For interop, JS requires cross-browser (VM) NaN canonicalization to avoid observably different results on different browsers.

As long as NaN values loaded from Float32Array and Float64Array obey ES6's semantics, such as returning true from isNaN(), then it should not matter whether the ES6 implementation happens to support multiple bit patterns for NaN.

That's at issue. By reading via an aliasing byte or wider integral-typed array view, different results could be observed on two VMs for the exact same code.

Note that the ES5.1 spec. says: "In some implementations, external code might be able to detect a difference between various Not-a-Number values, but such behaviour is implementation-dependent; to ECMAScript code, all NaN values are indistinguishable from each other."

This language (with only minor editorial variation) has been in the spec. since ES1. It presumably means, among other things, that host functions might observe different NaN encodings being passed to them as arguments (i.e., ES has never required NaN canonicalization when passing number values to non-ES functions). Presumably such a function could write such a parameter value into a Float64Array element that is returned to ES code, where it is observed via a Uint8Array overlay. Has this ever caused a problem that anybody is aware of?

BTW, isn't canonicalization of endian-ness for both integers and floats a bigger interop issue than NaN canonicalization? I know this was discussed in the past, but it doesn't seem to be covered in the latest Khronos spec. Was there ever a resolution as to whether or not TypedArray [[Set]] operations need to use a canonical endian-ness?

# Brendan Eich (11 years ago)

Allen Wirfs-Brock wrote:

BTW, isn't canonicalization of endian-ness for both integers and floats a bigger interop issue than NaN canonicalization? I know this was discussed in the past, but it doesn't seem to be covered in the latest Khronos spec. Was there ever a resolution as to whether or not TypedArray [[Set]] operations need to use a canonical endian-ness?

Search for "byte order" at www.khronos.org/registry/typedarray/specs/latest.

# Allen Wirfs-Brock (11 years ago)

On Mar 25, 2013, at 4:05 PM, Brendan Eich wrote:

Allen Wirfs-Brock wrote:

BTW, isn't canonicalization of endian-ness for both integers and floats a bigger interop issue than NaN canonicalization? I know this was discussed in the past, but it doesn't seem to be covered in the latest Khronos spec. Was there ever a resolution as to whether or not TypedArray [[Set]] operations need to use a canonical endian-ness?

Search for "byte order" at www.khronos.org/registry/typedarray/specs/latest.

I had already searched for "endian" with similar results. It says that the default for DataView gets/sets that do not specify a byte order is big-endian. It doesn't say anything (that I can find) about such accesses on TypedArray gets/sets.

# Brendan Eich (11 years ago)

Allen Wirfs-Brock wrote:

On Mar 25, 2013, at 4:05 PM, Brendan Eich wrote:

Allen Wirfs-Brock wrote:

BTW, isn't canonicalization of endian-ness for both integers and floats a bigger interop issue than NaN canonicalization? I know this was discussed in the past, but it doesn't seem to be covered in the latest Khronos spec. Was there ever a resolution as to whether or not TypedArray [[Set]] operations need to use a canonical endian-ness?

Search for "byte order" at www.khronos.org/registry/typedarray/specs/latest.

I had already searched for "endian" with similar results. It says that the default for DataView gets/sets that do not specify a byte order is big-endian. It doesn't say anything (that I can find) about such accesses on TypedArray gets/sets.

Oh, odd -- I recall that it used to say little-endian. Typed arrays are LE to match dominant architectures, while DataViews are BE to match packed serialization use-cases.

Ken, did something get edited out?

# Kenneth Russell (11 years ago)

On Mon, Mar 25, 2013 at 4:23 PM, Brendan Eich <brendan at mozilla.com> wrote:

Allen Wirfs-Brock wrote:

On Mar 25, 2013, at 4:05 PM, Brendan Eich wrote:

Allen Wirfs-Brock wrote:

BTW, isn't canonicalization of endian-ness for both integers and floats a bigger interop issue than NaN canonicalization? I know this was discussed in the past, but it doesn't seem to be covered in the latest Khronos spec. Was there ever a resolution as to whether or not TypedArray [[Set]] operations need to use a canonical endian-ness?

Search for "byte order" at www.khronos.org/registry/typedarray/specs/latest.

I had already searched for "endian" with similar results. It says that the default for DataView gets/sets that do not specify a byte order is big-endian. It doesn't say anything (that I can find) about such accesses on TypedArray gets/sets.

Oh, odd -- I recall that it used to say little-endian. Typed arrays are LE to match dominant architectures, while DataViews are BE to match packed serialization use-cases.

Ken, did something get edited out?

No. The typed array views (everything except DataView) have used the host machine's endianness from day one by design -- although the typed array spec does not state this explicitly. If desired, text can be added to the specification to this effect. Any change in this behavior will destroy the performance of APIs like WebGL and Web Audio on big-endian architectures.

Correctly written code works identically on big-endian and little-endian architectures. See www.html5rocks.com/en/tutorials/webgl/typed_arrays for a detailed description of the usage of the APIs.

DataView, which is designed for input/output, operates on data with a specified endianness.
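The distinction described above can be sketched in a few lines. This is only an illustration, not text from the typed array spec; the helper names `hostIsLittleEndian` and `writeUint16BE` are made up for this example.

```javascript
// Hypothetical helper: report the host byte order as seen through typed array views.
function hostIsLittleEndian() {
  var buf = new ArrayBuffer(2);
  new Uint16Array(buf)[0] = 0x0102;        // stored in host byte order
  return new Uint8Array(buf)[0] === 0x02;  // low byte first means little-endian
}

// Hypothetical helper: DataView takes an explicit byte-order flag,
// so the resulting byte layout does not depend on the host.
function writeUint16BE(value) {
  var buf = new ArrayBuffer(2);
  new DataView(buf).setUint16(0, value, false); // false selects big-endian
  return Array.prototype.slice.call(new Uint8Array(buf));
}

console.log(hostIsLittleEndian());   // depends on the machine
console.log(writeUint16BE(0x0102));  // [ 1, 2 ] on any host
```

Code that only reads and writes through a single view type never sees the difference; it shows up only when two views of different element sizes alias the same buffer.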

# Brendan Eich (11 years ago)

Right, thanks for the reminder. It all comes back now, including the "how to write correct endian-independent typed array code" bit.

# Allen Wirfs-Brock (11 years ago)

On Mar 25, 2013, at 6:00 PM, Brendan Eich wrote:

Right, thanks for the reminder. It all comes back now, including the "how to write correct endian-independent typed array code" bit.

Ok, so looping back to my earlier observation. It sounds like endian-ness can be observed by writing into a Float64Array element and then reading back from a Uint8Array that is backed by the same buffer. If there is agreement that this doesn't represent a significant interoperability hazard, can we also agree that not doing NaN canonicalization on writes to FloatXArray is an even less significant hazard and need not be mandated?

# Bjoern Hoehrmann (11 years ago)
Kenneth Russell wrote:

No. The typed array views (everything except DataView) have used the host machine's endianness from day one by design -- although the typed array spec does not state this explicitly. If desired, text can be added to the specification to this effect.

That seems to be called for.

Thanks,

# Oliver Hunt (11 years ago)

On Mar 26, 2013, at 2:35 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 25, 2013, at 6:00 PM, Brendan Eich wrote:

Right, thanks for the reminder. It all comes back now, including the "how to write correct endian-independent typed array code" bit.

Ok, so looping back to my earlier observation. It sounds like endian-ness can be observed by writing into a Float64Array element and then reading back from a Uint8Array that is backed by the same buffer. If there is agreement that this doesn't represent a significant interoperability hazard, can we also agree that not doing NaN canonicalization on writes to FloatXArray is an even less significant hazard and need not be mandated?

The reason I have pushed for NaN canonicalization is because it means that it allows (and in essence requires) that typedArray[n] = typedArray[n] can modify the bit value of typedArray[n].

An implementation may be able to optimize the checks away in some cases, but most engines must perform checks on read unless they can prove that they were the original source of the value being read.

Forcing canonicalization simply means that you are guaranteed that a certain behavior will occur, and so won't be bitten by some tests changing behavior that you may have seen during testing. I know Ken hates these sorts of things, but seriously in the absence of a real benchmark, that shows catastrophic performance degradation due to this, simply saying "this is extra work that will burn cpu cycles" without evidence is a waste of time. Also there is absolutely no case in which abstract performance concerns should ever outweigh absolute security and correctness bugs.

We need to stop raising "this causes performance problems" type issues without a concrete example of that problem. I remember having to work very hard to stop WebGL from being a gaping security hole in the first place and it's disappointing to see these same issues being re-raised in a different forum to try and get them bypassed here.

# Jussi Kalliokoski (11 years ago)

On Tue, Mar 26, 2013 at 4:16 AM, Oliver Hunt <oliver at apple.com> wrote:

On Mar 26, 2013, at 2:35 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 25, 2013, at 6:00 PM, Brendan Eich wrote:

Right, thanks for the reminder. It all comes back now, including the "how to write correct endian-independent typed array code" bit.

Ok, so looping back to my earlier observation. It sounds like endian-ness can be observed by writing into a Float64Array element and then reading back from a Uint8Array that is backed by the same buffer. If there is agreement that this doesn't represent a significant interoperability hazard, can we also agree that not doing NaN canonicalization on writes to FloatXArray is an even less significant hazard and need not be mandated?

The reason I have pushed for NaN canonicalization is because it means that it allows (and in essence requires) that typedArray[n] = typedArray[n] can modify the bit value of typedArray[n].

An implementation may be able to optimize the checks away in some cases, but most engines must perform checks on read unless they can prove that they were the original source of the value being read.

Forcing canonicalization simply means that you are guaranteed that a certain behavior will occur, and so won't be bitten by some tests changing behavior that you may have seen during testing. I know Ken hates these sorts of things, but seriously in the absence of a real benchmark, that shows catastrophic performance degradation due to this, simply saying "this is extra work that will burn cpu cycles" without evidence is a waste of time.

I also disagree with you.

Also there is absolutely no case in which abstract performance concerns should ever outweigh absolute security and correctness bugs.

Could you elaborate on the security part? I doubt NaN distinctions can really be of any significant use for fingerprinting, etc.

So far I've yet to come across any unexpected bugs caused by this, maybe you have examples? NaN is usually a non-desired value so if you write a NaN you probably had a bug in the first place.

And about correctness, by definition NaN is a category, not a value; by definition a NaN value is not the same as another NaN value. If you want to canonicalize NaN, my suggestion is IEEE, not ES-discuss. ;)

We need to stop raising "this causes performance problems" type issues without a concrete example of that problem. I remember having to work very hard to stop WebGL from being a gaping security hole in the first place and it's disappointing to see these same issues being re-raised in a different forum to try and get them bypassed here.

Before saying security hole, please elaborate. Also, when it comes to standards, I think change should be justified with data, rather than the other way around.

# Mark S. Miller (11 years ago)

On Tue, Mar 26, 2013 at 6:40 AM, Jussi Kalliokoski < jussi.kalliokoski at gmail.com> wrote:

On Tue, Mar 26, 2013 at 4:16 AM, Oliver Hunt <oliver at apple.com> wrote:

On Mar 26, 2013, at 2:35 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 25, 2013, at 6:00 PM, Brendan Eich wrote:

Right, thanks for the reminder. It all comes back now, including the "how to write correct endian-independent typed array code" bit.

Ok, so looping back to my earlier observation. It sounds like endian-ness can be observed by writing into a Float64Array element and then reading back from a Uint8Array that is backed by the same buffer. If there is agreement that this doesn't represent a significant interoperability hazard, can we also agree that not doing NaN canonicalization on writes to FloatXArray is an even less significant hazard and need not be mandated?

The reason I have pushed for NaN canonicalization is because it means that it allows (and in essence requires) that typedArray[n] = typedArray[n] can modify the bit value of typedArray[n].

An implementation may be able to optimize the checks away in some cases, but most engines must perform checks on read unless they can prove that they were the original source of the value being read.

Forcing canonicalization simply means that you are guaranteed that a certain behavior will occur, and so won't be bitten by some tests changing behavior that you may have seen during testing. I know Ken hates these sorts of things, but seriously in the absence of a real benchmark, that shows catastrophic performance degradation due to this, simply saying "this is extra work that will burn cpu cycles" without evidence is a waste of time.

I also disagree with you.

Also there is absolutely no case in which abstract performance concerns should ever outweigh absolute security and correctness bugs.

Could you elaborate on the security part? I doubt NaN distinctions can really be of any significant use for fingerprinting, etc.

In SES, Alice says:

var bob = confinedEval(bobSrc);
var carol = confinedEval(carolSrc);

// At this point, bob and carol should be unable to communicate with
// each other, and are in fact completely isolated from each other
// except that Alice holds a reference to both.
// See <http://www.youtube.com/watch?v=w9hHHvhZ_HY> start
// at about 44 minutes in.

var shouldBeImmutable = Object.freeze(Object.create(null, {foo: {value: NaN}}));

bob(shouldBeImmutable);
carol(shouldBeImmutable);

// Alice, by sharing this object with bob and carol, should still be able
// to assume that they are isolated from each other

Bob says:

var FunnyNaN = // expression creating NaN with non-canonical internal rep
               // on this platform, perhaps created by doing funny typed array tricks

if (wantToCommunicate1bitToCarol) {
  Object.defineProperty(shouldBeImmutable, 'foo', {value: FunnyNaN});
  // The [[DefineProperty]] algorithm is allowed to overwrite shouldBeImmutable.foo
  // with FunnyNaN, since it passes the SameValue check.

Carol says:

if (isNaNFunny(shouldBeImmutable.foo)) {
  // where isNaNFunny uses typed array tricks to detect whether its argument has
  // a non-canonical rep on this platform

So far I've yet to come across any unexpected bugs caused by this, maybe you have examples? NaN is usually a non-desired value so if you write a NaN you probably had a bug in the first place.

And about correctness, by definition NaN is a category, not a value; by definition a NaN value is not the same as another NaN value. If you want to canonicalize NaN, my suggestion is IEEE, not ES-discuss. ;)

You're confusing IEEE NaN with ES NaN. In ES, NaN is a value, not a bit pattern. In IEEE, NaN is a family of bit patterns. Typed arrays make us face the issue of what IEEE NaN bit pattern an ES NaN value converts to.

We need to stop raising "this causes performance problems" type issues without a concrete example of that problem. I remember having to work very hard to stop WebGL from being a gaping security hole in the first place and it's disappointing to see these same issues being re-raised in a different forum to try and get them bypassed here.

Before saying security hole, please elaborate. Also, when it comes to standards, I think change should be justified with data, rather than the other way around.

Done.

# Jussi Kalliokoski (11 years ago)

On Tue, Mar 26, 2013 at 9:54 AM, Mark S. Miller <erights at google.com> wrote:

On Tue, Mar 26, 2013 at 6:40 AM, Jussi Kalliokoski < jussi.kalliokoski at gmail.com> wrote:

On Tue, Mar 26, 2013 at 4:16 AM, Oliver Hunt <oliver at apple.com> wrote:

On Mar 26, 2013, at 2:35 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Mar 25, 2013, at 6:00 PM, Brendan Eich wrote:

Right, thanks for the reminder. It all comes back now, including the "how to write correct endian-independent typed array code" bit.

Ok, so looping back to my earlier observation. It sounds like endian-ness can be observed by writing into a Float64Array element and then reading back from a Uint8Array that is backed by the same buffer. If there is agreement that this doesn't represent a significant interoperability hazard, can we also agree that not doing NaN canonicalization on writes to FloatXArray is an even less significant hazard and need not be mandated?

The reason I have pushed for NaN canonicalization is because it means that it allows (and in essence requires) that typedArray[n] = typedArray[n] can modify the bit value of typedArray[n].

An implementation may be able to optimize the checks away in some cases, but most engines must perform checks on read unless they can prove that they were the original source of the value being read.

Forcing canonicalization simply means that you are guaranteed that a certain behavior will occur, and so won't be bitten by some tests changing behavior that you may have seen during testing. I know Ken hates these sorts of things, but seriously in the absence of a real benchmark, that shows catastrophic performance degradation due to this, simply saying "this is extra work that will burn cpu cycles" without evidence is a waste of time.

I also disagree with you.

Also there is absolutely no case in which abstract performance concerns should ever outweigh absolute security and correctness bugs.

Could you elaborate on the security part? I doubt NaN distinctions can really be of any significant use for fingerprinting, etc.

In SES, Alice says:

var bob = confinedEval(bobSrc);
var carol = confinedEval(carolSrc);

// At this point, bob and carol should be unable to communicate with
// each other, and are in fact completely isolated from each other
// except that Alice holds a reference to both.
// See <http://www.youtube.com/watch?v=w9hHHvhZ_HY> start
// at about 44 minutes in.

var shouldBeImmutable = Object.freeze(Object.create(null, {foo: {value: NaN}}));

bob(shouldBeImmutable);
carol(shouldBeImmutable);

// Alice, by sharing this object with bob and carol, should still be able
// to assume that they are isolated from each other

Bob says:

var FunnyNaN = // expression creating NaN with non-canonical internal rep
               // on this platform, perhaps created by doing funny typed array tricks

if (wantToCommunicate1bitToCarol) {
  Object.defineProperty(shouldBeImmutable, 'foo', {value: FunnyNaN});
  // The [[DefineProperty]] algorithm is allowed to overwrite shouldBeImmutable.foo
  // with FunnyNaN, since it passes the SameValue check.

Carol says:

if (isNaNFunny(shouldBeImmutable.foo)) {
  // where isNaNFunny uses typed array tricks to detect whether its argument has
  // a non-canonical rep on this platform

The NaN distinction is only observable in the byte array, not if you extract the value, because at that point it becomes an ES NaN value, so that example is invalid.

So far I've yet to come across any unexpected bugs caused by this, maybe you have examples? NaN is usually a non-desired value so if you write a NaN you probably had a bug in the first place.

And about correctness, by definition NaN is a category, not a value; by definition a NaN value is not the same as another NaN value. If you want to canonicalize NaN, my suggestion is IEEE, not ES-discuss. ;)

You're confusing IEEE NaN with ES NaN. In ES, NaN is a value, not a bit pattern. In IEEE, NaN is a family of bit patterns. Typed arrays make us face the issue of what IEEE NaN bit pattern an ES NaN value converts to.

That's just because ES has had no notion of bits for floating points before. Other than that, ES NaN works like IEEE NaN, e.g.

0/0 === NaN // false
isNaN(0/0) // true

We need to stop raising "this causes performance problems" type issues without a concrete example of that problem. I remember having to work very hard to stop WebGL from being a gaping security hole in the first place and it's disappointing to see these same issues being re-raised in a different forum to try and get them bypassed here.

Before saying security hole, please elaborate. Also, when it comes to standards, I think change should be justified with data, rather than the other way around.

Done.

You'll have to do better than that. ;)

# Oliver Hunt (11 years ago)

On Mar 26, 2013, at 9:12 PM, Jussi Kalliokoski <jussi.kalliokoski at gmail.com> wrote:

That's just because ES has had no notion of bits for floating points before. Other than that, ES NaN works like IEEE NaN, e.g.

0/0 === NaN // false
isNaN(0/0) // true

That's true in any language - comparing to NaN is almost always defined explicitly as producing false. You're not looking at bit patterns here, so conflating NaN comparisons with bit values is kind of pointless.

We need to stop raising "this causes performance problems" type issues without a concrete example of that problem. I remember having to work very hard to stop WebGL from being a gaping security hole in the first place and it's disappointing to see these same issues being re-raised in a different forum to try and get them bypassed here.

Before saying security hole, please elaborate. Also, when it comes to standards, I think change should be justified with data, rather than the other way around.

Done.

You'll have to do better than that. ;)

Ok, I'll try to go over this again, because for whatever reason it doesn't appear to stick:

If you have a double-typed array, and access a member: typedArray[0]

Then in ES it is a double that can be one of these values: +Infinity, -Infinity, NaN, or a discrete value representable in IEEE double spec. There are no signaling NaNs, nor is there any exposure of what the underlying bit pattern of the NaN is.

So the runtime loads this double, and then stores it somewhere, anywhere, it doesn't matter where, eg. var tmp = typedArray[0];

Now you store it: typedArray[whatever] = tmp;

The specification must allow a bitwise comparison of typedArray[whatever] to typedArray[0] to return false, as it is not possible for any NaN-boxing engine to maintain the bit equality that you would otherwise desire, as that would be trivially exploitable. When I say security and correctness I mean it in the "can't be remotely pwned" sense.

Given that we can't guarantee that the bit pattern will remain unchanged the spec should mandate normalizing to the non-signalling NaN.
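The load-then-store sequence described above can be tried directly. The following is a rough sketch, not code from the thread: the function name `nanRoundTrip` and the payload `0x5dc` are made up, and `BigUint64Array` is used only as a convenient way to see all 64 bits at once. Whether the payload survives is implementation-dependent, so no result is assumed for `preserved`.

```javascript
// Hypothetical helper: does a NaN payload survive load -> JS value -> store?
function nanRoundTrip() {
  var buf = new ArrayBuffer(16);
  var f64 = new Float64Array(buf);
  var bits = new BigUint64Array(buf); // same bytes, same host byte order

  bits[0] = 0x7ff80000000005dcn; // quiet NaN carrying an arbitrary payload
  var tmp = f64[0];              // load: becomes the single ES NaN value
  f64[1] = tmp;                  // store: the engine may pick any NaN encoding

  return {
    isNaN: tmp !== tmp,
    before: bits[0].toString(16),
    after: bits[1].toString(16),
    preserved: bits[0] === bits[1]  // implementation-dependent
  };
}

console.log(nanRoundTrip());
```

An engine that canonicalizes on load or store would report `preserved: false`; one that copies the bits through untouched would report `true`.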

# Andreas Rossberg (11 years ago)

I'm with Dave and Oliver. Worrying about the performance cost of NaN normalisation has a strong smell of premature micro optimisation. The one branch per float store, whose cost should be amortised by branch prediction when frequent, seems very unlikely to be measurable compared to everything else going on when executing JavaScript.

As for the "this is not a common problem" argument, use cases should only trump principles if the cost is unbearable.

# Hudson, Rick (11 years ago)

<jussi.kalliokoski at gmail.com> wrote:

That's just because ES has had no notion of bits for floating points before. Other than that, ES NaN works like IEEE NaN, e.g.

Some background about the latest GLSL spec and IEEE NaNs from www.opengl.org/registry/doc/GLSLangSpec.4.20.11.clean.pdf

Section 4.1.4 states "As an input value to one of the processing units, a single-precision or double-precision floating-point variable is expected to match the corresponding IEEE 754 floating-point definition for precision and dynamic range. Floating-point variables within a shader are also encoded according to the IEEE 754 specification for single-precision floating-point values (logically, not necessarily physically)."

Section 4.7.1 goes on to provide a detailed list of reduced (from IEEE 754) precision requirements for single precision floats and then states the following: "The precision of double-precision operations is at least that of single precision."

4.7.1 also sets up the requirements of operations to produce NaNs as follows: " NaNs are not required to be generated. Support for signaling NaNs is not required and exceptions are never raised. Operations and built-in functions that operate on a NaN are not required to return a NaN as the result."

So any statement that the GLSL spec follows IEEE 754 treatment of double precision is only a statement about input values.

# Sam Tobin-Hochstadt (11 years ago)

On Tue, Mar 26, 2013 at 3:54 AM, Mark S. Miller <erights at google.com> wrote:

// The [[DefineProperty]] algorithm is allowed to overwrite

shouldBeImmutable.foo

While I agree with you on the larger point (and this issue reoccurs with proxies), the actual problem here is that [[DefineProperty]] should just do nothing on immutable data. The idea of mutating immutable data as long as it's the same value is crazy.

# Brandon Benvie (11 years ago)

On 3/26/2013 1:12 AM, Jussi Kalliokoski wrote:

The NaN distinction is only observable in the byte array, not if you extract the value, because at that point it becomes an ES NaN value, so that example is invalid.

It becomes observable on the read end by doing:

 float64array[0] = shouldBeImmutable.foo;
 new Uint32Array(float64array.buffer)[0]; // or [1] depending on endianness

...unless you canonicalize the NaN on either the read or the write. This is pretty damning.

# Brandon Benvie (11 years ago)

On 3/26/2013 10:03 AM, Brandon Benvie wrote:

On 3/26/2013 1:12 AM, Jussi Kalliokoski wrote:

The NaN distinction is only observable in the byte array, not if you extract the value, because at that point it becomes an ES NaN value, so that example is invalid.

It becomes observable on the read end by doing:

float64array[0] = shouldBeImmutable.foo;
new Uint32Array(float64array.buffer)[0]; // or [1] depending on endianness

...unless you canonicalize the NaN on either the read or the write. This is pretty damning.

It does appear that in practice (at least in V8 and SpiderMonkey), the attempt to write a different NaN does not succeed. It still is possible to use a packed NaN to deliver information as a kind of secret communication channel. Demonstration:

var buff = new ArrayBuffer(8);
var f64 = new Float64Array(buff);
var ui32 = new Uint32Array(buff);
ui32[1] = 0xfff80000; // may need to be 1 depending on endianness

function packNaN(value){
  ui32[1] = value;
  return f64[1]; // may need to be 0 depending on endianness
}

function unpackNaN(nan){
  f64[0] = nan;
  return ui32[0]; // may need to be 1 depending on endianness
}

var shouldBeImmutable = Object.freeze({ foo: packNaN(1500) });
console.log(unpackNaN(shouldBeImmutable.foo)); // 1500
// the following silently fails in V8 and throws in SpiderMonkey
Object.defineProperty(shouldBeImmutable, 'foo', { value: packNaN(2000) });

# Brandon Benvie (11 years ago)

On 3/26/2013 10:39 AM, Brandon Benvie wrote:

On 3/26/2013 10:03 AM, Brandon Benvie wrote:

On 3/26/2013 1:12 AM, Jussi Kalliokoski wrote:

The NaN distinction is only observable in the byte array, not if you extract the value, because at that point it becomes an ES NaN value, so that example is invalid.

It becomes observable on the read end by doing:

float64array[0] = shouldBeImmutable.foo;
new Uint32Array(float64array.buffer)[0]; // or [1] depending on endianness

...unless you canonicalize the NaN on either the read or the write. This is pretty damning.

It does appear that in practice (at least in V8 and SpiderMonkey), the attempt to write a different NaN does not succeed. It still is possible to use a packed NaN to deliver information as a kind of secret communication channel. Demonstration: ... snip code ...

Sorry, I screwed this up here trying to figure out why NaN packing is not working in SpiderMonkey but is in V8. Curiously different results.

var buff = new ArrayBuffer(8);
var f64 = new Float64Array(buff);
var ui32 = new Uint32Array(buff);
ui32[1] = 0xfff80000; // may need to be 0 depending on endianness

function packNaN(value){
  ui32[0] = value; // may need to be 1 depending on endianness
  return f64[0];
}

function unpackNaN(nan){
  f64[0] = nan;
  return ui32[0]; // may need to be 1 depending on endianness
}

var shouldBeImmutable = Object.freeze({ foo: packNaN(1500) });
console.log(shouldBeImmutable.foo) // NaN
console.log(unpackNaN(shouldBeImmutable.foo)); // 1500 in V8, 0 in SM
Object.defineProperty(shouldBeImmutable, 'foo', { value: packNaN(2000) });
console.log(unpackNaN(shouldBeImmutable.foo)); // 1500 in V8, 0 in SM

# Brendan Eich (11 years ago)

No mystery here: SpiderMonkey canonicalizes all NaN bit patterns to 0x7FF8000000000000LL.
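For illustration, a store that always writes that one bit pattern could look roughly like the sketch below. This is not SpiderMonkey's actual code (which canonicalizes inside the engine); `storeFloat64Canonical` is a hypothetical helper written against the standard `DataView` API, and it uses the same single NaN check per store that the thread discusses.

```javascript
// Hypothetical sketch of a canonicalizing store; not taken from any engine.
function storeFloat64Canonical(view, byteOffset, value, littleEndian) {
  if (value !== value) { // only NaN is unequal to itself
    // Write 0x7FF8000000000000 explicitly instead of whatever bits the NaN carries.
    view.setBigUint64(byteOffset, 0x7ff8000000000000n, littleEndian);
  } else {
    view.setFloat64(byteOffset, value, littleEndian);
  }
}

var view = new DataView(new ArrayBuffer(8));
storeFloat64Canonical(view, 0, 0 / 0, false);
console.log(view.getBigUint64(0, false).toString(16)); // 7ff8000000000000
```

Non-NaN values take the ordinary `setFloat64` path, so the only extra work per store is the self-comparison.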

# Brendan Eich (11 years ago)

Oliver Hunt wrote:

Given that we can't guarantee that the bit pattern will remain unchanged the spec should mandate normalizing to the non-signalling NaN.

This is where Kenneth parts company, arguing for safety (no pwnage, no ability to forge a NaN in engines that nanbox) but against interoperability on this edge case. The two are separable concerns: safety and interop.

# Allen Wirfs-Brock (11 years ago)

On Mar 26, 2013, at 1:29 AM, Oliver Hunt wrote:

Ok, I'll try to go over this again, because for whatever reason it doesn't appear to stick:

If you have a double-typed array, and access a member: typedArray[0]

Then in ES it is a double that can be one of these values: +Infinity, -Infinity, NaN, or a discrete value representable in IEEE double spec. There are no signaling NaNs, nor is there any exposure of what the underlying bit pattern of the NaN is.

So the runtime loads this double, and then stores it somewhere, anywhere, it doesn't matter where, eg. var tmp = typedArray[0];

Now you store it: typedArray[whatever] = tmp;

The specification must allow a bitwise comparison of typedArray[whatever] to typedArray[0] to return false, as it is not possible for any NaN-boxing engine to maintain the bit equality that you would otherwise desire, as that would be trivially exploitable. When I say security and correctness I mean it in the "can't be remotely pwned" sense.

Given that we can't guarantee that the bit pattern will remain unchanged the spec should mandate normalizing to the non-signalling NaN.

--Oliver

Oliver,

Let's look at actual ES6 spec. language and see if there is actually anything you would like to see changed. The encoding of NaNs is only discussed in three places:

Section 8.1.5, first paragraph says:

The Number type has exactly 18437736874454810627 (that is, 2^64 − 2^53 + 3) values, representing the double-precision 64-bit format IEEE 754 values as specified in the IEEE Standard for Binary Floating-Point Arithmetic, except that the 9007199254740990 (that is, 2^53 − 2) distinct “Not-a-Number” values of the IEEE Standard are represented in ECMAScript as a single special NaN value. (Note that the NaN value is produced by the program expression NaN.) In some implementations, external code might be able to detect a difference between various Not-a-Number values, but such behaviour is implementation-dependent; to ECMAScript code, all NaN values are indistinguishable from each other.

This is very old text, most of which dates to ES1. It is defining the ES Number type, which is an abstraction, not an actual bit-level encoding. To me it says these things:

  1. Unlike the IEEE 64-bit floating point type, the ES Number type has only a single NaN value.
  2. Implementations are free to internally encode the Number type any way they desire (as long as it observably conforms to all the requirements of section 8.1.5).
  3. In particular, an implementation might have multiple internal bit patterns all of which correspond to the single NaN element of the Number type.
  4. A corollary of 3) is that implementations are not required to internally canonicalize NaNs; that is an implementation-level design decision.
  5. Implementations are not required to canonicalize NaNs when they are passed or otherwise made visible to non-ES code. Hence such code may be able to observe details of the NaN encoding, including whether or not a canonical NaN value is used internally by the ES implementation.

Is there anything you think should change in the above specification text?
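As a quick check of the counts quoted from section 8.1.5: there are 2^64 bit patterns in total, 2^53 − 2 of them are NaNs, and those collapse into one value, which gives 2^64 − (2^53 − 2) + 1 = 2^64 − 2^53 + 3. A small sketch using BigInt (the variable names are just for this example) confirms the decimal figures:

```javascript
// Count of distinct IEEE 754 binary64 NaN bit patterns:
// 2 signs * (2^52 - 1) non-zero mantissas = 2^53 - 2.
var nanPatterns = 2n ** 53n - 2n;

// All 2^64 bit patterns, minus the NaN patterns, plus the single ES NaN value.
var totalValues = 2n ** 64n - nanPatterns + 1n;

console.log(nanPatterns.toString()); // 9007199254740990
console.log(totalValues.toString()); // 18437736874454810627
```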

Section 15.13.5.1.3 defines the GetValueFromBuffer abstract operation, which is currently the only place in the ES6 spec. where an ES Number value is retrieved from an ArrayBuffer. Step 8 of the specification algorithm is:

  8. If type is “Float64”, then rawValue is interpreted, taking into account the value of isBigEndian, as a bit string encoding of an IEEE 754-2008 binary64 value. If rawValue is any IEEE 754-2008 binary64 NaN value, return the NaN Number value. Return the Number value that is encoded by rawValue.

Step 7 is similar but deals with Float32 values.

To me it says these things:

  1. In all cases, an ES Number value (as defined in 8.1.5) is returned.
  2. All IEEE NaN values are logically canonicalized as the single ES NaN value.
  3. No additional requirements for ES Numbers, beyond those in 8.1.5 are introduced. Actual representation remains up to implementation.

Should anything change? If 8.1.5 continues to not mandate any particular or single encoding representation of a NaN, I don't see why this should either.

Section 15.13.5.1.4 defines the SetValueInBuffer abstract operation, which is currently the only place in the ES6 spec. where an ES Number value is stored into an ArrayBuffer. Step 8 of the specification algorithm is:

  8. Else, if type is “Float64”, then set rawValue to the 8 bytes that are the IEEE 754-2008 binary64 format encoding of value. If isBigEndian is true, the bytes are arranged in big endian order. Otherwise, the bytes are arranged in little endian order. If value is NaN, rawValue may be set to any implementation-chosen non-signaling NaN encoding.

Step 7 is similar but deals with Float32 values.

To me it says these things:

  1. When storing into an ArrayBuffer, canonicalization of ES NaN to some particular IEEE NaN is permitted but not required.
  2. The actual ArrayBuffer bit-level representation of NaN values is completely implementation dependent. It need not even be consistent across multiple stores.
  3. External observers of such a stored value may be able to detect NaN encoding differences by observing Numbers stored into ArrayBuffers. This is allowed by 8.1.5.
  4. The actual encoding of a NaN value in an ArrayBuffer is observable by ES code by overlaying a non-float typed array on an ArrayBuffer where a Number has been stored using the above steps.

Point 4 seems to be in conflict with the 8.1.5 requirement "to ECMAScript code, all NaN values are indistinguishable from each other.". However, consider that what is being observed is not directly an ES Number NaN value but instead a bit pattern that according to 15.13.5.1.4 above may not be a direct reflection of the Number NaN that was stored. So maybe it slips by.

So, again any change requests?

From a spec. perspective we could require canonicalization to some specific IEEE NaN (how do we pick one?) in 15.13.5.1.4. If we are going to do that, we presumably also should require canonicalization of endian-ness, which is also observable and arguably a bigger interop hazard than NaNs. However, I think the performance arguments for not canonicalizing endian-ness in typed arrays are stronger. If we are going to continue to let endian-ness be implementation determined then I don't think there is much point in worrying about changing the handling of NaNs in 15.13.5.1.4. But, I'm flexible.

# Dmitry Lomov (11 years ago)

I would like to add some perf numbers to this discussion, if I may.

I have implemented a quick patch for V8 that implements NaN normalization on Float64Array store. V8 does not use NaN boxing, so from implementation perspective NaN normalization is not required by V8.
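For readers who want to see what "normalization on store" means at the script level, here is a minimal sketch. It is not the V8 patch, which lives inside the engine's generated code. The helper `storeNormalized` and the choice of 0x7ff8000000000000 as the canonical quiet NaN are assumptions made for illustration.

```javascript
// Hypothetical helper, not the V8 patch: store `value` as element `index`
// of a buffer of doubles, replacing every NaN with one fixed quiet-NaN
// bit pattern (0x7ff8000000000000, an assumed canonical choice).
function storeNormalized(view, index, value) {
  var offset = index * 8;
  if (value !== value) { // only NaN is unequal to itself
    view.setUint32(offset, 0x7ff80000, false); // high word, big-endian
    view.setUint32(offset + 4, 0, false);      // low word
  } else {
    view.setFloat64(offset, value, false);
  }
}

var view = new DataView(new ArrayBuffer(16));
storeNormalized(view, 0, Math.sqrt(-1));
storeNormalized(view, 1, 1.5);
console.log(view.getUint32(0, false).toString(16)); // 7ff80000
console.log(view.getFloat64(8, false));             // 1.5
```

The extra compare and branch on every store is the kind of work that the measurements in this message put a price on.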

In my test, the perf penalty for Float64Array store is 25% (on ia32 architecture in optimized regime, using sse2 instructions). I am attaching the test and my patch for your perusal (the patch should easily apply to v8's bleeding_edge).

It feels like the performance hit is a very considerable factor here, especially given that typed arrays are all about fast math.

Results on my machine:

Normalized stores:
$ ./out/ia32.release/d8 float-array.js
Start 14033 msec
Start 14059 msec
Start 13983 msec
Start 13979 msec

Non-normalized stores:
$ ./out.baseline/ia32.release/d8 float-array.js
Start 11197 msec
Start 11207 msec
Start 11207 msec
Start 11253 msec

Dmitry

# Jeff Walden (11 years ago)

On 03/26/2013 08:48 AM, Sam Tobin-Hochstadt wrote:

While I agree with you on the larger point (and this issue reoccurs with proxies), the actual problem here is that [[DefineProperty]] should just do nothing on immutable data. The idea of mutating immutable data as long as it's the same value is crazy.

ValidateAndApplyPropertyDescriptor in ES6 (and its predecessor in ES5) don't mutate if every field in the input descriptor matches the existing fields in the existing descriptor. So a non-writable, non-configurable existing property will either throw because !SameValue on the [[Value]] components, or it'll hit SameValue(nanValue1, nanValue2) and not change anything, per spec. Crazy implementations that try to optimize here might behave differently, of course, but there's no spec issue along these lines.
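The behaviour described above can be checked from script. This is a small sketch; the helper name `tryRedefine` is invented for the example. Redefining a non-writable, non-configurable NaN property with a NaN produced by a different operation is accepted because SameValue treats all NaNs as the same value, while any other value throws.

```javascript
// Hypothetical helper: try to redefine obj[key] with a new value and
// report whether the engine accepted the redefinition.
function tryRedefine(obj, key, value) {
  try {
    Object.defineProperty(obj, key, { value: value });
    return true;
  } catch (e) {
    if (e instanceof TypeError) return false;
    throw e;
  }
}

var frozenNaN = {};
Object.defineProperty(frozenNaN, "x", {
  value: 0 / 0, writable: false, configurable: false
});

console.log(tryRedefine(frozenNaN, "x", Math.sqrt(-1))); // true
console.log(tryRedefine(frozenNaN, "x", 1));             // false
```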

# Kenneth Russell (11 years ago)

Dmitry, thank you for prototyping and benchmarking this. There should be no question that a slowdown of 25% is too high a cost to pay.

Allen's analysis earlier in the thread indicates that no spec changes are necessary in order to allow multiple bit patterns to be used when storing NaNs into Float32Array and Float64Array. Can this topic be laid to rest?

# Brendan Eich (11 years ago)

Old SGI-hacker-me is on board with spec'ing as-is. Thanks,

# Jussi Kalliokoski (11 years ago)

On Tue, Mar 26, 2013 at 10:29 AM, Oliver Hunt <oliver at apple.com> wrote:

On Mar 26, 2013, at 9:12 PM, Jussi Kalliokoski < jussi.kalliokoski at gmail.com> wrote:

That's just because ES has had no notion of bits for floating points before. Other than that, ES NaN works like IEEE NaN, e.g.

0/0 === NaN // false
isNaN(0/0) // true

That's true in any language - comparing to NaN is almost always defined explicitly as producing false. You're not looking at bit patterns here, so conflating NaN compares with bit values is kind of pointless.
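To make the distinction concrete, here is a short sketch of what value-level comparisons can and cannot see. None of these operators look at the bit pattern, so two NaNs produced by different operations behave identically here whatever their encodings are.

```javascript
var a = 0 / 0;
var b = Math.sqrt(-1);

console.log(a === b);              // false: NaN never compares equal
console.log(a !== a);              // true: the classic self-inequality test
console.log(isNaN(a) && isNaN(b)); // true
console.log(Object.is(a, b));      // true: SameValue treats all NaNs alike
```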

We need to stop raising "this causes performance problems" type issues without a concrete example of that problem. I remember having to work very hard to stop WebGL from being a gaping security hole in the first place and it's disappointing to see these same issues being re-raised in a different forum to try and get them bypassed here.

Before saying security hole, please elaborate. Also, when it comes to standards, I think change should be justified with data, rather than the other way around.

Done.

You'll have to do better than that. ;)

Ok, I'll try to go over this again, because for whatever reason it doesn't appear to stick:

If you have a double-typed array, and access a member: typedArray[0]

Then in ES it is a double that can be one of these values: +Infinity, -Infinity, NaN, or a discrete value representable in IEEE double spec. There are no signaling NaNs, nor is there any exposure of what the underlying bit pattern of the NaN is.

So the runtime loads this double, and then stores it somewhere, anywhere, it doesn't matter where, eg. var tmp = typedArray[0];

Now you store it: typedArray[whatever] = tmp;

The specification must allow a bitwise comparison of typedArray[whatever] to typedArray[0] to return false, as it is not possible for any NaN-boxing engine to maintain the bit equality that you would otherwise desire, as that would be trivially exploitable. When I say security and correctness I mean it in the "can't be remotely pwned" sense.
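The load-then-store sequence above can be written out as a small experiment. This is a sketch under assumptions: the function name `nanSurvivesRoundTrip` and the particular NaN payload are invented for the example. Whether the two encodings match is implementation dependent, which is the point being made.

```javascript
// Hypothetical experiment: copy element 0 to element 1 through an ordinary
// variable and report whether the two 8-byte encodings are bit-identical.
function nanSurvivesRoundTrip() {
  var buf = new ArrayBuffer(16);
  var doubles = new Float64Array(buf);
  var bytes = new Uint8Array(buf);

  // A NaN with a nonzero payload. The byte sequence is a palindrome, so it
  // decodes to the same NaN on little-endian and big-endian platforms.
  var pattern = [0xff, 0xf8, 0x00, 0x00, 0x00, 0x00, 0xf8, 0xff];
  for (var i = 0; i < 8; i++) bytes[i] = pattern[i];

  var tmp = doubles[0]; // load: the engine may substitute another NaN here
  doubles[1] = tmp;     // store: or here

  var identical = true;
  for (var j = 0; j < 8; j++) {
    if (bytes[j] !== bytes[8 + j]) identical = false;
  }
  return { isNaN: tmp !== tmp, identical: identical };
}

var result = nanSurvivesRoundTrip();
console.log(result.isNaN);     // true
console.log(result.identical); // implementation dependent
```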

Given that we can't guarantee that the bit pattern will remain unchanged the spec should mandate normalizing to the non-signalling NaN.

--Oliver

It's not trivially exploitable, at least not in SM or V8. I modified the example Mark made [1] and ran it through js (SpiderMonkey) and node (V8) to observe some of the differences of how they handle NaN. Neither could be pwned using the specified method. In V8, the difference is observable only if you assign the funny NaN directly to the array (i.e. it doesn't go through a variable or stuff like that). In SM, the difference is more observable, i.e. the bit pattern gets transferred even if you assign it to a variable in between, but not observable enough to make pwning possible. Of course, feel free to fork the gist and show me how it can be exploited. :)

Regardless, as per Dmitry's observations, I don't think the performance hit can be dismissed, and I doubt it can be optimized away to a level that could be dismissed.

I think standardizing whatever V8 is doing with NaN right now seems like the best option.

Cheers, Jussi

[1] gist.github.com/jussi-kalliokoski/5252226

# Brendan Eich (11 years ago)

Jussi Kalliokoski wrote:

I think standardizing whatever V8 is doing with NaN right now seems like the best option.

No, we cannot preserve NaN "payloads" in other engines that nanbox or pwnage ensues.

The spec leaves this as an underspecified interop hazard and that seems to be the best option.

# Oliver Hunt (11 years ago)

On Mar 27, 2013, at 7:53 PM, Jussi Kalliokoski <jussi.kalliokoski at gmail.com> wrote:

Given that we can't guarantee that the bit pattern will remain unchanged the spec should mandate normalizing to the non-signalling NaN.

--Oliver

It's not trivially exploitable, at least not in SM or V8. I modified the example Mark made [1] and ran it through js (SpiderMonkey) and node (V8) to observe some of the differences of how they handle NaN. Neither could be pwned using the specified method. In V8, the difference is observable only if you assign the funny NaN directly to the array (i.e. it doesn't go through a variable or stuff like that). In SM, the difference is more observable, i.e. the bit pattern gets transferred even if you assign it to a variable in between, but not observable enough to make pwning possible. Of course, feel free to fork the gist and show me how it can be exploited. :)

Of course it's not trivially exploitable in V8, SM, or JSC - You even say that you can see the different NaNs come and go. In general we try not to ship trivially exploitable code.

To my knowledge all engines convert signaling NaNs to a safe non signaling NaN on load. That is an absolutely unavoidable cost given the untyped backing stores that typed arrays insist on.

If you were to say "signaling NaNs must be preserved" modern engines would have to essentially drop NaN boxing, and take a huge hit to general performance and memory use. The alternative would be a land of trivial exploitation - If you want to see how bad it would be you'll need to modify your engine of choice to stop it from performing NaN canonicalization on read, and then see what happens when you do stuff like

doubleArray[indexOfBogusNaN].toString()

As an implementer I can tell you that you'll be able to make it crash. And exploiting it would not be too much harder.
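To illustrate why an unfiltered NaN is dangerous to a NaN-boxing engine, here is a toy model. The layout is hypothetical and chosen only for illustration; real engines differ in detail. It assumes that any 64-bit value whose high 32 bits are at or above `TAG_THRESHOLD` is read as a tagged non-double (a pointer, an integer and so on), so a raw NaN from a typed array that lands in that range would be misread unless it is canonicalized on load.

```javascript
// Hypothetical NaN-boxing layout, for illustration only: high words at or
// above this threshold are interpreted as tagged non-double values.
var TAG_THRESHOLD = 0xfff90000;

// How the toy engine would interpret a 64-bit value, given its high word.
function classifyBoxed(hi) {
  return (hi >>> 0) >= TAG_THRESHOLD ? "tagged" : "double";
}

// What the toy engine does when loading a double from untyped storage:
// every NaN encoding is replaced by one safe quiet NaN.
function canonicalizeOnLoad(hi, lo) {
  var exponentAllOnes = (hi & 0x7ff00000) === 0x7ff00000;
  var mantissaNonZero = (hi & 0x000fffff) !== 0 || lo !== 0;
  if (exponentAllOnes && mantissaNonZero) {
    return { hi: 0x7ff80000, lo: 0 };
  }
  return { hi: hi >>> 0, lo: lo >>> 0 };
}

// A "bogus" NaN written through an integer view of the buffer.
var bogusHi = 0xfffa0000, bogusLo = 0x12345678;
console.log(classifyBoxed(bogusHi)); // "tagged": would be misread as a non-double
var safe = canonicalizeOnLoad(bogusHi, bogusLo);
console.log(classifyBoxed(safe.hi)); // "double"
```

In this model the misread value is only a label. In an engine, the payload bits would be followed as a pointer, which is where the crash described above comes from.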