endianness (was: Observability of NaN distinctions — is this a concern?)

# David Herman (12 years ago)

[breaking out a new thread since this is orthogonal to the NaN issue]

While the Khronos spec never specified an endianness, TC39 agreed in May 2012 to make the byte order explicitly little-endian in ES6:

https://mail.mozilla.org/pipermail/es-discuss/2012-May/022834.html

The de facto reality is that there are essentially no big-endian browsers for developers to test on. Web content is being invited to introduce byte-order dependencies. DataView is usually held up as the counter-argument, as if the existence of a safe alternative API means no one will ever misuse the unsafe one. Even if we don't take into account human nature, Murphy's Law, and the fact that the web is the world's largest fuzz tester, a wholly rational developer may often prefer not to use DataView because it's still easier to read out bytes using [ ] notation instead of DataView methods.
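To make the hazard concrete, here is a minimal sketch using only the standard typed array and DataView APIs: the aliasing read is byte-order dependent, while the DataView round trip is not.

```js
// Hazard: aliasing a Float32Array with a Uint8Array exposes host byte order.
const f32 = new Float32Array([1.0]);
console.log(Array.from(new Uint8Array(f32.buffer)));
// -> [0, 0, 128, 63] on a little-endian host; [63, 128, 0, 0] on big-endian

// Safe alternative: DataView takes an explicit littleEndian flag, so the
// write and read below behave identically on every platform.
const buf = new ArrayBuffer(4);
const view = new DataView(buf);
view.setFloat32(0, 1.0, /* littleEndian */ true);
console.log(Array.from(new Uint8Array(buf))); // [0, 0, 128, 63] everywhere
```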

I myself -- possibly the one person in the world who cares most about this issue! -- accidentally created a buggy app that wouldn't work on a big-endian system, because I had no way of testing it:

https://github.com/dherman/float.js/commit/deb5bf2f5696ce29d9a6c1a6bf7c479a3784fd7b

In summary: we already agreed in TC39 to close this loophole, it's the right thing to do, and concern about potential performance issues in non-existent browsers on non-existent systems should not trump portability and correctly executing existing web content.

Dave


# Kenneth Russell (12 years ago)

On Tue, Mar 26, 2013 at 4:35 PM, David Herman <dherman at mozilla.com> wrote:

[breaking out a new thread since this is orthogonal to the NaN issue]

While the Khronos spec never specified an endianness, TC39 agreed in May 2012 to make the byte order explicitly little-endian in ES6:

https://mail.mozilla.org/pipermail/es-discuss/2012-May/022834.html

The de facto reality is that there are essentially no big-endian browsers for developers to test on. Web content is being invited to introduce byte-order dependencies. DataView is usually held up as the counter-argument, as if the existence of a safe alternative API means no one will ever misuse the unsafe one. Even if we don't take into account human nature, Murphy's Law, and the fact that the web is the world's largest fuzz tester, a wholly rational developer may often prefer not to use DataView because it's still easier to read out bytes using [ ] notation instead of DataView methods.

I myself -- possibly the one person in the world who cares most about this issue! -- accidentally created a buggy app that wouldn't work on a big-endian system, because I had no way of testing it:

https://github.com/dherman/float.js/commit/deb5bf2f5696ce29d9a6c1a6bf7c479a3784fd7b

In summary: we already agreed in TC39 to close this loophole, it's the right thing to do, and concern about potential performance issues in non-existent browsers on non-existent systems should not trump portability and correctly executing existing web content.

This was not the plan of record. I am disappointed that this decision was made without input from both of the editors of the typed array spec, and I disagree with the statement that it is the right thing to do.

# Andreas Rossberg (12 years ago)

On 27 March 2013 00:35, David Herman <dherman at mozilla.com> wrote:

[breaking out a new thread since this is orthogonal to the NaN issue]

While the Khronos spec never specified an endianness, TC39 agreed in May 2012 to make the byte order explicitly little-endian in ES6:

https://mail.mozilla.org/pipermail/es-discuss/2012-May/022834.html

The de facto reality is that there are essentially no big-endian browsers for developers to test on. Web content is being invited to introduce byte-order dependencies. DataView is usually held up as the counter-argument, as if the existence of a safe alternative API means no one will ever misuse the unsafe one. Even if we don't take into account human nature, Murphy's Law, and the fact that the web is the world's largest fuzz tester, a wholly rational developer may often prefer not to use DataView because it's still easier to read out bytes using [ ] notation instead of DataView methods.

I myself -- possibly the one person in the world who cares most about this issue! -- accidentally created a buggy app that wouldn't work on a big-endian system, because I had no way of testing it:

https://github.com/dherman/float.js/commit/deb5bf2f5696ce29d9a6c1a6bf7c479a3784fd7b

In summary: we already agreed in TC39 to close this loophole, it's the right thing to do, and concern about potential performance issues in non-existent browsers on non-existent systems should not trump portability and correctly executing existing web content.

I missed that decision, and as much as I understand the reasoning, I think we need to revisit it. There actually are (third-party) projects with ports of V8 and/or Chromium to big endian architectures. WebGL code should not break or become prohibitively expensive on them all of a sudden.

# Mark S. Miller (12 years ago)

"prohibitively" depends on your tolerance. Modern machines can usually do register-to-register byte order reversal rather speedily. Which big endian machines do you have in mind?

# David Herman (12 years ago)

On Mar 27, 2013, at 6:51 AM, Andreas Rossberg <rossberg at google.com> wrote:

There actually are (third-party) projects with ports of V8 and/or Chromium to big endian architectures.

It would be helpful to have more information about what these platforms and projects are.

WebGL code should not break or become prohibitively expensive on them all of a sudden.

But WebGL code doesn't break either way. It's if we don't mandate little-endian that code breaks. As for the expense, it has to be weighed against content breaking. Not to mention the burden it places on developers to write portable code without even having big-endian user agents to test on. (I suppose they could use dependency injection and shim the typed array constructors with simulated big-endian versions. Ugh...)
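For what it's worth, the shim idea might look something like the rough, hypothetical sketch below: a Float32Array stand-in that stores its elements big-endian, useful only for flushing out byte-order bugs while testing on a little-endian machine.

```js
// Hypothetical test-only shim: a Float32Array-like object that stores its
// elements big-endian, to surface byte-order bugs on little-endian hosts.
class BigEndianFloat32Array {
  constructor(length) {
    this.buffer = new ArrayBuffer(length * 4);
    this._view = new DataView(this.buffer);
    this.length = length;
  }
  get(i)    { return this._view.getFloat32(i * 4, /* littleEndian */ false); }
  set(i, v) { this._view.setFloat32(i * 4, v, /* littleEndian */ false); }
}

// Supporting real [ ] indexing would additionally require wrapping each
// instance in a Proxy -- part of the "Ugh" above.
const a = new BigEndianFloat32Array(1);
a.set(0, 1.0);
console.log(Array.from(new Uint8Array(a.buffer))); // [63, 128, 0, 0]
```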

# Vladimir Vukicevic (12 years ago)

(Apologies for breaking threading -- subscribed too late to have original message to reply to.)

David Herman wrote:

On Mar 27, 2013, at 6:51 AM, Andreas Rossberg <rossberg at google.com> wrote:

There actually are (third-party) projects with ports of V8 and/or Chromium to big endian architectures.

It would be helpful to have more information about what these platforms and projects are.

The Wii-U is probably the highest profile example of this; PowerPC base, and they're doing a bunch of HTML5 apps stuff like what they announced at GDC.

WebGL code should not break or become prohibitively expensive on them all of a sudden.

But WebGL code doesn't break either way. It's if we don't mandate little-endian that code breaks. As for the expense, it has to be weighed against content breaking. Not to mention the burden it places on developers to write portable code without even having big-endian user agents to test on. (I suppose they could use dependency injection and shim the typed array constructors with simulated big-endian versions. Ugh...)

The problem, as I see it at least, is that if little-endian is mandated, then effectively we have broken WebGL, or the possibility of ever having performant WebGL on those platforms. The underlying OpenGL API will always be native-endian. While big-endian processors often do have ways of efficiently doing byte-swapped loads and stores, that doesn't apply to passing bulk data down. For example, vertex skinning is the basic method of doing skeletal animation. It's often done on the CPU, and it involves transforming a bunch of floating-point data. The result is then uploaded to the GPU for rendering.
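A schematic version of that pattern, assuming a WebGL context `gl` and a vertex buffer `vbo` already exist (all names here are illustrative, not from any real codebase):

```js
// Schematic CPU skinning loop: transform every vertex on the CPU, then hand
// the whole Float32Array to WebGL in one call.
function skinAndUpload(gl, vbo, positions, boneMatrix, out) {
  for (let i = 0; i < positions.length; i += 3) {
    const x = positions[i], y = positions[i + 1], z = positions[i + 2];
    // 4x4 column-major matrix * point (w = 1)
    out[i]     = boneMatrix[0] * x + boneMatrix[4] * y + boneMatrix[8]  * z + boneMatrix[12];
    out[i + 1] = boneMatrix[1] * x + boneMatrix[5] * y + boneMatrix[9]  * z + boneMatrix[13];
    out[i + 2] = boneMatrix[2] * x + boneMatrix[6] * y + boneMatrix[10] * z + boneMatrix[14];
  }
  gl.bindBuffer(gl.ARRAY_BUFFER, vbo);
  // GL sees the raw bytes of `out`; if the spec forced those bytes to be
  // little-endian, a big-endian GL stack would have to swap the whole
  // array here, on every upload.
  gl.bufferData(gl.ARRAY_BUFFER, out, gl.DYNAMIC_DRAW);
}
```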

If typed arrays are fixed to be little endian, that means that on big endian platforms one of two things will need to happen:

  • the application will need to manually byte swap (in JS) by aliasing the Float32Array as a Uint8Array and twiddling bytes (see the sketch below).
  • the WebGL implementation will need to make a copy of every incoming >1-byte-element-size buffer and do the byte swapping before passing it down to GL -- it can either allocate a second buffer, swap into it, and then throw it away, or it can swap before the GL call and unswap after the GL call in-place.

Both of these are essentially murder for performance; so by attempting to prevent code from breaking you're basically guaranteeing that all code will effectively break due to performance -- except that developers have no option to write portable and performant code.
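For scale, the first option amounts to something like this sketch per upload, run once before the GL call and again afterwards if JS still needs the data:

```js
// Sketch of option 1: byte-swap a Float32Array in place by aliasing its
// buffer as a Uint8Array and reversing each 4-byte group.
function swapFloat32InPlace(f32) {
  const b = new Uint8Array(f32.buffer, f32.byteOffset, f32.byteLength);
  for (let i = 0; i < b.length; i += 4) {
    let t = b[i];     b[i]     = b[i + 3]; b[i + 3] = t;
        t = b[i + 1]; b[i + 1] = b[i + 2]; b[i + 2] = t;
  }
}
```

Running that over every vertex buffer, every frame, is exactly the per-upload cost at issue.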

The other thing is, I suspect that a large chunk of code using typed arrays today will work just fine on big endian platforms, provided that the arrays are defined to be native-endian. Very little code actually aliases and does munging; the only issues that might come up are with loading data from the network, but those are present regardless.

# Kevin Gadd (12 years ago)

One could also argue that people using typed arrays to alias and munge individual values should be using DataView instead. If it performs poorly, that can hopefully be addressed in the JS runtimes (the way it's specified doesn't seem to prevent it from being efficient).

# Kenneth Russell (12 years ago)

On Sun, Mar 31, 2013 at 1:42 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:

One could also argue that people using typed arrays to alias and munge individual values should be using DataView instead. If it performs poorly, that can hopefully be addressed in the JS runtimes (the way it's specified doesn't seem to prevent it from being efficient).

Agreed. DataView's methods are all simple and should be easy to optimize. Because they include a conditional byte swap, they can't run quite as fast as the typed arrays' accessors -- but they shouldn't need to. DataView was designed to support file and network I/O, where throughput is limited by the disk or network connection. The typed array views were designed for in-memory assembly of data to be submitted to the graphics card, sound card, etc., and must run as fast as possible.
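As an illustration of that I/O side, reading a binary header with DataView is explicit about endianness regardless of the host. The field layout below is invented purely for the example:

```js
// Parse a little-endian binary header (e.g. from the network) with DataView.
// The field layout is made up for illustration; the buffer must be >= 10 bytes.
function parseHeader(arrayBuffer) {
  const v = new DataView(arrayBuffer);
  return {
    magic:   v.getUint32(0, true),  // explicit littleEndian = true,
    version: v.getUint16(4, true),  // so parsing behaves identically
    count:   v.getUint32(6, true),  // on any host architecture
  };
}
```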


On Sun, Mar 31, 2013 at 1:37 PM, Vladimir Vukicevic <vladimir at mozilla.com> wrote:

(Apologies for breaking threading -- subscribed too late to have original message to reply to.)

David Herman wrote:

On Mar 27, 2013, at 6:51 AM, Andreas Rossberg <rossberg at google.com> wrote:

There actually are (third-party) projects with ports of V8 and/or Chromium to big endian architectures.

It would be helpful to have more information about what these platforms and projects are.

The Wii-U is probably the highest profile example of this; PowerPC base, and they're doing a bunch of HTML5 apps stuff like what they announced at GDC.

WebGL code should not break or become prohibitively expensive on them all of a sudden.

But WebGL code doesn't break either way. It's if we don't mandate little-endian that code breaks. As for the expense, it has to be weighed against content breaking. Not to mention the burden it places on developers to write portable code without even having big-endian user agents to test on. (I suppose they could use dependency injection and shim the typed array constructors with simulated big-endian versions. Ugh...)

The problem, as I see it at least, is that if little-endian is mandated, then effectively we have broken WebGL, or the possibility of ever having performant WebGL on those platforms. The underlying OpenGL API will always be native-endian. While big-endian processors often do have ways of efficiently doing byte-swapped loads and stores, that doesn't apply to passing bulk data down. For example, vertex skinning is the basic method of doing skeletal animation. It's often done on the CPU, and it involves transforming a bunch of floating-point data. The result is then uploaded to the GPU for rendering.

If typed arrays are fixed to be little endian, that means that on big endian platforms one of two things will need to happen:

  • the application will need to manually byte swap (in JS) by aliasing the Float32Array as a Uint8Array and twiddling bytes.
  • the WebGL implementation will need to make a copy of every incoming >1-byte-element-size buffer and do the byte swapping before passing it down to GL -- it can either allocate a second buffer, swap into it, and then throw it away, or it can swap before the GL call and unswap after the GL call in-place.

If the typed array views are specified to be little-endian, the latter is what would need to be done in order to allow existing content to run. I agree that this would make WebGL unusably slow on big-endian architectures.

# Andreas Rossberg (12 years ago)

On 28 March 2013 21:42, Mark S. Miller <erights at google.com> wrote:

"prohibitively" depends on your tolerance. Modern machines can usually do register-to-register byte order reversal rather speedily. Which big endian machines do you have in mind?

For WebGL, which expects native endianness on its end, you'll have to convert whole arrays, not just single values. AFAICS, that would be costly enough to basically make WebGL useless on these platforms.

# Andreas Rossberg (12 years ago)

On 28 March 2013 23:01, David Herman <dherman at mozilla.com> wrote:

On Mar 27, 2013, at 6:51 AM, Andreas Rossberg <rossberg at google.com> wrote:

There actually are (third-party) projects with ports of V8 and/or Chromium to big endian architectures.

It would be helpful to have more information about what these platforms and projects are.

Yeah, unfortunately I can't say without being killed by those third parties.

WebGL code should not break or become prohibitively expensive on them all of a sudden.

But WebGL code doesn't break either way. It's if we don't mandate little-endian that code breaks.

No. WebGL expects native endianness today, and existing JS typed array code produces native endianness today. This is true on both little- and big-endian platforms. So existing code will typically run on both classes of platforms as long as it does not do something dirty. However, if the new spec suddenly requires some platforms to switch endianness on one end, then existing code will no longer run as-is on those platforms. See also Vlad's reply.
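For reference, code that genuinely needs to know the host's byte order can already detect it with a standard one-liner idiom, which is part of why native-endian typed arrays have been workable in practice:

```js
// Detect host byte order: write a 16-bit value, inspect its first byte.
const LITTLE_ENDIAN =
  new Uint8Array(new Uint16Array([0x1234]).buffer)[0] === 0x34;
console.log(LITTLE_ENDIAN ? "little-endian host" : "big-endian host");
```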