Non-extensibility of Typed Arrays

# Oliver Hunt (11 years ago)

So I noticed in the last revision to the TA spec that a decision was made to prevent extensibility of them. Can someone say why that decision was made? It makes TAs somewhat unique vs. all other builtin types and doesn't match the behavior of Blink or WebKit implementations.

While I am usually in favour of conservative behaviour I'd like more information on the reasoning behind this choice.

# Allen Wirfs-Brock (11 years ago)
# Oliver Hunt (11 years ago)

The current argument for non-extensibility seems to be that Mozilla doesn't support them. It sounds like all other engines do.

There are plenty of reasons developers may want expandos - they're generally useful for holding different kinds of metadata. By requiring a separate object to hold that information we're merely making a developer's life harder. This is also inconsistent with all other magically-indexable types in ES and the DOM.

I'm also not sure what the performance gains of non-extensibility are; if DH could expand on that, it would be greatly appreciated.

# Domenic Denicola (11 years ago)

I am not aware of all the nuances of the discussion, but as a developer I would find the behavior for numeric expandos confusing. For a typed array of length 1024, setting ta[1023] would do something completely different from setting ta[1024]. Unlike normal arrays, setting ta[1024] would not change ta.length, and presumably ta[1024] would not be exposed by the various iteration facilities.

I would much rather receive a loud error (in strict mode), which will either alert me to my code being weird, or possibly to my code committing an off-by-one error.

# Oliver Hunt (11 years ago)

Existing types with magic index properties (other than Array) just drop numeric expandos on the floor so it's logically a no-op. Unless there was a numeric accessor on the prototype (which non-extensibility does not save you from).

My complaint is that this appears to be removing functionality that has been present in the majority of shipping TA implementations, assuming from LH's comment that Chakra supports expandos.

# Brendan Eich (11 years ago)

On Aug 27, 2013, at 9:35 AM, Oliver Hunt <oliver at apple.com> wrote:

Existing types with magic index properties (other than Array) just drop numeric expandos on the floor so it's logically a no-op. Unless there was a numeric accessor on the prototype (which non-extensibility does not save you from).

Those are a problem and an anti-use-case.

My complaint is that this appears to be removing functionality that has been present in the majority of shipping TA implementations, assuming from LH's comment that Chakra supports expandos

Does anyone care, though?

TA instances having no indexed expandos but allowing named ones is weird. Better to be consistent to users and help implementations optimize further.

# Allen Wirfs-Brock (11 years ago)

On Aug 27, 2013, at 9:26 AM, Domenic Denicola wrote:

I am not aware of all the nuances of the discussion, but as a developer I would find the behavior for numeric expandos confusing. For a typed array of length 1024, setting ta[1023] would do something completely different from setting ta[1024]. Unlike normal arrays, setting ta[1024] would not change ta.length, and presumably ta[1024] would not be exposed by the various iteration facilities.

I would much rather receive a loud error (in strict mode), which will either alert me to my code being weird, or possibly to my code committing an off-by-one error.

Integer numeric expandos on TypedArrays (i.e., integer-keyed properties outside the range 0..length-1) are disallowed by the ES6 spec in a manner that is independent of the [[Extensible]] internal property. The discussion at the meeting was about non-numeric expandos such as 'foo'.

# Filip Pizlo (11 years ago)

On Aug 27, 2013, at 9:39 AM, Brendan Eich <brendan at mozilla.com> wrote:

On Aug 27, 2013, at 9:35 AM, Oliver Hunt <oliver at apple.com> wrote:

Existing types with magic index properties (other than Array) just drop numeric expandos on the floor so it's logically a no-op. Unless there was a numeric accessor on the prototype (which non-extensibility does not save you from).

Those are a problem and an anti-use-case.

But they won't change anytime soon, will they?

So being inconsistent is weird.

My complaint is that this appears to be removing functionality that has been present in the majority of shipping TA implementations, assuming from LH's comment that Chakra supports expandos

Does anyone care, though?

I do. Placing named properties on arrays makes sense. Consider a matrix implemented as a Float32Array, with named properties telling you the numRows and numCols. Just one example.
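A minimal sketch of that use case, assuming an implementation that permits named expandos on typed arrays (names and sizes are illustrative):

var numRows = 3, numCols = 4;
var m = new Float32Array(numRows * numCols);
m.numRows = numRows;          // named expando metadata on the array itself
m.numCols = numCols;
m[1 * m.numCols + 2] = 42;    // element access is unaffected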

TA instances having no indexed expandos but allowing named ones is weird. Better to be consistent to users

Consistency would imply doing what other indexed types do.

and help implementations optimize further.

I'm not convinced by this. We support named properties and our typed arrays are pretty well optimized in space (three words of overhead) and time (everything gets inlined including allocation). If there is some amazing optimization that non-expansion gives you, and it's so important that the spec needs to account for it, then I'd love to hear what that is.

# Domenic Denicola (11 years ago)

Integer numeric expandos on TypedArrays (i.e., integer-keyed properties outside the range 0..length-1) are disallowed by the ES6 spec in a manner that is independent of the [[Extensible]] internal property. The discussion at the meeting was about non-numeric expandos such as 'foo'.

Oh. That's just weird O_o.

# Allen Wirfs-Brock (11 years ago)

On Aug 27, 2013, at 9:43 AM, Allen Wirfs-Brock wrote:

Integer numeric expandos on TypedArrays (i.e., integer-keyed properties outside the range 0..length-1) are disallowed by the ES6 spec in a manner that is independent of the [[Extensible]] internal property. The discussion at the meeting was about non-numeric expandos such as 'foo'.

To clarify, an out-of-range [[Get]] returns undefined and an out-of-range [[Put]] is either a no-op or a throw, depending upon the strictness of the code doing the [[Put]] (i.e., normal strict-mode [[Put]] behavior).
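A minimal sketch of the behavior described above (the length and indices are illustrative):

var ta = new Uint8Array(4);
ta[3] = 1;    // in range: stored in the backing buffer
ta[10] = 1;   // out of range: no-op in sloppy code, TypeError in strict code, per the draft described above
ta[10];       // undefined -- out-of-range [[Get]]
ta.length;    // still 4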

# Mark S. Miller (11 years ago)

On Tue, Aug 27, 2013 at 9:35 AM, Oliver Hunt <oliver at apple.com> wrote:

Existing types with magic index properties (other than Array) just drop numeric expandos on the floor so it's logically a no-op.

Dropping assignments silently is a bug, as it allows code to innocently proceed on control flow paths that assume success. That's why strict-mode turned failed assignments into thrown errors.

Unless there was a numeric accessor on the prototype (which non-extensibility does not save you from).

My complaint is that this appears to be removing functionality that has been present in the majority of shipping TA implementations, assuming from LH's comment that Chakra supports expandos.

"majority" is not a relevant constraint. We should try to make the best decision we can that is compatible with the cross browser web. If all the major browsers already agreed on one behavior, then we should only consider deviating from it with great caution. But so long as the major browsers differ, we need only feel constrained by compatibility with their intersection. This principle even overrides compatibility with previous versions of our own spec, as just discussed re [[Invoke]].

# David Herman (11 years ago)

On Aug 27, 2013, at 9:47 AM, Filip Pizlo <fpizlo at apple.com> wrote:

I do. Placing named properties on arrays makes sense. Consider a matrix implemented as a Float32Array, with named properties telling you the numRows and numCols. Just one example.

There are of course other ways to achieve this that don't involve patching the array object, such as building a data abstraction for matrices that has-a Float32Array, or creating a new array type with additional methods:

var Matrix = new ArrayType(float32);
Matrix.prototype.numRows = function() { ... }
// or
Object.defineProperty(Matrix.prototype, "numRows", { get: function() { ... }, ... });
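A minimal sketch of the first alternative mentioned above, a has-a data abstraction (names are illustrative):

function Matrix(numRows, numCols) {
  this.numRows = numRows;
  this.numCols = numCols;
  this.elements = new Float32Array(numRows * numCols);  // has-a Float32Array, no patching of the array object
}
Matrix.prototype.get = function (row, col) {
  return this.elements[row * this.numCols + col];
};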

TA instances having no indexed expandos but allowing named ones is weird. Better to be consistent to users

Consistency would imply doing what other indexed types do.

Consistency arguments won't get you very far. The indexed properties of typed arrays by design act very differently from other indexed types. That's their whole reason for existence.

And the other consistency dimension is between array types and struct types. Is anyone arguing that structs should also have expandos?

# K. Gadd (11 years ago)

To me the compelling argument against using encapsulation instead of extensibility is that it breaks compatibility with existing JS code. Once you encapsulate an array, the encapsulated object no longer acts like an array and you can't use it in contexts where a normal array is expected. The ability to do python style 'quacks like an array' duck typing simply doesn't exist for arrays in JS.

This is a huge problem for JSIL interop - I can't preserve type information for arrays, or expose other array features, without either breaking interop with pure JS or otherwise eating some enormous perf hit (proxies, spidermonkey's deopt from named slots on arrays, etc). Baking this limitation into the spec for typed arrays is kind of awful, but I can understand if it's absolutely necessary...

Maybe WeakMap is the right solution for this? I can't remember what the performance consequences are for that use case. (Can you use an Array as a weakmap key? I forget, since it's an object-like type but it has special properties...)
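A minimal sketch of the WeakMap side-table approach being asked about (key and metadata shapes are illustrative):

var metadata = new WeakMap();

var ta = new Float32Array(16);
metadata.set(ta, { elementType: "float32", rank: 2 });  // side table instead of expandos
metadata.get(ta).rank;                                  // 2
// entries are garbage-collected once the array itself becomes unreachable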

Note that I'm not arguing for array subclassing here, just the ability to 'bless' an array instance with extra information. Such use cases are no doubt fairly rare, even if it's possible to come up with a handful of them.

I assume StructType and ArrayType will address a lot of this, but I'm not sure how I feel about having to wait for those features when (were typed arrays specced to allow named expandos) you could do this stuff in a mostly cross-browser way and ship it right now. (WeakMap fails this test since IIRC it's still only available in Firefox. :/ I love it and wish I could use it in the wild!)

# Allen Wirfs-Brock (11 years ago)

On Aug 27, 2013, at 3:49 PM, David Herman wrote:

On Aug 27, 2013, at 9:47 AM, Filip Pizlo <fpizlo at apple.com> wrote:

I do. Placing named properties on arrays makes sense. Consider a matrix implemented as a Float32Array, with named properties telling you the numRows and numCols. Just one example.

There are of course other ways to achieve this that don't involve patching the array object, such as building a data abstraction for matrices that has-a Float32Array, or creating a new array type with additional methods:

var Matrix = new ArrayType(float32);
Matrix.prototype.numRows = function() { ... }
// or
Object.defineProperty(Matrix.prototype, "numRows", { get: function() { ... }, ... });

or even better:

class Matrix extends Float32Array {
    get numRows() {...}
    ...
}

although "Matrix" may be a bad example...

Subclasses of Typed Arrays get their own prototype that can add or over-ride inherited methods. The instances of the subclass are still non-extensible according to the current spec. draft.

# Mark S. Miller (11 years ago)

On Tue, Aug 27, 2013 at 4:14 PM, K. Gadd <kg at luminance.org> wrote:

To me the compelling argument against using encapsulation instead of extensibility is that it breaks compatibility with existing JS code. Once you encapsulate an array, the encapsulated object no longer acts like an array and you can't use it in contexts where a normal array is expected. The ability to do python style 'quacks like an array' duck typing simply doesn't exist for arrays in JS.

This is a huge problem for JSIL interop - I can't preserve type information for arrays, or expose other array features, without either breaking interop with pure JS or otherwise eating some enormous perf hit (proxies, spidermonkey's deopt from named slots on arrays, etc). Baking this limitation into the spec for typed arrays is kind of awful, but I can understand if it's absolutely necessary...

Maybe WeakMap is the right solution for this? I can't remember what the performance consequences are for that use case. (Can you use an Array as a weakmap key?

Yes.

I forget, since it's an object-like type but it has special properties...)

A weakmap key has an unforgeable per-act-of-creation identity, which is the only requirement. Arrays pass. Strings fail. Interestingly, if we provide a system-wide interning table from strings to symbols, then internable symbols fail. Else, unique symbols pass, but have all the problems previously enumerated.

# Oliver Hunt (11 years ago)

On Aug 27, 2013, at 3:49 PM, David Herman <dherman at mozilla.com> wrote:

On Aug 27, 2013, at 9:47 AM, Filip Pizlo <fpizlo at apple.com> wrote:

I do. Placing named properties on arrays makes sense. Consider a matrix implemented as a Float32Array, with named properties telling you the numRows and numCols. Just one example.

There are of course other ways to achieve this that don't involve patching the array object, such as building a data abstraction for matrices that has-a Float32Array, or creating a new array type with additional methods:

var Matrix = new ArrayType(float32);
Matrix.prototype.numRows = function() { ... }
// or
Object.defineProperty(Matrix.prototype, "numRows", { get: function() { ... }, ... });

So what is the answer for jQuery like libraries that want to be able to add metadata?

It's possible (if you want) to preventExtensions() on any type, but you can't undo it.

TA instances having no indexed expandos but allowing named ones is weird. Better to be consistent to users

Consistency would imply doing what other indexed types do.

Consistency arguments won't get you very far. The indexed properties of typed arrays by design act very differently from other indexed types. That's their whole reason for existence.

And the other consistency dimension is between array types and struct types. Is anyone arguing that structs should also have expandos?

No, but I would expect expandos to be possible on an Array of them. The same argument being made in favor of preventExtensions() on TAs applies to all new types in ES6 -- why should I be able to add expandos to a Map or any other type? (Map is particularly severe given the overloaded nature of [] in other languages and the often "correctish"-enough behavior of toString() in ES, e.g. m=new Map; m[someInt]=foo; … m[someInt])
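A minimal sketch of the Map pitfall alluded to (key and value are illustrative):

var m = new Map();
m[42] = "foo";   // creates an ordinary expando property keyed by the string "42", not a Map entry
m[42];           // "foo", read back through the expando
m.get(42);       // undefined -- nothing was ever added to the map
m.size;          // 0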

# Steve Fink (11 years ago)

On 08/27/2013 09:35 AM, Oliver Hunt wrote:

My complaint is that this appears to be removing functionality that has been present in the majority of shipping TA implementations, assuming from LH's comment that Chakra supports expandos.

Note that even in the engines that support expandos, they will probably not survive a structured clone. I just tried in Chrome and they get stripped off. This further limits their utility in today's Web.
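A minimal sketch of the effect described; structuredClone is the modern global for invoking the algorithm (postMessage to a worker was the usual way to trigger it at the time of this thread), and the stripping behavior is the one reported above for Chrome:

var ta = new Uint8Array(4);
ta.foo = "bar";                   // expando, in an engine that allows it
var copy = structuredClone(ta);   // runs the structured clone algorithm
copy instanceof Uint8Array;       // true -- the element data survives
copy.foo;                         // undefined -- the expando is stripped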

# Filip Pizlo (11 years ago)

Here's the part that gets me, though: what is the value of disallowing named properties on typed arrays? Who does this help?

I don't quite buy that this helps users; most of the objects in your program are going to allow custom properties to be added at any point. That's kind of the whole point of programming in a dynamic language. So having one type where it's disallowed doesn't help to clarify thinking.

I also don't buy that it makes anything more efficient. We only incur overhead from named properties if you actually add named properties to a typed array, and in that case we incur roughly the overhead you'd expect (those named properties are a touch slower than named properties on normal objects, and you obviously need to allocate some extra space to store those named properties).

# Allen Wirfs-Brock (11 years ago)

This thread has convinced me that Typed Arrays should be born extensible.

Actually, my subclass example in the thread started me down that path. In many cases where you might subclass an array you will want to add per instance state. You can expose a getter/setter on the prototype but the state still needs to be associated with the individual instances. Expando properties (or even better properties added in the @@create method) are the most natural way to represent that state.
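A minimal sketch of that subclassing case, written with a constructor rather than the @@create method mentioned above (names are illustrative):

class Matrix extends Float32Array {
  constructor(numRows, numCols) {
    super(numRows * numCols);
    this.numRows = numRows;   // per-instance state on the instance itself --
    this.numCols = numCols;   // only possible if instances are born extensible
  }
}

var m = new Matrix(3, 4);
m.numRows;   // 3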

The Firefox implementors will make this change if it represents TC39 consensus.

I'll put this item on the agenda for the next meeting and see if we can agree on extensible Typed Arrays.

# Mark S. Miller (11 years ago)

Dave Herman's "And the other consistency dimension is between array types and struct types. Is anyone arguing that structs should also have expandos?" surprised me, and convinced me of the opposite conclusion. Do you think instances of struct types should be extensible?

# Brendan Eich (11 years ago)

Filip Pizlo <fpizlo at apple.com> August 28, 2013 11:01 PM Here's the part that gets me, though: what is the value of disallowing named properties on typed arrays? Who does this help?

You've heard about symmetry with struct types (ES6), right? Those do not want expandos. We could break symmetry but at some cost. Too small to worry about? Outweighed by benefits?

Sfink's point about structured clone is good, except he wrote "structured clone" and then angels cried... tears of blood.

I don't quite buy that this helps users; most of the objects in your program are going to allow custom properties to be added at any point. That's kind of the whole point of programming in a dynamic language. So having one type where it's disallowed doesn't help to clarify thinking.

There are other such types a-coming :-).

I also don't buy that it makes anything more efficient. We only incur overhead from named properties if you actually add named properties to a typed array, and in that case we incur roughly the overhead you'd expect (those named properties are a touch slower than named properties on normal objects, and you obviously need to allocate some extra space to store those named properties).

Honest q: couldn't you squeeze one more word out if JSC typed arrays were non-extensible?

# Allen Wirfs-Brock (11 years ago)

On Aug 30, 2013, at 8:53 AM, Mark S. Miller wrote:

Dave Herman's "And the other consistency dimension is between array types and struct types. Is anyone arguing that structs should also have expandos?" surprised me, and convinced me of the opposite conclusion. Do you think instances of struct types should be extensible?

I think the right way to think about structs is as a record structure with no properties, whose fixed behavior is provided by a "wrapper". Very similar to the ES primitives except that structs can be mutable. The way to associate properties with structs is to encapsulate them in an object, preferably via a class definition. If we go that route we can reach the point where ES classes have fixed-shape internal state defined as-if by a struct.

Typed Arrays are a different beast that already exist in the real world. I don't see any need for consistency between Typed Arrays and struct types. Consistency between Typed Arrays and Array is more important.

# David Herman (11 years ago)

On Aug 30, 2013, at 9:39 AM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

I think the right way to think about structs is as a record structure with no properties, whose fixed behavior is provided by a "wrapper". Very similar to the ES primitives except that structs can be mutable. The way to associate properties with structs is to encapsulate them in an object, preferably via a class definition. If we go that route we can reach the point where ES classes have fixed-shape internal state defined as-if by a struct.

I might give a slightly different angle on this, and describe structs as objects with a fixed template for their own properties. They are still objects, they still inherit from prototypes. But they have a predefined set of own properties.

Typed Arrays are a different beast that already exist in the real world. I don't see any need for consistency between Typed Arrays and struct types. Consistency between Typed Arrays and Array is more important.

Mostly agreed, except I'd just refine that to say there's no need for consistency in this dimension. It would be a shame if typed arrays weren't generalized by the typed objects API in general, and I worked hard to make the pieces fit together. That nuance aside, the fact that, in practice, arrays are patched with additional properties (in fact, IIRC the ES6 template strings API adds properties to arrays) suggests that non-extensibility would be a real incompatibility between arrays and typed arrays. So I'm cool with making typed arrays -- but not structs -- extensible.

# Oliver Hunt (11 years ago)

On Aug 30, 2013, at 10:13 AM, David Herman <dherman at mozilla.com> wrote:

On Aug 30, 2013, at 9:39 AM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

I think the right way to think about structs is as a record structure with no properties, whose fixed behavior is provided by a "wrapper". Very similar to the ES primitives except that structs can be mutable. The way to associate properties with structs is to encapsulate them in an object, preferably via a class definition. If we go that route we can reach the point where ES classes have fixed-shape internal state defined as-if by a struct.

I might give a slightly different angle on this, and describe structs as objects with a fixed template for their own properties. They are still objects, they still inherit from prototypes. But they have a predefined set of own properties.

Typed Arrays are a different beast that already exist in the real world. I don't see any need for consistency between Typed Arrays and struct types. Consistency between Typed Arrays and Array is more important.

Mostly agreed, except I'd just refine that to say there's no need for consistency in this dimension. It would be a shame if typed arrays weren't generalized by the typed objects API in general, and I worked hard to make the pieces fit together. That nuance aside, the fact that, in practice, arrays are patched with additional properties (in fact, IIRC the ES6 template strings API adds properties to arrays) suggests that non-extensibility would be a real incompatibility between arrays and typed arrays. So I'm cool with making typed arrays -- but not structs -- extensible.

I think of TypedArrays as being Arrays of structs with a fixed type/shape - the Array itself is a regular object with regular property characteristics, whereas the individual elements are all value types.

For example, say I have a struct type S, and make a regular Array filled with S. Aside from the poor performance, this is now essentially what a typed array of structs is. What is the reason for making the fast version of an array of structs lose the features of a regular array filled with structs?
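A minimal sketch of that comparison, using the StructType/ArrayType constructors from the binary data proposal as they are used elsewhere in this thread (these constructors are proposed, not shipping; names are illustrative):

var S = new StructType({ x: float32, y: float32 });
var FastPoints = new ArrayType(S, 2);

var slow = [ new S(), new S() ];   // regular Array filled with structs: extensible as usual
slow.boundsDirty = true;           // expando metadata is fine here

var fast = new FastPoints();       // typed array of structs: same data, fixed layout
// fast.boundsDirty = true;        // disallowed under the non-extensible draft being debated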

# Filip Pizlo (11 years ago)

On Aug 30, 2013, at 9:28 AM, Brendan Eich <brendan at mozilla.com> wrote:

Hi,

Filip Pizlo <fpizlo at apple.com> August 28, 2013 11:01 PM Here's the part that gets me, though: what is the value of disallowing named properties on typed arrays? Who does this help?

You've heard about symmetry with struct types (ES6), right? Those do not want expandos. We could break symmetry but at some cost. Too small to worry about? Outweighed by benefits?

It's a fair point. I don't see where it would break semantics but I'll try to do a thought experiment to see if it makes things confusing or inconvenient to the programmer. Whether or not I care depends on the answers to the following questions:

  1. Is the purpose to simplify programming by allowing you to add static typing?
  2. Are we trying to help JITs?
  3. Do we just want a sensible way of mapping to binary data? (For both DOM and C-to-JS compilers)

It appears that (1) is a non-goal; if it was a goal then we'd have a different aliasing story, we wouldn't have the byteOffset/byteLength/buffer properties, and there would be zero discussion of binary layout. We'd also bake the types deeper into the language. This doesn't simplify programming if you have to write code in a bifurcated world with both traditional JS objects (all dynamic, objects can point at each other, but the backing stores of objects don't alias each other) and binary objects (have some types to describe layout, but can't have arbitrary object graphs, and backing stores of distinct objects may alias each other).

(2) appears to be a bit more of a pie-in-the-sky dream than a goal. A decent JIT will already recognize idioms where the programmer created an object with a clear sequence of fields and then uses that object in a monomorphic way. Both 'function Constructor() { this.a = ...; this.b = ...; }' and '{a:..., b:...}' will get recognized, through some combination of run-time and compile-time analysis, as indicating that the user intends to have a type that has 'a' and 'b' as fields. It's true that binary data makes this explicit, but the JIT can fall apart in the same way as it does already for normal objects: the references to these objects tend to be untyped, so the programmer can inadvertently introduce polymorphism and lose some (most?) of the benefits.

Because binary data objects will have potentially aliased backing stores, you get the additional problem that you can't do any field-based aliasing analysis: for a normal JS object, if I know that 'o.a' accesses own-property 'a' and it's not a getter/setter, and 'o.b' accesses own-property 'b' and it's not a getter/setter, then I know that these two accesses don't alias. For binary data, I don't quite have such a guarantee: 'a' can overlap 'b' in some other object.

Also, the fact that a struct type instance might have to know about a buffer along with an offset into that buffer introduces a greater object size overhead than plain JS objects. A plain JS object needs roughly two pieces of overhead: something to identify the type and a pointer reserved for when you store more things into it. A struct type instance will need roughly three pieces of overhead: something to identify the type, a pointer to the buffer, and some indication of the offset within that buffer.

The only performance win from struct types is probably that it gives you an explicit tuple flattening. That's kind of cool, but I remember that C# had struct types while Java didn't and yet JVMs still killed .NET on any meaningful measure of performance.

So it appears that the most realistic goal is (3). In that case, I can't imagine a case where arrays being expandos but struct types being totally frozen will make the task of struct mapping to native code any harder. If you're a programmer who doesn't want a typed array to have custom properties, then you won't give it custom properties - simple as that. No need to enforce the invariant.

Sfink's point about structured clone is good, except he wrote "structured clone" and then angels cried... tears of blood.

I don't quite buy that this helps users; most of the objects in your program are going to allow custom properties to be added at any point. That's kind of the whole point of programming in a dynamic language. So having one type where it's disallowed doesn't help to clarify thinking.

There are other such types a-coming :-).

And I'll be grumpy about some of those, too. ;-)

I also don't buy that it makes anything more efficient. We only incur overhead from named properties if you actually add named properties to a typed array, and in that case we incur roughly the overhead you'd expect (those named properties are a touch slower than named properties on normal objects, and you obviously need to allocate some extra space to store those named properties).

Honest q: couldn't you squeeze one more word out if JSC typed arrays were non-extensible?

I'd love to hear about this from the SM and V8 peeps. Here's my take. A typed array must know about the following bits of information:

T: Its own type.
B: A base pointer (not the buffer but the thing you index off of).
L: Its length.

But that only works if it owns its buffer - that is it was allocated using for example "new Int8Array(100)" and you never used the .buffer property. So in practice you also need:

R: Reserved space for a pointer to a buffer.

Now observe that 'R' can be reused for either a buffer pointer or a pointer to overflow storage for named properties. If you have both a buffer and overflow storage, you can save room in the overflow storage for the buffer pointer (i.e. displace the buffer pointer into the property storage). We play a slightly less ambitious trick, where R either points to overflow storage or NULL. Most typed arrays don't have a .buffer, but once they get one, we allocate overflow storage and reserve a slot in there for the buffer pointer. So you pay one more word of overhead for typed arrays with buffers even if they don't have named properties. I think that's probably good enough - I mean, in that case, you have a freaking buffer object as well so you're not exactly conserving memory.

But, using R as a direct pointer to the buffer would be a simple hack if we really felt like saving one word when you also already have a separate buffer object.

I could sort of imagine going further and using T as a displaced pointer and saving an extra word, but that might make type checks more expensive, sometimes.

So lets do the math, on both 32-bit and 64-bit (where 64-bit implies 64-bit pointers), to see how big this would be.

32-bit:

T = 4 bytes, B = 4 bytes, L = 4 bytes, R = 4 bytes. So, you get 16 bytes of overhead for most typed arrays, and 20 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.

64-bit:

T = 8 bytes, B = 8 bytes, L = 4 bytes, R = 8 bytes. This implies you have 4 bytes to spare if you want objects 8-byte aligned (we do); we use this for some extra bookkeeping. So you get 32 bytes of overhead for most typed arrays, and 40 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.

As far as I can tell, this object model compresses typed arrays about as much as they could be compressed while also allowing them to be extensible. The downside is that you pay a small penalty for typed arrays that have an "active" buffer, in the sense that you either accessed the .buffer property or you constructed the typed array using a constructor that takes a buffer as an argument.

# Brendan Eich (11 years ago)

David Herman wrote:

Typed Arrays are a different beast that already exist in the real world. I don't see any need for consistency between Typed Arrays and struct types. Consistency between Typed Arrays and Array is more important.

Mostly agreed, except I'd just refine that to say there's no need for consistency in this dimension. It would be a shame if typed arrays weren't generalized by the typed objects API in general, and I worked hard to make the pieces fit together. That nuance aside,

I think you are too kind :-|.

Allen, the point about typed arrays being different from structs because some implementations make the former extensible and the latter do not exist in any implementation yet is a just-so story, half of which is hypothetical! I could just as well argue from Firefox's non-extensible precedent if I wanted to.

The better argument is one that accounts for why structs are not extensible and how typed arrays differ, if they do differ, by design.

# Brendan Eich (11 years ago)

Thanks for the reply, I'll let SM and V8 peeps speak for themselves (they retired my SM number ;-).

Filip Pizlo <fpizlo at apple.com> August 30, 2013 10:41 AM

On Aug 30, 2013, at 9:28 AM, Brendan Eich <brendan at mozilla.com> wrote:

Hi,

Filip Pizlo <fpizlo at apple.com> August 28, 2013 11:01 PM Here's the part that gets me, though: what is the value of disallowing named properties on typed arrays? Who does this help?

You've heard about symmetry with struct types (ES6), right? Those do not want expandos. We could break symmetry but at some cost. Too small to worry about? Outweighed by benefits?

It's a fair point. I don't see where it would break semantics but I'll try to do a thought experiment to see if it makes things confusing or inconvenient to the programmer. Whether or not I care depends on the answers to the following questions:

  1. Is the purpose to simplify programming by allowing you to add static typing?

No, we put a stake through that cold heart.

  2. Are we trying to help JITs?

Yes, I think so (SM retirement makes this easy for me to say ;-). Even excluding type inference as done in SpiderMonkey, just using PICs, structs over against objects can help JITs avoid boxing values, same as typed arrays do compared to Arrays.

Sometimes you want a product of different types, not a vector of same-typed elements. Typed arrays were designed so you would alias two views, crazypants. Structs put on sanepants. Just making sure the use-case has clear motivation here.

If so, then the JIT wins implemented today among multiple engines for typed array element loads and stores will almost certainly be wanted for struct field traffic too.

  3. Do we just want a sensible way of mapping to binary data? (For both DOM and C-to-JS compilers)

Yes, and don't forget the GPU as well ("DOM" doesn't take that in).

# Filip Pizlo (11 years ago)

On Aug 30, 2013, at 12:31 PM, Brendan Eich <brendan at mozilla.com> wrote:

Thanks for the reply, I'll let SM and V8 peeps speak for themselves (they retired my SM number ;-).

Filip Pizlo <fpizlo at apple.com> August 30, 2013 10:41 AM

On Aug 30, 2013, at 9:28 AM, Brendan Eich <brendan at mozilla.com> wrote:

Hi,

Filip Pizlo <fpizlo at apple.com> August 28, 2013 11:01 PM Here's the part that gets me, though: what is the value of disallowing named properties on typed arrays? Who does this help?

You've heard about symmetry with struct types (ES6), right? Those do not want expandos. We could break symmetry but at some cost. Too small to worry about? Outweighed by benefits?

It's a fair point. I don't see where it would break semantics but I'll try to do a thought experiment to see if it makes things confusing or inconvenient to the programmer. Whether or not I care depends on the answers to the following questions:

  1. Is the purpose to simplify programming by allowing you to add static typing?

No, we put a stake through that cold heart.

  2. Are we trying to help JITs?

Yes, I think so (SM retirement makes this easy for me to say ;-). Even excluding type inference as done in SpiderMonkey, just using PICs, structs over against objects can help JITs avoid boxing values, same as typed arrays do compared to Arrays.

This isn't really a win, at least not for us, anyway. We don't "box" values in the sense of allocating stuff in the heap; we only tag them. The tagging operations are just too darn cheap to worry about getting rid of them. For example, int untagging is basically free. Double untagging is not quite free, but our double array inference (for normal JS arrays) is too darn good - you'd have to try quite hard to find a case where using a Float64Array gives you a real win over a JS array into which you only stored doubles. One exception is that our double array inference for normal JS arrays fails if you store NaN. Our current philosophy towards that is "oh well" - it's not clear that this arises enough that we should care.

Sometimes you want a product of different types, not a vector of same-typed elements. Typed arrays were designed so you would alias two views, crazypants. Structs put on sanepants. Just making sure the use-case has clear motivation here.

OK - by "sanepants" do you mean that there is no weirdo aliasing? Going back to my example of field 'a' aliasing field 'b' - is it possible?

If so, then the JIT wins implemented today among multiple engines for typed array element loads and stores will almost certainly be wanted for struct field traffic too.

I think you're being too generous to the typed array optimizations. Vanilla JS arrays are catching up, or have already caught up and surpassed, depending on how you look at it.

It sure is tempting to add type thingies to help JITs but I think we're quickly approaching a world where adding primitive types to JS will be a bit like marking your Java methods final in the mistaken belief that it will unlock some extra devirtualization, or marking your variables 'register' in C thinking that this will make your code sooooper fast.

  3. Do we just want a sensible way of mapping to binary data? (For both DOM and C-to-JS compilers)

Yes, and don't forget the GPU as well ("DOM" doesn't take that in).

Right! I totally buy the native mapping story for struct types. I just don't buy the performance story. ;-)

# Brendan Eich (11 years ago)

Filip Pizlo wrote:

Sometimes you want a product of different types, not a vector of same-typed elements. Typed arrays were designed so you would alias two views, crazypants. Structs put on sanepants. Just making sure the use-case has clear motivation here.

OK - by "sanepants" do you mean that there is no weirdo aliasing? Going back to my example of field 'a' aliasing field 'b' - is it possible?

Summoning dherman here, but yes: sanepants in my book means no aliasing -- not just no aliasing required, no aliasing possible.

# David Herman (11 years ago)

On Aug 30, 2013, at 12:46 PM, Filip Pizlo <fpizlo at apple.com> wrote:

OK - by "sanepants" do you mean that there is no weirdo aliasing? Going back to my example of field 'a' aliasing field 'b' - is it possible?

There is plenty of aliasing possible, but I'm trying to understand what you mean specifically by "weirdo" aliasing. Do you mean that in a given struct it's impossible for it to have two fields that alias each other? That's definitely true. E.g., if I have a struct type

var T = new StructType({ a: t1, b: t2, ... });

then for any given instance x of T, I know for sure that x.a and x.b do not alias the same storage.

# David Herman (11 years ago)

On Aug 30, 2013, at 3:46 PM, David Herman <dherman at mozilla.com> wrote:

E.g., if I have a struct type

var T = new StructType({ a: t1, b: t2, ... });

then for any given instance x of T, I know for sure that x.a and x.b do not alias the same storage.

(Except, of course, if t1 and t2 are pointer types like Object.)

# Filip Pizlo (11 years ago)

On Aug 30, 2013, at 3:46 PM, David Herman <dherman at mozilla.com> wrote:

On Aug 30, 2013, at 12:46 PM, Filip Pizlo <fpizlo at apple.com> wrote:

OK - by "sanepants" do you mean that there is no weirdo aliasing? Going back to my example of field 'a' aliasing field 'b' - is it possible?

There is plenty of aliasing possible, but I'm trying to understand what you mean specifically by "weirdo" aliasing. Do you mean that in a given struct it's impossible for it to have two fields that alias each other? That's definitely true. E.g., if I have a struct type

var T = new StructType({ a: t1, b: t2, ... });

then for any given instance x of T, I know for sure that x.a and x.b do not alias the same storage.

Yup, that's what I was concerned about. And reading over the spec I agree. But just for sanity, we're guaranteeing this because you cannot create a struct type instance by pointing into an arbitrary offset of a buffer - you can only instantiate new ones, or alias structs nested as fields in other structs. Right?

# David Herman (11 years ago)

On Aug 30, 2013, at 3:54 PM, Filip Pizlo <fpizlo at apple.com> wrote:

Yup, that's what I was concerned about. And reading over the spec I agree. But just for sanity, we're guaranteeing this because you cannot create a struct type instance by pointing into an arbitrary offset of a buffer - you can only instantiate new ones, or alias structs nested as fields in other structs. Right?

Hm, I must be missing something obvious, but I don't see why you'd need that restriction to guarantee this. A struct type with two different fields guarantees they're at different offsets from the base:

var T = new StructType({
    a: int32, // offset 0
    b: int32  // offset 4
});

so even if I point an instance of T into the middle of a struct, x.a and x.b must be at different offsets.

# Filip Pizlo (11 years ago)

On Aug 30, 2013, at 4:03 PM, David Herman <dherman at mozilla.com> wrote:

On Aug 30, 2013, at 3:54 PM, Filip Pizlo <fpizlo at apple.com> wrote:

Yup, that's what I was concerned about. And reading over the spec I agree. But just for sanity, we're guaranteeing this because you cannot create a struct type instance by pointing into an arbitrary offset of a buffer - you can only instantiate new ones, or alias structs nested as fields in other structs. Right?

Hm, I must be missing something obvious, but I don't see why you'd need that restriction to guarantee this. A struct type with two different fields guarantees they're at different offsets from the base:

var T = new StructType({
    a: int32, // offset 0
    b: int32  // offset 4
});

so even if I point an instance of T into the middle of a struct, x.a and x.b must be at different offsets.

Sorry, that's not the issue. Consider this. In one part of the program I say:

var Point2D = new StructType({x:uint32, y:uint32});

And in another part of the program I encounter a function like this:

function foo(a, b) {
    var blah = a.x;
    b.y = 53;
    return blah + a.x; // does this return a.x * 2 or could it return a.x + 53?
}

Let's say that I've proven that a, b both refer to instances of Point2D. Can I be sure that this returns a.x * 2 or could it alternatively return a.x + 53?

A closely related case is:

var Point2D = new StructType({x:uint32, y:uint32});
var Vector = new StructType({x:uint32, y:uint32});

function foo(a, b) {
    var blah = a.x;
    b.x = 53;
    return blah + a.x; // does this return a.x * 2 or could it return a.x + 53?
}

Let's say that I've proven that a refers to a Point2D and b refers to a Vector. Can I be sure that this returns a.x * 2 or could it alternatively return a.x + 53?

The analogue in typed arrays is:

function foo(a, b) {
    var blah = a[0];
    b[1] = 53;
    return blah + a[0]; // does this return a[0] * 2 or could it return a[0] + 53, or something even weirder?
}

Even if I've proven that a, b are typed arrays - of either same or different type - this function could still return a[0] + 53 because the buffers may be weirdly aliased. In fact for anything other than 1-byte element typed arrays, the above function has endianness-dependent results, so it may return something other than either a[0] * 2 or a[0] + 53.
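A minimal sketch of how the two arguments to foo can come to alias (element types and sizes are illustrative):

var buf = new ArrayBuffer(16);
var a = new Int32Array(buf);   // a[0..3] over bytes 0..15
var b = new Int8Array(buf);    // b[0..15] over the same bytes
b[1] = 53;                     // rewrites one byte inside a[0]; how a[0] changes depends on endianness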

This is the kind of weirdness that I hope struct types don't have, if their alleged purpose is to help people optimize their code.

Now, I don't object to typed arrays having this behavior - it is what it is, and it's certainly useful for doing graphics type stuff. It's also indispensable for emscripten. And I'm OK with struct types also having this behavior; in fact I would expect them to have such behavior if they're supposed to help C-to-JS code generators or the like. But if we have such behavior then we ought not also claim that struct types are meant for optimization and helping JITs. It's particularly intriguing to me that if struct types are specified to allow such aliasing, then using struct types means you get less load elimination and hoisting than you would in regular untyped JS code. That's kind of a big deal considering that load elimination and hoisting are fundamental compiler optimizations.

# David Herman (11 years ago)

On Aug 30, 2013, at 4:20 PM, Filip Pizlo <fpizlo at apple.com> wrote:

This is the kind of weirdness that I hope struct types don't have, if their alleged purpose is to help people optimize their code.

This is a great point, thanks. On the one hand, I concluded long ago the exposure of the buffer seems like something we can't really avoid, since it's necessary for WebGL arrays-of-structs, which is a main use case. On the other hand, that doesn't necessarily mean we need the ability to overlay a struct type into random points in a buffer. We'd have to do this carefully, though: I believe we'd have to restrict overlaying to just the legacy typed array constructors, not to any new kinds of array types (since they may have structs nested inside them), in order to guarantee lack of aliasing. And then we'd want to make sure this covered the WebGL use cases.

Now, I don't object to typed arrays having this behavior - it is what it is, and it's certainly useful for doing graphics type stuff. It's also indispensable for emscripten. And I'm OK with struct types also having this behavior; in fact I would expect them to have such behavior if they're supposed to help C-to-JS code generators or the like.

Not really for C-to-JS, no. I do want them to be useful for e.g. Java-to-JS code generators, but those shouldn't need the casting.

# Filip Pizlo (11 years ago)

On Aug 30, 2013, at 4:42 PM, David Herman <dherman at mozilla.com> wrote:

On Aug 30, 2013, at 4:20 PM, Filip Pizlo <fpizlo at apple.com> wrote:

This is the kind of weirdness that I hope struct types don't have, if their alleged purpose is to help people optimize their code.

This is a great point, thanks. On the one hand, I concluded long ago the exposure of the buffer seems like something we can't really avoid, since it's necessary for WebGL arrays-of-structs, which is a main use case. On the other hand, that doesn't necessarily mean we need the ability to overlay a struct type into random points in a buffer. We'd have to do this carefully, though: I believe we'd have to restrict overlaying to just the legacy typed array constructors, not to any new kinds of array types (since they may have structs nested inside them), in order to guarantee lack of aliasing. And then we'd want to make sure this covered the WebGL use cases.

I think it's better if you pick one use case and get it totally right. You're not going to get the "optimize my JS code with types" use case right. So stick to the binary data mapping use case, and allow arbitrary aliasing.

# Brendan Eich (11 years ago)

Filip Pizlo wrote:

I think it's better if you pick one use case and get it totally right. You're not going to get the "optimize my JS code with types" use case right. So stick to the binary data mapping use case, and allow arbitrary aliasing.

I am on the same page; any quibbling from me about your question 2 has an answer no stronger than "the machine-type info doesn't hurt performance, and could help some engines".

It would hurt developers if they fell under the influence of some bad use-structs-for-speed cult, but (especially from what you say about JSC) this doesn't sound like a big risk.

# Dmitry Lomov (11 years ago)

(sorry for not getting on this thread earlier - I was off the grid for a bit)

I think we should consider fixed-length ArrayTypes in this discussion. They can be parts of structs. Consider

var A = new ArrayType(uint8, 10);
var S = new Struct({a : A});
var a = new A();
var s = new S();
a[0] = 10;
a.foo = "foo";
s.a = a;

Assignment to a struct field is a copy, essentially. Of course, s.a[0] is now 10. But does s.a.foo exist? In the current semantics, there is no place to store it, because a field 'a' of struct 'S' is just a storage designator

  • there is no "place" in struct s to store the expando properties of fields and fields of fields and fields of fields of fields....

Therefore in current semantics fixed-length ArrayTypes, just like StructTypes, are either non-expandable, or have to lose their expanded properties on assignments - big surprise for the user!
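Continuing the example above, the observable consequence would be (a sketch, assuming the copy semantics just described):

s.a[0];    // 10 -- the element data was copied into s's storage
s.a.foo;   // undefined -- there is nowhere in s's storage to keep the expando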

Now of course variable-sized ArrayTypes do not suffer from this issue, but one could argue for consistency with fixed-sized ArrayTypes.

# Andreas Rossberg (11 years ago)

On 4 September 2013 10:23, Dmitry Lomov <dslomov at google.com> wrote:

I think we should consider fixed-length ArrayTypes in this discussion. They can be parts of structs. Consider

var A = new ArrayType(uint8, 10);
var S = new Struct({a : A});
var a = new A();
var s = new S();
a[0] = 10;
a.foo = "foo";
s.a = a;

Assignment to a struct field is a copy, essentially. Of course, s.a[0] is now 10. But does s.a.foo exist? In the current semantics, there is no place to store it, because a field 'a' of struct 'S' is just a storage designator

  • there is no "place" in struct s to store the expando properties of fields and fields of fields and fields of fields of fields....

Therefore in current semantics fixed-length ArrayTypes, just like StructTypes, are either non-expandable, or have to lose their expanded properties on assignments - big surprise for the user!

Now of course variable-sized ArrayTypes do not suffer from this issue, but one could argue for consistency with fixed-sized ArrayTypes.

I was about to make the same point. :)

As part of binary data, typed arrays are implicitly constructed "on the fly" as views on a backing store. Any notion of identity -- which is the prerequisite for state -- is not particularly meaningful in this setting. Also, it is preferable to make them as lightweight as possible.

As for other typed arrays, the difference is subtle, and I'd rather go for consistency.

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 3:05 AM, Andreas Rossberg <rossberg at google.com> wrote:

On 4 September 2013 10:23, Dmitry Lomov <dslomov at google.com> wrote:

I think we should consider fixed-length ArrayTypes in this discussion. They can be parts of structs. Consider

var A = new ArrayType(uint8, 10);
var S = new Struct({a : A});
var a = new A();
var s = new S();
a[0] = 10;
a.foo = "foo";
s.a = a;

Assignment to a struct field is a copy, essentially. Of course, s.a[0] is now 10. But does s.a.foo exist? In the current semantics, there is no place to store it, because a field 'a' of struct 'S' is just a storage designator

  • there is no "place" in struct s to store the expando properties of fields and fields of fields and fields of fields of fields....

Therefore in current semantics fixed-length ArrayTypes, just like StructTypes, are either non-expandable, or have to lose their expanded properties on assignments - big surprise for the user!

Now of course variable-sized ArrayTypes do not suffer from this issue, but one could argue for consistency with fixed-sized ArrayTypes.

I was about to make the same point. :)

As part of binary data, typed arrays are implicitly constructed "on the fly" as views on a backing store. Any notion of identity -- which is the prerequisite for state -- is not particularly meaningful in this setting.

Are you proposing changing how == and === work for typed arrays? If not then this whole argument is moot.

Also, it is preferable to make them as lightweight as possible.

See my previous mail. You gain zero space and zero performance from making typed arrays non extensible.

# Andreas Rossberg (11 years ago)

On 4 September 2013 16:44, Filip Pizlo <fpizlo at apple.com> wrote:

On Sep 4, 2013, at 3:05 AM, Andreas Rossberg <rossberg at google.com> wrote: As part of binary data, typed arrays are implicitly constructed "on the fly" as views on a backing store. Any notion of identity -- which is the prerequisite for state -- is not particularly meaningful in this setting.

Are you proposing changing how == and === work for typed arrays? If not then this whole argument is moot.

No, they are just rather useless operations on data views. That doesn't make the argument moot.

Also, it is preferable to make them as lightweight as possible.

See my previous mail. You gain zero space and zero performance from making typed arrays non extensible.

I think you are jumping to conclusions. You can very well optimize the representation of typed arrays if they don't have user-defined properties. Whether that's worth it I can't tell without experiments. Admittedly, it's a minor point.

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 7:55 AM, Andreas Rossberg <rossberg at google.com> wrote:

On 4 September 2013 16:44, Filip Pizlo <fpizlo at apple.com> wrote:

On Sep 4, 2013, at 3:05 AM, Andreas Rossberg <rossberg at google.com> wrote: As part of binary data, typed arrays are implicitly constructed "on the fly" as views on a backing store. Any notion of identity -- which is the prerequisite for state -- is not particularly meaningful in this setting.

Are you proposing changing how == and === work for typed arrays? If not then this whole argument is moot.

No, they are just rather useless operations on data views. That doesn't make the argument moot.

The point is that as soon as you're using the copy '=' on binary data fields, you're already losing an observable notion of object identity. The '=' here is already unlike the '=' operator for languages that have true value types - in those languages you wouldn't be able to observe if you got the same typed array or a different one but with the same underlying data. In JS you will be able to observe this with '==' and '==='. Hence, being able to also observe that you got a different one because you lost some meta-data (i.e. custom named properties) doesn't change the fact that the quirky semantics were already observable to the user.

Also, it is preferable to make them as lightweight as possible.

See my previous mail. You gain zero space and zero performance from making typed arrays non extensible.

I think you are jumping to conclusions. You can very well optimize the representation of typed arrays if they don't have user-defined properties. Whether that's worth it I can't tell without experiments.

I don't think this is a matter of opinion. There is state that typed arrays are required to store but that is not accessed on the most critical of hot paths, which naturally allows us to play displaced pointer tricks.

It would also be useful, if you want to argue this point, if you replied to my previous discussion of why there is no performance difference between expandable and non-expandable typed arrays. I'll copy that here in case you missed it:

A typed array must know about the following bits of information:

T: Its own type.
B: A base pointer (not the buffer but the thing you index off of).
L: Its length.

But that only works if it owns its buffer - that is it was allocated using for example "new Int8Array(100)" and you never used the .buffer property. So in practice you also need:

R: Reserved space for a pointer to a buffer.

Now observe that 'R' can be reused for either a buffer pointer or a pointer to overflow storage for named properties. If you have both a buffer and overflow storage, you can save room in the overflow storage for the buffer pointer (i.e. displace the buffer pointer into the property storage). We play a slightly less ambitious trick, where R either points to overflow storage or NULL. Most typed arrays don't have a .buffer, but once they get one, we allocate overflow storage and reserve a slot in there for the buffer pointer. So you pay one more word of overhead for typed arrays with buffers even if they don't have named properties. I think that's probably good enough - I mean, in that case, you have a freaking buffer object as well so you're not exactly conserving memory.

But, using R as a direct pointer to the buffer would be a simple hack if we really felt like saving one word when you also already have a separate buffer object.

I could sort of imagine going further and using T as a displaced pointer and saving an extra word, but that might make type checks more expensive, sometimes.

So lets do the math, on both 32-bit and 64-bit (where 64-bit implies 64-bit pointers), to see how big this would be.

32-bit:

T = 4 bytes, B = 4 bytes, L = 4 bytes, R = 4 bytes. So, you get 16 bytes of overhead for most typed arrays, and 20 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.

64-bit:

T = 8 bytes, B = 8 bytes, L = 4 bytes, R = 8 bytes. This implies you have 4 bytes to spare if you want objects 8-byte aligned (we do); we use this for some extra bookkeeping. So you get 32 bytes of overhead for most typed arrays, and 40 if you need to use R as an overflow storage pointer and displace the buffer pointer into the overflow storage.

As far as I can tell, this object model compresses typed arrays about as much as they could be compressed while also allowing them to be extensible. The downside is that you pay a small penalty for typed arrays that have an "active" buffer, in the case that you either accessed the .buffer property or you constructed the typed array using a constructor that takes a buffer as an argument.

So, how big are your non-expandable typed arrays, and what do they look like? If they're not smaller than 16 bytes in the common case with 32-bit pointers, or 32 bytes in the common case with 64-bit pointers, then there is no performance argument in favor of getting rid of expandable properties.

# Andreas Rossberg (11 years ago)

On 4 September 2013 17:11, Filip Pizlo <fpizlo at apple.com> wrote:

On Sep 4, 2013, at 7:55 AM, Andreas Rossberg <rossberg at google.com> wrote:

On 4 September 2013 16:44, Filip Pizlo <fpizlo at apple.com> wrote:

On Sep 4, 2013, at 3:05 AM, Andreas Rossberg <rossberg at google.com> wrote:

As part of binary data, typed arrays are implicitly constructed "on the fly" as views on a backing store. Any notion of identity -- which is the prerequisite for state -- is not particularly meaningful in this setting.

Are you proposing changing how == and === work for typed arrays? If not then this whole argument is moot.

No, they are just rather useless operations on data views. That doesn't make the argument moot.

The point is that as soon as you're using the copy '=' on binary data fields, you're already losing an observable notion of object identity. The '=' here is already unlike the '=' operator for languages that have true value types - in those languages you wouldn't be able to observe if you got the same typed array or a different one but with the same underlying data. In JS you will be able to observe this with '==' and '==='. Hence, being able to also observe that you got a different one because you lost some meta-data (i.e. custom named properties) doesn't change the fact that the quirky semantics were already observable to the user.

I didn't say it's unobservable -- every twist in the gut is observable in JavaScript. I said it's rather meaningless. That is, from a practical perspective, I'd rather not recommend relying on it, unless you are up for subtle and brittle code.

As far as I can tell, this object model compresses typed arrays about as much as they could be compressed while also allowing them to be extensible. The downside is that you pay a small penalty for typed arrays that have an "active" buffer, in the case that you either accessed the .buffer property or you constructed the typed array using a constructor that takes a buffer as an argument.

I really don't feel like getting into this argument -- as I said it's a minor point. Just note that the optimisation you suggest might not be worth it in every VM (i.e., there could be a substantial impedance mismatch), and moreover, that the above case might be not-so-uncommon.

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 9:03 AM, Andreas Rossberg <rossberg at google.com> wrote:

On 4 September 2013 17:11, Filip Pizlo <fpizlo at apple.com> wrote:

On Sep 4, 2013, at 7:55 AM, Andreas Rossberg <rossberg at google.com> wrote:

On 4 September 2013 16:44, Filip Pizlo <fpizlo at apple.com> wrote:

On Sep 4, 2013, at 3:05 AM, Andreas Rossberg <rossberg at google.com> wrote:

As part of binary data, typed arrays are implicitly constructed "on the fly" as views on a backing store. Any notion of identity -- which is the prerequisite for state -- is not particularly meaningful in this setting.

Are you proposing changing how == and === work for typed arrays? If not then this whole argument is moot.

No, they are just rather useless operations on data views. That doesn't make the argument moot.

The point is that as soon as you're using the copy '=' on binary data fields, you're already losing an observable notion of object identity. The '=' here is already unlike the '=' operator for languages that have true value types - in those languages you wouldn't be able to observe if you got the same typed array or a different one but with the same underlying data. In JS you will be able to observe this with '==' and '==='. Hence, being able to also observe that you got a different one because you lost some meta-data (i.e. custom named properties) doesn't change the fact that the quirky semantics were already observable to the user.

I didn't say it's unobservable -- every twist in the gut is observable in JavaScript. I said it's rather meaningless. That is, from a practical perspective, I'd rather not recommend relying on it, unless you are up for subtle and brittle code.

Are you saying that users shouldn't rely on == on objects? My concern here is that binary data, which is a rather obscure addition to the language, doesn't break mainstream uses of the language. Disallowing custom properties on typed array objects just because binary data assignments lose object identity is silly. You're already losing object identity and it's already observable. Custom properties have nothing to do with this.

# Brendan Eich (11 years ago)

Filip Pizlo wrote:

So, how big are your non-expandable typed arrays, and what do they look like? If they're not smaller than 16 bytes in the common case with 32-bit pointers, or 32 bytes in the common case with 64-bit pointers, then there is no performance argument in favor of getting rid of expandable properties.

I like your analysis, it helps greatly to be quantitative and to talk concretely about implementation trade-offs. However, I don't think it proves as much as you assert.

Suppose I want (as IBM did for years, and may still) to implement IEEE 754r decimal in JS, with minimal overhead. I would need 128 bits of flat storage, no variable length, no .buffer or aliasing, and no expandos. Can binary data help me do that? If so, how small can the thing be? I'd like a multiple of 16 bytes, but on 64-bit targets that does not leave enough room for TBLR and we don't really need BLR anyway.

If we can't implement efficient-enough 754r decimal using binary data, that's sad. Not the end of the world, and it doesn't prove a whole lot about anything (perhaps we'll figure out something next year). But the small, fixed-size array case exists (think of Float32Array 4-vectors, homogeneous coordinates). It seems to me you are giving this use-case short shrift.

# Niko Matsakis (11 years ago)

I think Filip is right that given sufficient cleverness extensible properties for typed objects can be implemented efficiently. The real question is what the behavior SHOULD be. As others have pointed out, we are not forced to support extensible properties for web compat reasons.

I also think it is very important and useful to have typed objects be a generalization of typed arrays. I suspect nobody wants an "almost but not quite the same" set of array types. It'd be my preference that (eventually) the "specification" for typed arrays can just be "var Uint16Array = new ArrayType(uint16)", which I believe is currently plausible.

In light of this consideration, that means that adding extensible properties to typed arrays means adding extensible properties to all "typed objects that are arrays" (that is, instances of some type defined by new ArrayType()).

As Dmitry pointed out, extensible properties are only possible for "top-level" objects. I think this results in a surprising and non-composable spec.

The surprising behavior isn't limited to the copying example that Dmitry gave. Another problem is that instances of array types that are found embedded in other structures don't have the full capabilities of "top-level" instances. Without extensible properties, it is true that if I have a function that is given a typed object (of any kind, array or struct) and uses it, I can also provide it with an instance of that same type that is a part of a bigger structure.

For example:

function doSomething(anArray) {
    anArray[0] = anArray[1];
}

// Invoke doSomething with top-level array
var TwoUint8s = new ArrayType(uint8, 2);
doSomething(new TwoUint8s());

// Invoke doSomething with array that is
// embedded within a struct:
var MyStruct = StructType({a: TwoUint8s});
var instance = new MyStruct();
doSomething(instance.a);

But this no longer works if doSomething makes use of extensible properties:

function doSomething(anArray) {
    anArray[0] = anArray[1];
    anArray.foo = anArray.bar;
}

Now the second use case doesn't work.

To me, it seems a shame to trade a simple story ("typed objects let you define the layout and fields of an object, full stop") for a more complex, non-composable one ("...except for extra fields on arrays, which only work some of the time").

Niko

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 11:33 AM, Brendan Eich <brendan at mozilla.com> wrote:

Filip Pizlo wrote:

So, how big are your non-expandable typed arrays, and what do they look like? If they're not smaller than 16 bytes in the common case with 32-bit pointers, or 32 bytes in the common case with 64-bit pointers, then there is no performance argument in favor of getting rid of expandable properties.

I like your analysis, it helps greatly to be quantitative and to talk concretely about implementation trade-offs. However, I don't think it proves as much as you assert.

Depends on your interpretation of what I'm asserting. ;-) I'm talking about typed array objects - ones that can be pointed to by JavaScript values, and that have a prototype chain, can be aliased, etc.

Suppose I want (as IBM did for years, and may still) to implement IEEE 754r decimal in JS, with minimal overhead. I would need 128 bits of flat storage, no variable length, no .buffer or aliasing, and no expandos. Can binary data help me do that? If so, how small can the thing be? I'd like a multiple of 16 bytes, but on 64-bit targets that does not leave enough room for TBLR and we don't really need BLR anyway.

If we can't implement efficient-enough 754r decimal using binary data, that's sad. Not the end of the world, and it doesn't prove a whole lot about anything (perhaps we'll figure out something next year). But the small, fixed-size array case exists (think of Float32Array 4-vectors, homogeneous coordinates). It seems to me you are giving this use-case short shrift.

I'm not. I care deeply about small arrays. This analysis wasn't merely a thought experiment, it arose from me spending a month trying to figure out how to aggressively reduce the overhead of typed arrays. My original hope was to get down to Java-level overheads and my conclusion was that unless I wanted to severely punish anyone who said .buffer, I'd have to have one more word of overhead than Java (i.e. 16 bytes on 32-bit instead of 12 bytes on 32-bit).

My point is that having custom properties, or not, doesn't change the overhead for the existing typed array spec and hence has no effect on small arrays. The reasons for this include:

  • Typed arrays already have to be objects, and hence have a well-defined behavior on '=='.

  • Typed arrays already have to be able to tell you that they are in fact typed arrays, since JS doesn't have static typing.

  • Typed arrays already have prototypes, and those are observable regardless of expandability. A typed array from one global object will have a different prototype than a typed array from a different global object. Or am I misunderstanding the spec?

  • Typed arrays already have to know about their buffer.

  • Typed arrays already have to know about their offset into the buffer. Or, more likely, they have to have a second pointer that points directly at the base from which they are indexed.

  • Typed arrays already have to know their length.

You're not proposing changing these aspects of typed arrays, right?

The super short message is this: so long as an object obeys object identity on '==' then you can have "free if unused, suboptimal if you use them" custom properties by using a weak map on the side. This is true of typed arrays and it would be true of any other object that does object-style ==. If you allocate such an object and never add a custom property then the weak map will never have an entry for it; but if you put custom properties in the object then the map will have things in it. But with typed arrays you can do even better as my previous message suggests: so long as an object has a seldom-touched field and you're willing to eat an extra indirection or an extra branch on that field, you can have "free if unused, still pretty good if you use them" custom properties by displacing that field. Typed arrays have both of these properties right now and so expandability is a free lunch.
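
As a concrete illustration of the weak-map-on-the-side strategy (a minimal sketch, not any particular engine's implementation), the expandos can live entirely outside the object; objects that never receive a custom property never appear in the map at all.

var expandos = new WeakMap();

function setExpando(obj, key, value) {
    var props = expandos.get(obj);
    if (!props) {
        // The first expando pays for an entry; untouched objects never get one.
        props = {};
        expandos.set(obj, props);
    }
    props[key] = value;
}

function getExpando(obj, key) {
    var props = expandos.get(obj);
    return props ? props[key] : undefined;
}

var ta = new Float32Array(4);
setExpando(ta, "rows", 2);
getExpando(ta, "rows"); // 2; a view that never gets expandos costs nothing extra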

Still find this discussion amusing? Here's the long story: It is these things that I list above that lead to a 16-byte overhead on 32-bit, and a 32-byte overhead on 64-bit in the best "sane" case. Giving typed array objects expandability doesn't add to this overhead, because two of the fields necessary to implement the above (the type, and the buffer) can be displaced for pointing to property storage. Any imaginable attempt to reduce the overhead incurred by the information - using BBOP (big bag of pages) for the type, using an out-of-line weak map for the buffer or the type, encoding some of the bits inside the pointer to the typed array, etc. - can also be used to eradicate any space overhead you'd need for custom properties, so long as you're on board with the "free if unused, sub-optimal if you use them" philosophy.

So if we did introduce a new type that has lower overheads, for example a new kind of typed arrays - or an entirely new kind of type, say Int64 or FloatDecimal or whatever - then for those new types we can revisit some of the cruft listed above. We can say that they don't have an observable buffer, or that their prototype isn't observable, or that they have string-like == behavior - and then doing those things will indeed reduce space overhead. Certainly if you introduced a new kind of typed arrays that didn't have a .buffer, then you could definitely shave off one pointer. Or if you were willing to say that using .buffer was super expensive (a direction I rejected based on the typed array code I saw) then you would also be able to shave off one pointer. And you'd be able to do this regardless of whether typed arrays allowed for custom properties - you can allow for custom properties for free so long as typed array objects obey object identity on == since if you really wanted them you could just use a side table.

So, to sum up my point: if you already implement typed arrays as they are currently specified, then adding custom properties to them will not increase size or reduce performance. This argument isn't true for all types - for example adding custom properties to strings wouldn't be as easy - and that argument may not be true for any future FloatingDecimal or Int64 type. But it is true for typed arrays right now. My particular way of implementing "free if unused" custom properties is to displace the .buffer pointer but even if you got rid of that then I could still support "free if unused" custom properties so long as object identity worked.

As a final note, I was doing a thought-experiment of how to optimize Int64 in a hypothetical world where the new Int64 type had to behave as if it was implemented totally in JS and it was expandable, just for fun. I came to two conclusions:

  • If a VM will always heap-allocate exactly 64 bits of memory (not more, not less) for each Int64, in a special side-heap without any object header overhead, then you can totally support fully expandable Int64's and you can make the prototype observable and you can even allow it to be changed. You do it roughly as follows: the type of the object (i.e. the fact that it's an Int64) is identified by its place in the heap; the prototype is also identified by the place (as in BBOP, see page 3 of dspace.mit.edu/bitstream/handle/1721.1/6278/AIM-420.pdf?sequence=2); and any custom properties are stored in a weak table on the side. This gives you excellent performance so long as you never assign proto and never add custom properties, and slightly crappy performance if you do either of those things (notably, assigning to proto may require an evacuation or poisoning the type table entry for an entire page). The outcome is that Int64 instances behave like full-fledged objects with zero compromises. If programmers stay away from the crazy then they'll have a great time, but if they want to go bananas then the language still allows it but with some performance cost.

  • If the VM wants to go further and create immediate representations of some or all Int64's, similarly to what VMs do for JS small integers today, then the main problem you run into is object identity: does Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))? A naive JS implementation of an Int64 class would say that this is false, since it's likely to allocate a new Int64 each time. But an immediate representation would have no choice but to say true. You can work around this if you say that the VM's implementation of Int64 operations behaves as if the add()/sub()/whatever() methods used a singleton cache. You can still then have custom properties; i.e. you could do Int64(2).foo = 42 and then Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an immediate-int64-to-customproperties map on the side. That's kind of analogous to how you could put a setter on field '2' of Array.prototype and do some really hilarious things.

Anyway, the last three paragraphs are somewhat orthogonal to the discussion of typed arrays since for those it's just so darn easy to do expandability without overhead, but I did want to hit on two points:

  • Whether or not something is expandable has got absolutely nothing to do with how much memory it uses. Not one bit of memory is wasted from allowing "free if unused" expandability, and that is a natural outcome so long as objects obey object identity.

  • Don't restrict the language only to make VM hackers' lives easier. VM hackers' lives aren't supposed to be easy. The point of a language is to delight users, not VM hackers.

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 12:17 PM, Niko Matsakis <niko at alum.mit.edu> wrote:

I think Filip is right that given sufficient cleverness extensible properties for typed objects can be implemented efficiently. The real question is what the behavior SHOULD be. As others have pointed out, we are not forced to support extensible properties for web compat reasons.

I also think it is very important and useful to have typed objects be a generalization of typed arrays. I suspect nobody wants an "almost but not quite the same" set of array types. It'd be my preference that (eventually) the "specification" for typed arrays can just be "var Uint16Array = new ArrayType(uint16)", which I believe is currently plausible.

In light of this consideration, that means that adding extensible properties to typed arrays means adding extensible properties to all "typed objects that are arrays" (that is, instances of some type defined by new ArrayType()).

As Dmitry pointed out, extensible properties are only possible for "top-level" objects. I think this results in a surprising and non-composable spec.

The surprising behavior isn't limited to the copying example that Dmitry gave. Another problem is that instances of array types that are found embedded in other structures don't have the full capabilities of "top-level" instances. Without extensible properties, it is true that if I have a function that is given a typed object (of any kind, array or struct) and uses it, I can also provide it with an instance of that same type that is a part of a bigger structure.

For example:

function doSomething(anArray) {
    anArray[0] = anArray[1];
}

// Invoke doSomething with top-level array
var TwoUint8s = new ArrayType(uint8, 2);
doSomething(new TwoUint8s());

// Invoke doSomething with array that is
// embedded within a struct:
var MyStruct = StructType({a: TwoUint8s});
var instance = new MyStruct();
doSomething(instance.a);

But this no longer works if doSomething makes use of extensible properties:

function doSomething(anArray) {
    anArray[0] = anArray[1];
    anArray.foo = anArray.bar;
}

Now the second use case doesn't work.

To me, it seems a shame to trade a simple story ("typed objects let you define the layout and fields of an object, full stop") for a more complex, non-composable one ("...except for extra fields on arrays, which only work some of the time").

Hi Niko,

The reason why I'm OK with the more complex story is that we already have that story for '=='. To me, named object properties are analogous to being able to identify whether you have the same object or a different object: both are mechanisms that reveal aliasing to the user. Having typed objects that are embedded in other ones already breaks ==.

# Niko Matsakis (11 years ago)

On Wed, Sep 04, 2013 at 12:38:39PM -0700, Filip Pizlo wrote:

The reason why I'm OK with the more complex story is that we already have that story for '=='. To me, named object properties are analogous to being able to identify whether you have the same object or a different object: both are mechanisms that reveal aliasing to the user. Having typed objects that are embedded in other ones already breaks ==.

I'm afraid I don't quite follow you here. The point is not that extensible properties permit the user to observe aliasing: since arrays are mutable, aliasing is observable even without == or extensible properties.

Rather, I am saying that it seems desirable for all typed objects with a particular type to support the same set of operations: but this is not possible if we permit extensible properties on typed arrays, since there will always be a distinction between a "top-level" array (i.e., one that owns its own memory) and a derived array (one that aliases another object).

[Well, I suppose it would be possible to permit all array instances to have extensible properties, whether they are derived or not, but that seems surprising indeed. It would imply that if you did something like:

 var MyArray = new ArrayType(...);
 var MyStruct = new StructType({f: MyArray});
 var struct = new MyStruct(...);
 var array1 = struct.f;
 var array2 = struct.f;

then array1 and array2 would have disjoint sets of extensible properties.]

Niko

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 1:00 PM, Niko Matsakis <niko at alum.mit.edu> wrote:

On Wed, Sep 04, 2013 at 12:38:39PM -0700, Filip Pizlo wrote:

The reason why I'm OK with the more complex story is that we already have that story for '=='. To me, named object properties are analogous to being able to identify whether you have the same object or a different object: both are mechanisms that reveal aliasing to the user. Having typed objects that are embedded in other ones already breaks ==.

I'm afraid I don't quite follow you here. The point is not that extensible properties permit the user to observe aliasing: since arrays are mutable, aliasing is observable even without == or extensible properties.

Ah, sorry, I was unclear. My point is that given two typed arrays a and b, both == and custom properties allow you to tell the difference between a and b sharing the same backing data (the kind of aliasing you speak of) and actually being the same object.

== allows you to do this because either a == b evaluates true or it evaluates false. If you allocate a typed array 'a' and then store it into a binary data field and then load from that field later into a variable 'b', then a != b. Hence, you've observed that a and b don't point to the same object.

Likewise, custom named properties would also allow you to make the same observation. If you allocate a typed array 'a', then store a custom field into it ('a.foo = 42'), then store it into a binary data field and later load from it into 'b', then 'b.foo != 42'. Hence, again, you've observed that a and b don't point to the same object even though they are both wrappers for the same underlying array data.
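
The sequence above, sketched against the ArrayType/StructType API used elsewhere in this thread (the binary data proposal; the exact constructors and semantics are assumptions here, and the expando assignment assumes typed objects are extensible, which is the very point under debate):

var TwoUint8s = new ArrayType(uint8, 2);
var MyStruct = new StructType({f: TwoUint8s});
var struct = new MyStruct();

var a = new TwoUint8s();
a.foo = 42;        // assumes expandos are allowed on typed objects
struct.f = a;      // '=' copies the underlying bytes, not the object

var b = struct.f;  // extraction conses up a fresh view over the same bytes
a == b;            // false -- object identity was already lost here
b.foo;             // undefined -- the expando is lost for exactly the same reason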

I agree that both of these aspects of binary data are quirky. My observation is that prohibiting custom properties doesn't fix the underlying issue.

Rather, I am saying that it seems desirable for all typed objects with a particular type to support the same set of operations: but this is not possible if we permit extensible properties on typed arrays, since there will always be a distinction between a "top-level" array (i.e., one that owns its own memory) and a derived array (one that aliases another object).

Right but that distinction is already there for ==.

[Well, I suppose it would be possible to permit all array instances to have extensible properties, whether they are derived or not, but that seems surprising indeed. It would imply that if you did something like:

var MyArray = new ArrayType(...);
var MyStruct = new StructType({f: MyArray});
var struct = new MyStruct(...);
var array1 = struct.f;
var array2 = struct.f;

then array1 and array2 would have disjoint sets of extensible properties.]

Yes, they would. But even if they didn't, then array1 != array2, which is equally odd.

# Brendan Eich (11 years ago)

Filip Pizlo wrote:

I agree that both of these aspects of binary data are quirky. My observation is that prohibiting custom properties doesn't fix the underlying issue.

[snip]

var MyArray = new ArrayType(...);
var MyStruct = new StructType({f: MyArray});
var struct = new MyStruct(...);
var array1 = struct.f;
var array2 = struct.f;

then array1 and array2 would have disjoint sets of extensible properties.]

Yes, they would. But even if they didn't, then array1 != array2, which is equally odd.

Both add up to two, not one, so the counterargument is odd beats odder, and prohibiting expandos keeps the oddness down to == and nothing else.

I'm not trying to persuade you here, just trying to agree on how to do "oddness accounting". It could be that we're better off with the oddness you prefer, for human factors reasons of some kind.

But "lost expandos" due to loss of identity are an especially nasty kind of bug to find. Is there any use-case here? We've never had a bug report asking us to make SpiderMonkey's typed arrays extensible, AFAIK.

# Brendan Eich (11 years ago)

Filip Pizlo <mailto:fpizlo at apple.com> September 4, 2013 12:34 PM

My point is that having custom properties, or not, doesn't change the overhead for the existing typed array spec and hence has no effect on small arrays. The reasons for this include:

  • Typed arrays already have to be objects, and hence have a well-defined behavior on '=='.

  • Typed arrays already have to be able to tell you that they are in fact typed arrays, since JS doesn't have static typing.

  • Typed arrays already have prototypes, and those are observable regardless of expandability. A typed array from one global object will have a different prototype than a typed array from a different global object. Or am I misunderstanding the spec?

  • Typed arrays already have to know about their buffer.

  • Typed arrays already have to know about their offset into the buffer. Or, more likely, they have to have a second pointer that points directly at the base from which they are indexed.

  • Typed arrays already have to know their length.

You're not proposing changing these aspects of typed arrays, right?

Of course not, but for very small fixed length arrays whose .buffer is never accessed, an implementation might optimize harder. It's hard for me to say "no, Filip's analysis shows that's never worthwhile, for all time."

The super short message is this: so long as an object obeys object identity on '==' then you can have "free if unused, suboptimal if you use them" custom properties by using a weak map on the side. This is true of typed arrays and it would be true of any other object that does object-style ==. If you allocate such an object and never add a custom property then the weak map will never have an entry for it; but if you put custom properties in the object then the map will have things in it. But with typed arrays you can do even better as my previous message suggests: so long as an object has a seldom-touched field and you're willing to eat an extra indirection or an extra branch on that field, you can have "free if unused, still pretty good if you use them" custom properties by displacing that field. Typed arrays have both of these properties right now and so expandability is a free lunch.

The last sentence makes a "for-all" assertion I don't think implementations must be constrained by. Small fixed-length arrays whose .buffer is never accessed (which an implementation might be able to prove by type inference) could be optimized harder.

The lack of static types in JS does not mean exactly one implementation representation must serve for all instances of a given JS-level abstraction. We already have strings optimized variously in the top VMs, including Chords or Ropes, dependent strings, different character sets, etc.

Still find this discussion amusing? Here's the long story is: It is these things that I list above that lead to a 16 byte overhead on 32-bit, and a 32-byte overhead on 64-bit in the best "sane" case. Giving typed array objects expandability doesn't add to this overhead, because two of the fields necessary to implement the above (the type, and the buffer) can be displaced for pointing to property storage. Any imaginable attempt to reduce the overhead incurred by the information - using BBOP (big bag of pages) for the type, using an out-of-line weak map for the buffer or the type, encoding some of the bits inside the pointer to the typed array, etc. - can be also used to eradicate any space overhead you'd need for custom properties, so long as you're on board with the "free if unused, sub-optimal if you use them" philosophy.

For something like decimal, it matters whether there's an empty side table and large-N decimal instances of total size NS, vs. N(S+K) for some constant K we could eliminate by specializing harder. Even better if we agree that decimal instances should be non-extensible (and have value not reference semantics -- more below).

  • If the VM wants to go further and create immediate representations of some or all Int64's, similarly to what VMs do for JS small integers today, then the main problem you run into is object identity: does Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))? A naive JS implementation of an Int64 class would say that this is false, since it's likely to allocate a new Int64 each time. But an immediate representation would have no choice but to say true. You can work around this if you say that the VM's implementation of Int64 operations behaves /as if/ the add()/sub()/whatever() methods used a singleton cache. You can still then have custom properties; i.e. you could do Int64(2).foo = 42 and then Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an immediate-int64-to-customproperties map on the side. That's kind of analogous to how you could put a setter on field '2' of Array.prototype and do some really hilarious things.

The value objects proposal for ES7 is live, I'm championing it. It does not use (double-dispatch for dyadic) operators as methods. It does not use extensible objects.

strawman:value_objects, www.slideshare.net/BrendanEich/value-objects

Warning: both are slightly out of date, I'll be updating the strawman over the next week.

With value objects, TC39 has definitely favored something that I think you oppose, namely extending JS to have (more) objects with value not reference semantics, which requires non-extensibility.

If I have followed your messages correctly, this is because you think non-extensibility is a rare case that should not proliferate. But with ES5 Object.preventExtensions, etc., the horse is out of the barn.

At a deeper level, the primitives wired into the language, boolean number string -- in particular number when considering int64, bignum, etc. -- can be rationalized as value objects provided we make typeof work as people want (and work so as to uphold a == b && typeof a == typeof b <=> a === b).

This seems more winning in how it unifies concepts and empowers users to make more value objects, than the alternative of saying "the primitives are legacy, everything else has reference semantics" and turning a blind eye, or directing harsh and probably ineffective deprecating words, to Object.preventExtensions.

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 3:09 PM, Brendan Eich <brendan at mozilla.com> wrote:

Filip Pizlo <mailto:fpizlo at apple.com> September 4, 2013 12:34 PM

My point is that having custom properties, or not, doesn't change the overhead for the existing typed array spec and hence has no effect on small arrays. The reasons for this include:

  • Typed arrays already have to be objects, and hence have a well-defined behavior on '=='.

  • Typed arrays already have to be able to tell you that they are in fact typed arrays, since JS doesn't have static typing.

  • Typed arrays already have prototypes, and those are observable regardless of expandability. A typed array from one global object will have a different prototype than a typed array from a different global object. Or am I misunderstanding the spec?

  • Typed arrays already have to know about their buffer.

  • Typed arrays already have to know about their offset into the buffer. Or, more likely, they have to have a second pointer that points directly at the base from which they are indexed.

  • Typed arrays already have to know their length.

You're not proposing changing these aspects of typed arrays, right?

Of course not, but for very small fixed length arrays whose .buffer is never accessed, an implementation might optimize harder.

As I said, of course you can do this, and one way you could "try harder" is to put the buffer pointer in a side table. The side table maps array object pointers to their buffers, and you only make an entry in this table if .buffer is mentioned.

But if we believe that this is a sensible thing for a VM to do - and of course it is! - then the same thing can be done for the custom property storage pointer.

It's hard for me to say "no, Filip's analysis shows that's never worthwhile, for all time."

The super short message is this: so long as an object obeys object identity on '==' then you can have "free if unused, suboptimal if you use them" custom properties by using a weak map on the side. This is true of typed arrays and it would be true of any other object that does object-style ==. If you allocate such an object and never add a custom property then the weak map will never have an entry for it; but if you put custom properties in the object then the map will have things in it. But with typed arrays you can do even better as my previous message suggests: so long as an object has a seldom-touched field and you're willing to eat an extra indirection or an extra branch on that field, you can have "free if unused, still pretty good if you use them" custom properties by displacing that field. Typed arrays have both of these properties right now and so expandability is a free lunch.

The last sentence makes a "for-all" assertion I don't think implementations must be constrained by.

How so? It is true that some VM implementations will be better than others. But ultimately every VM can implement every optimization that every other VM has; in fact my impression is that this is exactly what is happening as we speak.

So, it doesn't make much sense to make language design decisions because it might make some implementor's life easier right now. If you could argue that something will never be efficient if we add feature X, then that might be an interesting argument. But as soon as we identify one sensible optimization strategy for making something free, I would tend to think that this is sufficient to conclude that the feature is free and there is no need to constrain it. If we don't do this then we risk adding cargo-cult performance features that rapidly become obsolete.

Small fixed-length arrays whose .buffer is never accessed (which an implementation might be able to prove by type inference) could be optimized harder.

And my point is that if you do so, then the same technique can be trivially applied to the custom property storage pointer.

The lack of static types in JS does not mean exactly one implementation representation must serve for all instances of a given JS-level abstraction. We already have strings optimized variously in the top VMs, including Chords or Ropes, dependent strings, different character sets, etc.

Still find this discussion amusing? Here's the long story is: It is these things that I list above that lead to a 16 byte overhead on 32-bit, and a 32-byte overhead on 64-bit in the best "sane" case. Giving typed array objects expandability doesn't add to this overhead, because two of the fields necessary to implement the above (the type, and the buffer) can be displaced for pointing to property storage. Any imaginable attempt to reduce the overhead incurred by the information - using BBOP (big bag of pages) for the type, using an out-of-line weak map for the buffer or the type, encoding some of the bits inside the pointer to the typed array, etc. - can be also used to eradicate any space overhead you'd need for custom properties, so long as you're on board with the "free if unused, sub-optimal if you use them" philosophy.

For something like decimal, it matters whether there's an empty side table and large-N decimal instances of total size NS, vs. N(S+K) for some constant K we could eliminate by specializing harder. Even better if we agree that decimal instances should be non-extensible (and have value not reference semantics -- more below).

With a side table, the constant K = 0 even if you have custom properties. The table will only have an entry for those instances that had custom properties.

  • If the VM wants to go further and create immediate representations of some or all Int64's, similarly to what VMs do for JS small integers today, then the main problem you run into is object identity: does Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))? A naive JS implementation of an Int64 class would say that this is false, since it's likely to allocate a new Int64 each time. But an immediate representation would have no choice but to say true. You can work around this if you say that the VM's implementation of Int64 operations behaves /as if/ the add()/sub()/whatever() methods used a singleton cache. You can still then have custom properties; i.e. you could do Int64(2).foo = 42 and then Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an immediate-int64-to-customproperties map on the side. That's kind of analogous to how you could put a setter on field '2' of Array.prototype and do some really hilarious things.

The value objects proposal for ES7 is live, I'm championing it. It does not use (double-dispatch for dyadic) operators as methods. It does not use extensible objects.

strawman:value_objects, www.slideshare.net/BrendanEich/value-objects

Warning: both are slightly out of date, I'll be updating the strawman over the next week.

Thanks for the links! To clarify, I'm not trying to make a counterproposal - the above was nothing more than a fun thought experiment and I shared it to motivate why I think that custom properties are free.

My understanding is that you are still arguing that custom properties are not free, and that they incur some tangible cost in terms of space and/or time. I'm just trying to show you why they don't if you do the same optimizations for them that have become acceptable for a lot of other JS corners. Unless you think that ES should have an "ease of implementation" bar for features. I wouldn't necessarily mind that, but my impression is that this is not the case.

With value objects, TC39 has definitely favored something that I think you oppose, namely extending JS to have (more) objects with value not reference semantics, which requires non-extensibility.

Indeed.

If I have followed your messages correctly, this is because you think non-extensibility is a rare case that should not proliferate.

I have two points here:

  • Typed arrays already have so much observable objectyness that making them non-extensible feels arbitrary; this is true regardless of the prevalence, or lack thereof, of non-extensibility.

  • At the same time, I do think that non-extensibility is a rare case and I don't like it.

But with ES5 Object.preventExtensions, etc., the horse is out of the barn.

It's there and we have to support it, and the fact that you can do preventExtensions() to an object is a good thing. That doesn't mean it should become the cornerstone for every new feature. If a user wants to preventExtensions() on their object, then that's totally cool - and I'm not arguing that it isn't.

The argument I'm making is a different one: should an object be non-expandable by default?

I keep hearing arguments that this somehow makes typed arrays more efficient. That's like arguing that there exists a C compiler, somewhere, that becomes more efficient if you label your variables as 'register'. It's true that if you're missing the well-known optimization of register allocation then yes, 'register' is an optimization. Likewise, if you're missing the well-known object model optimizations like pointer displacement, BBOP's, or other kinds of side tables, then forcing objects to be non-extensible is also an optimization. That doesn't mean that we should bake it into the language. VM hackers can just implement these well-known optimizations and just deal with it.

At a deeper level, the primitives wired into the language, boolean number string -- in particular number when considering int64, bignum, etc. -- can be rationalized as value objects provided we make typeof work as people want (and work so as to uphold a == b && typeof a == typeof b <=> a === b).

I think making int64/bignum be primitives is fine. My only point is that whether or not you make them expandable has got nothing to do with how much memory they use.

This seems more winning in how it unifies concepts and empowers users to make more value objects, than the alternative of saying "the primitives are legacy, everything else has reference semantics" and turning a blind eye, or directing harsh and probably ineffective deprecating words, to Object.preventExtensions.

Well this is all subjective. Objects being expandable by default is a unifying concept. The only thing that expandability of typed arrays appears to change is the interaction with binary data - but that isn't exactly a value object system as much as it is a playing-with-bits system. I'm not sure that having oddities there changes much.

# Oliver Hunt (11 years ago)

On Sep 4, 2013, at 4:15 PM, Filip Pizlo <fpizlo at apple.com> wrote:

On Sep 4, 2013, at 3:09 PM, Brendan Eich <brendan at mozilla.com> wrote:

<snip so much text :D>

But with ES5 Object.preventExtensions, etc., the horse is out of the barn.

It's there and we have to support it, and the fact that you can do preventExtensions() to an object is a good thing. That doesn't mean it should become the cornerstone for every new feature. If a user wants to preventExtensions() on their object, then that's totally cool - and I'm not arguing that it isn't.

The argument I'm making is a different one: should an object be non-expandable by default?

Actually, here's a very good example: Why do Maps and Sets allow expandos?

  • They are logically buckets, so expando properties seem unnecessary
  • We have seen in the past that people get really confused about property access -- see the innumerable "associative array" articles that do "new Array()" to get their magical associative array. For (probably) common cases of string and numeric properties:
    • someMap["foo"]="bar" and someMap["foo"]; vs.
    • someMap.set("foo", "bar") and someMap.get("foo")

are sufficiently close to "the same" that developers will do this, and think that they're using a Map.
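
A minimal sketch of the confusion being described (illustrative only): the bracket form creates an expando on the Map object rather than a map entry, so the data silently misses the collection.

var someMap = new Map();

someMap["foo"] = "bar";    // creates an expando on the Map object, not an entry
someMap.get("foo");        // undefined
someMap.size;              // 0

someMap.set("foo", "bar"); // the actual Map API
someMap.get("foo");        // "bar"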

So should Map be inextensible by default? The argument against supporting expandos on a typed array seems even stronger for these collection types.

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 2:41 PM, Brendan Eich <brendan at mozilla.com> wrote:

Filip Pizlo wrote:

I agree that both of these aspects of binary data are quirky. My observation is that prohibiting custom properties doesn't fix the underlying issue.

[snip]

var MyArray = new ArrayType(...);
var MyStruct = new StructType({f: MyArray});
var struct = new MyStruct(...);
var array1 = struct.f;
var array2 = struct.f;

then array1 and array2 would have disjoint sets of extensible properties.]

Yes, they would. But even if they didn't, then array1 != array2, which is equally odd.

Both add up to two, not one, so the counterargument is odd beats odder, and prohibiting expandos keeps the oddness down to == and nothing else.

What about being consistently odd? To me, 'struct.f' means allocating a new [sic] array buffer view object. That new object thus has all of the features you'd expect from a new object: it won't have custom properties, it will have some default prototype, and it will not be == to any other object. Hence if you say "struct.f.foo = 42", then "struct.f.foo" will subsequently return undefined. No big deal - it was a new object.

I'm not trying to persuade you here, just trying to agree on how to do "oddness accounting". It could be that we're better off with the oddness you prefer, for human factors reasons of some kind.

I actually think that this simply isn't going to matter. Binary data is there for hacking with bits. Whether a struct.f, which is defined by the user to be an array, is expandable or not isn't going to be a big deal to most people.

On the other hand, empowering users to be able to carry around typed arrays with some extra meta-data could be useful to people.

But "lost expandos" due to loss of identity are an especially nasty kind of bug to find.

I'm actually curious - are you aware of such bugs, and what do they actually look like? To me this is analogous to the question of whether an API returns to you the same object you passed in earlier, or a new object that is a copy - and my vague recollection of the various APIs and SDKs that I've used over the years is that whenever I see such issues, I make a note of them but never find myself having to think very hard about them. And they rarely lead to interesting bugs.

Is there any use-case here? We've never had a bug report asking us to make SpiderMonkey's typed arrays extensible, AFAIK.

I was the one who brought up the use case. ;-) Say I want a matrix. I like saying:

function makeMatrix(rows, cols) {
    var result = new Float32Array(rows * cols);
    result.rows = rows;
    result.cols = cols;
    return result;
}

I realize this is goofy - I could have created a wrapper object around the Float32Array. But that requires more code, and I've come to enjoy doing this kind of goofiness in scripting languages.
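
For illustration, the pattern above in use, assuming an engine that allows expandos on typed arrays (the behavior under discussion): the metadata rides along with the array itself.

var m = makeMatrix(3, 4);
m[2 * m.cols + 1] = 1.5;   // element access works as usual
m.rows;                    // 3
m.cols;                    // 4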

# Brendan Eich (11 years ago)

Filip Pizlo wrote:

Typed arrays have both of these properties right now and so expandability is a free lunch.

The last sentence makes a "for-all" assertion I don't think implementations must be constrained by.

How so? It is true that some VM implementations will be better than others. But ultimately every VM can implement every optimization that every other VM has; in fact my impression is that this is exactly what is happening as we speak.

My "for-all" referred to all typed arrays across all VMs, not just all VMs.

Also just as a point of fact (something "done", the Latin root means "deed"), I do not see the same optimizations being used in all VMs. For example, SpiderMonkey's TI (written up here: http://rfrn.org/~shu/drafts/ti.pdf [PLDI 2012]) is not being used elsewhere AFAIK -- please correct me if I'm mistaken.

So, it doesn't make much sense to make language design decisions because it might make some implementor's life easier right now. If you could argue that something will /never/ be efficient if we add feature X, then that might be an interesting argument. But as soon as we identify one sensible optimization strategy for making something free, I would tend to think that this is sufficient to conclude that the feature is free and there is no need to constrain it. If we don't do this then we risk adding cargo-cult performance features that rapidly become obsolete.

I agree that's a risk. I'm also with Niko in wanting to argue about what the semantics should be without appealing to performance arguments.

However, I still think you are verging on promising a free lunch. All methods in C++ cannot affordably be virtual. Expandos in JS cost. At fine enough grain, even pretty-well-predicted branches cost. Nothing is free-enough to discount forever in my bitter and long experience :-P.

The lack of static types in JS does not mean exactly one implementation representation must serve for all instances of a given JS-level abstraction. We already have strings optimized variously in the top VMs, including Chords or Ropes, dependent strings, different character sets, etc.

Still find this discussion amusing? Here's the long story is: It is these things that I list above that lead to a 16 byte overhead on 32-bit, and a 32-byte overhead on 64-bit in the best "sane" case. Giving typed array objects expandability doesn't add to this overhead, because two of the fields necessary to implement the above (the type, and the buffer) can be displaced for pointing to property storage. Any imaginable attempt to reduce the overhead incurred by the information - using BBOP (big bag of pages) for the type, using an out-of-line weak map for the buffer or the type, encoding some of the bits inside the pointer to the typed array, etc. - can be also used to eradicate any space overhead you'd need for custom properties, so long as you're on board with the "free if unused, sub-optimal if you use them" philosophy.

For something like decimal, it matters whether there's an empty side table and large-N decimal instances of total size NS, vs. N(S+K) for some constant K we could eliminate by specializing harder. Even better if we agree that decimal instances should be non-extensible (and have value not reference semantics -- more below).

With a side table, the constant K = 0 even if you have custom properties. The table will only have an entry for those instances that had custom properties.

I know, that's why I was attacking the non-side-table approach.

But the side table has its own down-side trade-offs: GC complexity, even costlier indirection, and strictly greater implementation complexity. If one could implement without having to mess with this K ?= 0 design decision and hassle with packing or else using a side-table, one's VM would be smaller, simpler, less buggy -- all else equal.

Now you may say that I'm betraying my hero Mr. Spock, whom I have invoked to argue that implementors should sacrifice so the mass of JS users can live long and prosper.

And you'd have me dead to rights -- if I thought JS users wanted expandos on binary data, that the lack of expandos there was a problem akin to the whole starship being blown up. But I do not believe that's the case.

If users don't care, then implementors should get a break and VMs should be simpler, ceteris paribus.

  • If the VM wants to go further and create immediate representations of some or all Int64's, similarly to what VMs do for JS small integers today, then the main problem you run into is object identity: does Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))? A naive JS implementation of an Int64 class would say that this is false, since it's likely to allocate a new Int64 each time. But an immediate representation would have no choice but to say true. You can work around this if you say that the VM's implementation of Int64 operations behaves /as if/ the add()/sub()/whatever() methods used a singleton cache. You can still then have custom properties; i.e. you could do Int64(2).foo = 42 and then Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an immediate-int64-to-customproperties map on the side. That's kind of analogous to how you could put a setter on field '2' of Array.prototype and do some really hilarious things.

The value objects proposal for ES7 is live, I'm championing it. It does not use (double-dispatch for dyadic) operators as methods. It does not use extensible objects.

strawman:value_objects, www.slideshare.net/BrendanEich/value-objects

Warning: both are slightly out of date, I'll be updating the strawman over the next week.

Thanks for the links! To clarify, I'm not trying to make a counterproposal - the above was nothing more than a fun thought experiment and I shared it to motivate why I think that custom properties are free.

My understanding is that you are still arguing that custom properties are not free, and that they incur some tangible cost in terms of space and/or time. I'm just trying to show you why they don't if you do the same optimizations for them that have become acceptable for a lot of other JS corners. Unless you think that ES should have an "ease of implementation" bar for features. I wouldn't necessarily mind that, but my impression is that this is not the case.

I do think implementor ease, or really implementation simplicity, should be a concern. It's secondary, per Spock's Kobayashi Maru solution, to the needs of the many JS users. But it's not nothing. Part of the impetus for Dart, I'm told, is the complexity of V8 required by JS-as-it-is. Whatever the case there, standardized JS extensions should not add too much complexity if we can help it.

I'll lay off performance concerns but you'll still see me, like Ahab lashed to the white whale, beckoning against free lunch arguments or anything near them :-P.

With value objects, TC39 has definitely favored something that I think you oppose, namely extending JS to have (more) objects with value not reference semantics, which requires non-extensibility.

Indeed.

If I have followed your messages correctly, this is because you think non-extensibility is a rare case that should not proliferate.

I have two points here:

  • Typed arrays already have so much observable objectyness that making them non-extensible feels arbitrary; this is true regardless of the prevalence, or lack thereof, of non-extensibility.

Ok, I acknowledge this point.

And yet SpiderMonkey had native typed arrays from the get-go, non-extensible -- we didn't use WebIDL. So the interoperable intersection semantics developers can count on does not include extensibility. As Mark says, this allows us to standardize either way, so we need arguments that don't appeal to "feelings".

  • At the same time, I do think that non-extensibility is a rare case and I don't like it.

I can tell ;-). Feelings are important but to decide on a spec we will need stronger reasons.

But with ES5 Object.preventExtensions, etc., the horse is out of the barn.

It's there and we have to support it, and the fact that you can do preventExtensions() to an object is a good thing. That doesn't mean it should become the cornerstone for every new feature. If a user wants to preventExtensions() on their object, then that's totally cool - and I'm not arguing that it isn't.

The argument I'm making is a different one: should an object be non-expandable by default?

I keep hearing arguments that this somehow makes typed arrays more efficient. That's like arguing that there exists a C compiler, somewhere, that becomes more efficient if you label your variables as 'register'.

I remember when that indeed mattered.

It's true that if you're missing the well-known optimization of register allocation then yes, 'register' is an optimization. Likewise, if you're missing the well-known object model optimizations like pointer displacement, BBOP's, or other kinds of side tables, then forcing objects to be non-extensible is also an optimization. That doesn't mean that we should bake it into the language. VM hackers can just implement these well-known optimizations and just deal with it.

Ok, let's let the performance argument rest. You can be Ishmael and live. I'm Ahab and I still stab at such nearly-free-lunch, "sufficiently smart compiler" claims :-).

At a deeper level, the primitives wired into the language, boolean number string -- in particular number when considering int64, bignum, etc. -- can be rationalized as value objects provided we make typeof work as people want (and work so as to uphold a == b && typeof a == typeof b <=> a === b).

I think making int64/bignum be primitives is fine. My only point is that whether or not you make them expandable has got nothing to do with how much memory they use.

This seems more winning in how it unifies concepts and empowers users to make more value objects, than the alternative of saying "the primitives are legacy, everything else has reference semantics" and turning a blind eye, or directing harsh and probably ineffective deprecating words, to Object.preventExtensions.

Well this is all subjective. Objects being expandable by default is a unifying concept.

It does not unify number, boolean, string.

What's not subjective is that we have two concepts in JS today, one (ignoring null and undefined) for primitive AKA value types, the other for reference types (objects). I see a way to extend object as a concept to subsume value types, although of course unity comes at the price of complexity for object. But non-extensibility is a piece of complexity already added to object as a concept by ES5.

Irreducible complexity here, and perhaps "subjective" or (I prefer) "aesthetic" judgment is the only way to pick.

The only thing that expandability of typed arrays appears to change is the interaction with binary data - but that isn't exactly a value object system as much as it is a playing-with-bits system. I'm not sure that having oddities there changes much.

Sure, let's get back to binary data (I brought up value objects because you brought up int64).

Interior binary data objects will be cons'ed up upon extraction, so distinguishable by == returning false and by lack of expando preservation. Niko, Dmitry, and others take this as a sign that expandos should not be allowed, leaving only == returning false among same-named extractions as an oddity. And they further conclude that expandos should not be allowed on any binary data object (whether interior extracted, or not).

You argue on the contrary that JS objects in general can be extended with expandos, so why restrict binary data objects, even interior ones that are extracted? Let each such extracted interior object be != with all other same-named extractions, and let each have expandos assigned that (vacuously) won't be preserved on next extraction.

I hope I have stated positions accurately. If so I'll tag out of the ring, in hopes of someone else bringing new arguments to bear.

# Brendan Eich (11 years ago)

Filip Pizlo <mailto:fpizlo at apple.com> September 4, 2013 4:45 PM

What about being consistently odd? To me, 'struct.f' means allocating a new [sic] array buffer view object. That new object thus has all of the features you'd expect from a new object: it won't have custom properties, it will have some default prototype, and it will not be == to any other object. Hence if you say "struct.f.foo = 42", then "struct.f.foo" will subsequently return undefined. No big deal - it was a new object.
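
A minimal sketch of the behavior being described, assuming the StructType and uint32 constructors from the binary data strawman (hypothetical API, not something that ships today):

```js
// Hypothetical binary data strawman API: StructType and uint32 are assumptions.
var Point = new StructType({ x: uint32, y: uint32 });
var Line  = new StructType({ from: Point, to: Point });

var line = new Line({ from: { x: 0, y: 0 }, to: { x: 3, y: 4 } });

// Each read of line.from extracts a *fresh* interior view object:
line.from === line.from;   // false -- two distinct wrapper objects
line.from.x;               // 0 -- both wrappers alias the same storage

// So an expando assigned to one wrapper is gone on the next extraction:
line.from.foo = 42;        // lands on a throwaway wrapper (if allowed at all)
line.from.foo;             // undefined -- this read produced a new wrapper
```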

In WebIDL, IIRC, this is considered bad style. You are supposed to use a method, not an IDL attribute (getter), when returning a fresh object each time. Anne can vouch, cc'ing him.

I'm not saying we must match WebIDL good style -- doing so would mean binary data does not support interior extractions via struct.foo.bar -- rather something awful such as struct.foo().bar, which won't fly. But it seems worth considering that the "odd" or "bad style" design decision is upstream of our petty difference over expandos.

But "lost expandos" due to loss of identity are an especially nasty kind of bug to find.

I'm actually curious - are you aware of such bugs, and what do they actually look like?

Search for "disappearing expando" on the web.

# Niko Matsakis (11 years ago)

On Wed, Sep 04, 2013 at 02:41:24PM -0700, Brendan Eich wrote:

Both adds up to two, not one, so the counterargument is odd beats odder, and prohibiting expandos keeps the oddness down to == and nothing else.

I just want to agree with this. I think the behavior of == is unfortunate, but much more of a corner case than expando properties disappearing and reappearing willy nilly as users switch between otherwise identical views on the same data.

I feel like == can be explained by the fact that these are distinct objects pointing at the same underlying buffer. People get aliasing. But the fact that expando properties would live on the wrapper and not on the underlying data is quite surprising and counterintuitive -- some of the data (the elements) are aliased, but others are not.

Niko

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 5:25 PM, Brendan Eich <brendan at mozilla.com> wrote:

Filip Pizlo wrote:

Typed arrays have both of these properties right now and so expandability is a free lunch.

The last sentence makes a "for-all" assertion I don't think implementations must be constrained by.

How so? It is true that some VM implementations will be better than others. But ultimately every VM can implement every optimization that every other VM has; in fact my impression is that this is exactly what is happening as we speak.

My "for-all" referred to all typed arrays across all VMs, not just all VMs.

Also just as a point of fact (something "done", the Latin root means "deed"), I do not see the same optimizations being used in all VMs. For example, SpiderMonkey's TI (written up here: http://rfrn.org/~shu/drafts/ti.pdf [PLDI 2012]) is not being used elsewhere AFAIK -- please correct me if I'm mistaken.

Interesting point. Equivalent optimizations are being done. Other VMs also infer types one way or another. And I'd argue that my way of inferring types is the best - it incurs smaller overheads for start-up while achieving more precise results. (Of course I must say that - I stand by my stuff, heh.) That being said, I do think that FF's TI is really cool and loved reading that paper.

It's kind of like in JVMs, all of the big-league ones did speculative inlining - but they do it in radically different ways and rely on different kinds of feedback and if you go to a conference where JVM hackers show up, they will argue about which is best. I have fond memories of Sun vs. IBM vs. Oracle shouting matches about how you do deoptimization, whether you do deoptimization at all, and what you need to analyze and prove things about the class hierarchy. That doesn't change the basics: they all do speculative inlining and it performs sort of the same in the end.

I suspect that the same thing is becoming true of typed arrays, regardless of whether they are extensible or not. I guess that when I said "every optimization that every other VM has" I didn't mean literally using the same exact algorithm - just performing optimizations that achieve equivalent results.

So, it doesn't make much sense to make language design decisions because it might make some implementor's life easier right now. If you could argue that something will /never/ be efficient if we add feature X, then that might be an interesting argument. But as soon as we identify one sensible optimization strategy for making something free, I would tend to think that this is sufficient to conclude that the feature is free and there is no need to constrain it. If we don't do this then we risk adding cargo-cult performance features that rapidly become obsolete.

I agree that's a risk. I'm also with Niko in wanting to argue about what the semantics should be without appealing to performance arguments.

Right! I guess my first order argument is that performance isn't an argument in favor of non-expandability.

However, I still think you are verging on promising a free lunch. All methods in C++ cannot affordably be virtual. Expandos in JS cost. At fine enough grain, even pretty-well-predicted branches cost. Nothing is free-enough to discount forever in my bitter and long experience :-P.

I am promising a free lunch! Virtual methods in C++ are only expensive because C++ still doesn't have feedback-driven optimization. JVMs make them free in Java. And they are free. Period. There is no upside to marking a method final in Java. I am arguing that expandos are similar.

The lack of static types in JS does not mean exactly one implementation representation must serve for all instances of a given JS-level abstraction. We already have strings optimized variously in the top VMs, including Chords or Ropes, dependent strings, different character sets, etc.

Still find this discussion amusing? Here's the long story: It is these things that I list above that lead to a 16-byte overhead on 32-bit, and a 32-byte overhead on 64-bit in the best "sane" case. Giving typed array objects expandability doesn't add to this overhead, because two of the fields necessary to implement the above (the type, and the buffer) can be displaced for pointing to property storage. Any imaginable attempt to reduce the overhead incurred by the information - using BBOP (big bag of pages) for the type, using an out-of-line weak map for the buffer or the type, encoding some of the bits inside the pointer to the typed array, etc. - can also be used to eradicate any space overhead you'd need for custom properties, so long as you're on board with the "free if unused, sub-optimal if you use them" philosophy.

For something like decimal, it matters whether there's an empty side table and large-N decimal instances of total size NS, vs. N(S+K) for some constant K we could eliminate by specializing harder. Even better if we agree that decimal instances should be non-extensible (and have value not reference semantics -- more below).

With a side table, the constant K = 0 even if you have custom properties. The table will only have an entry for those instances that had custom properties.

I know, that's why I was attacking the non-side-table approach.

But the side table has its own down-side trade-offs: GC complexity, even costlier indirection, and strictly greater implementation complexity. If one could implement without having to mess with this K ?= 0 design decision and hassle with packing or else using a side-table, one's VM would be smaller, simpler, less buggy -- all else equal.

Meh, I'm just reusing the GC complexity that the DOM already introduces.

Now you may say that I'm betraying my hero Mr. Spock, whom I have invoked to argue that implementors should sacrifice so the mass of JS users can live long and prosper.

Yes, you are. ;-)

And you'd have me dead to rights -- if I thought JS users wanted expandos on binary data, that the lack of expandos there was a problem akin to the whole starship being blown up. But I do not believe that's the case.

If users don't care, then implementors should get a break and VMs should be simpler, ceteris paribus.

Fair enough.

  • If the VM wants to go further and create immediate representations of some or all Int64's, similarly to what VMs do for JS small integers today, then the main problem you run into is object identity: does Int64(1).add(Int64(1)) == Int64(1).add(Int64(1))? A naive JS implementation of an Int64 class would say that this is false, since it's likely to allocate a new Int64 each time. But an immediate representation would have no choice but to say true. You can work around this if you say that the VM's implementation of Int64 operations behaves /as if/ the add()/sub()/whatever() methods used a singleton cache. You can still then have custom properties; i.e. you could do Int64(2).foo = 42 and then Int64(1).add(Int64(1)).foo will return 42, since the VM can keep an immediate-int64-to-customproperties map on the side. That's kind of analogous to how you could put a setter on field '2' of Array.prototype and do some really hilarious things.
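
A toy, plain-JS rendering of that "map on the side" idea (Int64 is hypothetical here; a string key stands in for the immediate 64-bit value):

```js
// Side table: immediate value -> own-property storage, paid for only when used.
var int64Expandos = new Map();

function setExpando(int64Value, name, value) {
  var key = String(int64Value);        // all Int64s with the same bits share one entry
  var props = int64Expandos.get(key);
  if (!props) {
    props = Object.create(null);
    int64Expandos.set(key, props);     // space cost only for instances with expandos
  }
  props[name] = value;
}

function getExpando(int64Value, name) {
  var props = int64Expandos.get(String(int64Value));
  return props ? props[name] : undefined;
}

// With this scheme, Int64(2).foo = 42 followed by Int64(1).add(Int64(1)).foo
// would observably yield 42, because both "instances" map to the same entry.
```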

The value objects proposal for ES7 is live, I'm championing it. It does not use (double-dispatch for dyadic) operators as methods. It does not use extensible objects.

strawman:value_objects, www.slideshare.net/BrendanEich/value-objects

Warning: both are slightly out of date, I'll be updating the strawman over the next week.

Thanks for the links! To clarify, I'm not trying to make a counterproposal - the above was nothing more than a fun thought experiment and I shared it to motivate why I think that custom properties are free.

My understanding is that you are still arguing that custom properties are not free, and that they incur some tangible cost in terms of space and/or time. I'm just trying to show you why they don't if you do the same optimizations for them that have become acceptable for a lot of other JS corners. Unless you think that ES should have an "ease of implementation" bar for features. I wouldn't necessarily mind that, but my impression is that this is not the case.

I do think implementor ease, or really implementation simplicity, should be a concern. It's secondary, per Spock's Kobayashi Maru solution, to the needs of the many JS users. But it's not nothing. Part of the impetus for Dart, I'm told, is the complexity of V8 required by JS-as-it-is. Whatever the case there, standardized JS extensions should not add too much complexity if we can help it.

I'll lay off performance concerns but you'll still see me, like Ahab lashed to the white whale, beckoning against free lunch arguments or anything near them :-P.

My job is to give people a free lunch in the performance department. So I live by free lunch arguments.

With value objects, TC39 has definitely favored something that I think you oppose, namely extending JS to have (more) objects with value not reference semantics, which requires non-extensibility.

Indeed.

If I have followed your messages correctly, this is because you think non-extensibility is a rare case that should not proliferate.

I have two points here:

  • Typed arrays already have so much observable objectyness that making them non-extensible feels arbitrary; this is true regardless of the prevalence, or lack thereof, of non-extensibility.

Ok, I acknowledge this point.

And yet SpiderMonkey had native typed arrays from the get-go, non-extensible -- we didn't use WebIDL. So the interoperable intersection semantics developers can count on does not include extensibility. As Mark says, this allows us to standardize either way, so we need arguments that don't appeal to "feelings".

This is a good point.

  • At the same time, I do think that non-extensibility is a rare case and I don't like it.

I can tell ;-). Feelings are important but to decide on a spec we will need stronger reasons.

I agree. I'm assuming that in the grand scheme of things, specs improve when people articulate gut feelings and we reach something in the middle.

But with ES5 Object.preventExtensions, etc., the horse is out of the barn.

It's there and we have to support it, and the fact that you can do preventExtensions() to an object is a good thing. That doesn't mean it should become the cornerstone for every new feature. If a user wants to preventExtensions() on their object, then that's totally cool - and I'm not arguing that it isn't.

The argument I'm making is a different one: should an object be non-expandable by default?

I keep hearing arguments that this somehow makes typed arrays more efficient. That's like arguing that there exists a C compiler, somewhere, that becomes more efficient if you label your variables as 'register'.

I remember when that indeed mattered.

It's true that if you're missing the well-known optimization of register allocation then yes, 'register' is an optimization. Likewise, if you're missing the well-known object model optimizations like pointer displacement, BBOP's, or other kinds of side tables, then forcing objects to be non-extensible is also an optimization. That doesn't mean that we should bake it into the language. VM hackers can just implement these well-known optimizations and just deal with it.

Ok, let's let the performance argument rest. You can be Ishmael and live. I'm Ahab and I still stab at such nearly-free-lunch, "sufficiently smart compiler" claims :-).

At a deeper level, the primitives wired into the language, boolean number string -- in particular number when considering int64, bignum, etc. -- can be rationalized as value objects provided we make typeof work as people want (and work so as to uphold a == b && typeof a == typeof b <=> a === b).

I think making int64/bignum be primitives is fine. My only point is that whether or not you make them expandable has got nothing to do with how much memory they use.

This seems more winning in how it unifies concepts and empowers users to make more value objects, than the alternative of saying "the primitives are legacy, everything else has reference semantics" and turning a blind eye, or directing harsh and probably ineffective deprecating words, to Object.preventExtensions.

Well this is all subjective. Objects being expandable by default is a unifying concept.

It does not unify number, boolean, string.

True.

What's not subjective is that we have two concepts in JS today, one (ignoring null and undefined) for primitive AKA value types, the other for reference types (objects). I see a way to extend object as a concept to subsume value types, although of course unity comes at the price of complexity for object. But non-extensibility is a piece of complexity already added to object as a concept by ES5.

Irreducible complexity here, and perhaps "subjective" or (I prefer) "aesthetic" judgment is the only way to pick.

Is it clear that we can't have a better story for value types? I just don't think that non-extensibility is sufficient.

OK, so let's back up. Do you believe that making an object non-extensible is sufficient to make it a "value type"? I don't. You probably need some other stuff.
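
For what it's worth, plain ES5 already shows that non-extensibility alone doesn't give value semantics -- the object keeps its identity and its aliased mutability:

```js
var a = Object.preventExtensions({ x: 1 });
var b = Object.preventExtensions({ x: 1 });

a === b;      // false -- same "value", different identities
a.foo = 42;   // silently ignored (TypeError in strict mode): no expandos
a.x = 99;     // existing properties are still writable
var c = a;
c.x;          // 99 -- mutation is visible through every alias
```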

This is where I return to the objectyness point: typed arrays are already spec'd to have a bunch of heavy reference-to-object behavior. So making them expandable is no big deal. And making them non-expandable means that we'll now live in a weirdo world where we have four different concepts of what it means to be a value:

A) Full blown reference objects that you can do weird things to, like add properties and change proto, etc. You can also make one non-extensible at your discretion, which fits into the bat-poop crazy "you can do anything" philosophy of full blown objects. And that's great - that's the heart of the language, and I happen to enjoy it.

B) Object types that are always non-extensible but otherwise still objecty - they have a prototype that is observable, they reveal their identity via ==, and you can actually inject stuff into them by modifying the appropriate Object.prototype.

C) Values with whatever value type semantics we come up with in the future.

D) Primitives.

Now, I hope that we could get C and D to be as close as possible to each other. But that still leaves three different behaviors. This introduces a learning curve. That's why (B) offends me. It's subtly different from (A) and clearly different from either (C) or (D).

Now, we actually also have a totally alternate behavior, used by binary data. And my argument there is that I wouldn't get too offended by binary data acting weird, because the very notion of exposing binary data is weird to begin with. I expect it to be used only for special graphicsy stuff and not for general-purpose "value types" for normal JS programs. So it's OK to me if binary data is both weird and inconsistent with everything else. And no, I still don't view "typed arrays" as being part of binary data - it already appears to be the case that typed arrays have different buffer behavior to the struct types. So they're just different. And that's fine.

The only thing that expandability of typed arrays appears to change is the interaction with binary data - but that isn't exactly a value object system as much as it is a playing-with-bits system. I'm not sure that having oddities there changes much.

Sure, let's get back to binary data (I brought up value objects because you brought up int64).

Interior binary data objects will be cons'ed up upon extraction, so distinguishable by == returning false and by lack of expando preservation. Niko, Dmitry, and others take this as a sign that expandos should not be allowed, leaving only == returning false among same-named extractions as an oddity. And they further conclude that expandos should not be allowed on any binary data object (whether interior extracted, or not).

You argue on the contrary that JS objects in general can be extended with expandos, so why restrict binary data objects, even interior ones that are extracted? Let each such extracted interior object be != with all other same-named extractions, and let each have expandos assigned that (vacuously) won't be preserved on next extraction.

I hope I have stated positions accurately.

Yup!

# K. Gadd (11 years ago)

Did anyone address what should be done in the use case where it's necessary for information to 'tag along' with an array or typed array, for interop purposes? The existence of interior binary data objects seems to complicate this further; for example I had said that it seems like WeakMap allows attaching information to a typed array in that case even if it isn't extensible. If interior objects lose identity, though, it now becomes literally impossible for data to follow an instance of Uint32Array (or whatever) around the runtime, which is kind of troubling. Obviously I understand why this is the case for interior objects.

Is the meaning of an assignment to an interior object well specified? The data is copied from the source typed array into the interior object, I assume.

I'm going to describe how I understand things and from that how it seems like they could work: At present when you construct a typed array it is a view over a particular buffer. You can construct an array with a size, e.g. new Uint32Array(32), in which case a buffer is allocated for you behind the scenes; or you can construct an array from a buffer + offset/size pair in order to create a view over a subregion of the buffer. In both cases, the 'array' does not actually represent or contain the data; it is merely a proxy of sorts through which you can access elements of a particular type. It is my understanding that this is the same for binary data types: you can construct a heap instance of one, in which case it has an invisible backing buffer, or you can 'construct' one from an existing buffer+offset, in which case it is more like a proxy that represents the given data type at that given offset in the buffer, and when you manipulate the proxy you are manipulating the content of the buffer.
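
The two construction modes in question, using the standard typed array API:

```js
var standalone = new Uint32Array(32);        // allocates a fresh 128-byte buffer behind the scenes

var buffer = new ArrayBuffer(64);
var whole  = new Uint32Array(buffer);        // view over all 64 bytes (16 elements)
var slice  = new Uint32Array(buffer, 16, 4); // view over bytes 16..31 (4 elements)

whole[4] = 0xdeadbeef;
slice[0];          // 0xdeadbeef -- both views alias the same storage
whole === slice;   // false -- distinct view objects over a shared buffer
```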

In both cases, I believe it is consistent that these objects are all 'views' or 'proxies', not actual data. The fact that you can create an instance directly creates the illusion of them being actual data but in every case it is possible for multiple instances to share the same backing store without sharing referential identity (via ===).

In both cases, I don't believe a user should expect that attaching an expando to one object instance should modify the expandos on another object instance. Given this, it seems perfectly reasonable to be able to attach expandos to a typed array, and I've previously described why this use case is relevant (interop between compilers targeting JS, and native hand-written JS, for one).

In the same sense, if typed arrays must be constructed to act as proxies for the 'interior' arrays in a binary data type, being able to attach expandos to them does not cause much harm, other than the fact that the lifetime of the expando does not match the lifetime of the underlying binary data. But this is already true for typed arrays, in a sense.

I think the best way to address the confusion of expandos on interior arrays is simply non-extensibility, as has been discussed. I don't see why non-extensibility for interior arrays requires crippling the functionality of typed arrays in general, since JS already seems to have 2-3 exposed concepts in this field (seal, freeze, preventExtensions) along with query methods to find out if those concepts apply to a given object (isSealed, isFrozen, isExtensible)
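
Those three levels and their query methods already work on any ordinary object (nothing typed-array-specific here):

```js
var o = { x: 1 };

Object.preventExtensions(o);  // no new properties may be added
Object.isExtensible(o);       // false

Object.seal(o);               // additionally, existing properties become non-configurable
Object.isSealed(o);           // true

Object.freeze(o);             // additionally, existing data properties become non-writable
Object.isFrozen(o);           // true
```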

If interior arrays are not extensible, I should hope that Object.isExtensible for them returns false. If it were to return true when they have no expando support that would be incredibly confusing.

Anyway, given all this I would propose that the optimal solution (in terms of usability, at least - can't speak for the performance consequences) is for typed arrays to be extensible by default, as they are Objects that point to underlying sequences of elements, just like Array. This gives good symmetry and lets you cleanly substitute a typed array for an Array in more cases (resizability and mixed types being the big remaining differences). In cases where extensibility is a trap for the unwary or actively undesirable, like interior objects, the instance should be made non-extensible. This allows all end user code to handle cases where it is passed an interior array or object without reducing the usefulness of typed arrays.

FWIW I would also argue that a free-standing instance of any Binary Data type (that you construct with new, not using an existing buffer) should maybe be extensible by default as well, even if 'interior' instances are not. However, making binary data types always non-extensible wouldn't exactly break any compatibility or use cases, since they're a new feature - but it does mean we now have to add checks for extensibility/typeof in more cases, which is awful...

(A related area where this is a big problem for me and authors of similar packages is emulating the java/C# 'getHashCode' pattern, where objects all have an associated static hash code. Implementing this often requires attaching the computed hash to the object as an expando or via some other association like WeakMap. I think interior objects in binary data break this fundamentally, which is painful.)
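
A sketch of that getHashCode pattern using a WeakMap side table rather than an expando -- it works on non-extensible objects, but only for as long as the object keeps a stable identity:

```js
var hashCodes = new WeakMap();
var nextHash = 1;

function getHashCode(obj) {
  var h = hashCodes.get(obj);
  if (h === undefined) {
    h = nextHash++;
    hashCodes.set(obj, h);
  }
  return h;
}

var ta = new Uint32Array(8);
getHashCode(ta) === getHashCode(ta);  // true -- same object, stable hash

// An interior object that is re-created on every extraction gets a fresh
// identity each time, hence a fresh hash -- which is exactly the breakage
// described above.
```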

# Steve Fink (11 years ago)

On 09/04/2013 02:41 PM, Brendan Eich wrote:

But "lost expandos" due to loss of identity are an especially nasty kind of bug to find. Is there any use-case here? We've never had a bug report asking us to make SpiderMonkey's typed arrays extensible, AFAIK.

We have: bugzilla.mozilla.org/show_bug.cgi?id=695438

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 10:11 PM, K. Gadd <kg at luminance.org> wrote:

Did anyone address what should be done in the use case where it's necessary for information to 'tag along' with an array or typed array, for interop purposes? The existence of interior binary data objects seems to complicate this further; for example I had said that it seems like WeakMap allows attaching information to a typed array in that case even if it isn't extensible. If interior objects lose identity, though, it now becomes literally impossible for data to follow an instance of Uint32Array (or whatever) around the runtime, which is kind of troubling. Obviously I understand why this is the case for interior objects.

Is the meaning of an assignment to an interior object well specified? The data is copied from the source typed array into the interior object, I assume.

I'm going to describe how I understand things and from that how it seems like they could work: At present when you construct a typed array it is a view over a particular buffer. You can construct an array with a size, e.g. new Uint32Array(32), in which case a buffer is allocated for you behind the scenes; or you can construct an array from a buffer + offset/size pair in order to create a view over a subregion of the buffer. In both cases, the 'array' does not actually represent or contain the data; it is merely a proxy of sorts through which you can access elements of a particular type. It is my understanding that this is the same for binary data types: you can construct a heap instance of one, in which case it has an invisible backing buffer, or you can 'construct' one from an existing buffer+offset, in which case it is more like a proxy that represents the given data type at that given offset in the buffer, and when you manipulate the proxy you are manipulating the content of the buffer.

In both cases, I believe it is consistent that these objects are all 'views' or 'proxies', not actual data. The fact that you can create an instance directly creates the illusion of them being actual data but in every case it is possible for multiple instances to share the same backing store without sharing referential identity (via ===).

In both cases, I don't believe a user should expect that attaching an expando to one object instance should modify the expandos on another object instance. Given this, it seems perfectly reasonable to be able to attach expandos to a typed array, and I've previously described why this use case is relevant (interop between compilers targeting JS, and native hand-written JS, for one).

In the same sense, if typed arrays must be constructed to act as proxies for the 'interior' arrays in a binary data type, being able to attach expandos to them does not cause much harm, other than the fact that the lifetime of the expando does not match the lifetime of the underlying binary data. But this is already true for typed arrays, in a sense.

I think the best way to address the confusion of expandos on interior arrays is simply non-extensibility, as has been discussed. I don't see why non-extensibility for interior arrays requires crippling the functionality of typed arrays in general, since JS already seems to have 2-3 exposed concepts in this field (seal, freeze, preventExtensions) along with query methods to find out if those concepts apply to a given object (isSealed, isFrozen, isExtensible)

If interior arrays are not extensible, I should hope that Object.isExtensible for them returns false. If it were to return true when they have no expando support that would be incredibly confusing.

Anyway, given all this I would propose that the optimal solution (in terms of usability, at least - can't speak for the performance consequences) is for typed arrays to be extensible by default, as they are Objects that point to underlying sequences of elements, just like Array. This gives good symmetry and lets you cleanly substitute a typed array for an Array in more cases (resizability and mixed types being the big remaining differences). In cases where extensibility is a trap for the unwary or actively undesirable, like interior objects, the instance should be made non-extensible. This allows all end user code to handle cases where it is passed an interior array or object without reducing the usefulness of typed arrays.

I can sort of buy that this:

var x = struct.f; // it's an interior array

could be non-extensible. But it all still feels a bit odd.

I think I prefer non-extensibility of typed arrays over sometimes-extensibility.

# Steve Fink (11 years ago)

On 09/04/2013 04:15 PM, Filip Pizlo wrote:

How so? It is true that some VM implementations will be better than others. But ultimately every VM can implement every optimization that every other VM has; in fact my impression is that this is exactly what is happening as we speak.

So, it doesn't make much sense to make language design decisions because it might make some implementor's life easier right now. If you could argue that something will /never/ be efficient if we add feature X, then that might be an interesting argument. But as soon as we identify one sensible optimization strategy for making something free, I would tend to think that this is sufficient to conclude that the feature is free and there is no need to constrain it. If we don't do this then we risk adding cargo-cult performance features that rapidly become obsolete.

This general argument bothers me slightly, because it assumes no opportunity cost in making something free(ish). Even if you can demonstrate that allowing X can be made fast, it isn't a complete argument for allowing X, since disallowing X might enable some other optimization or feature or semantic simplification. Such demonstrations are still useful, since they can shoot down objections based solely on performance.

But maybe I'm misinterpreting "...sufficient to conclude...that there is no need to constrain [the feature]." Perhaps you only meant that there is no need to constrain it for reasons of performance? If so, then you only need consider the opportunity cost of other optimizations.

# Filip Pizlo (11 years ago)

On Sep 4, 2013, at 11:17 PM, Steve Fink <sphink at gmail.com> wrote:

This general argument bothers me slightly, because it assumes no opportunity cost in making something free(ish). Even if you can demonstrate that allowing X can be made fast, it isn't a complete argument for allowing X, since disallowing X might enable some other optimization or feature or semantic simplification. Such demonstrations are still useful, since they can shoot down objections based solely on performance.

But maybe I'm misinterpreting "...sufficient to conclude...that there is no need to constrain [the feature]." Perhaps you only meant that there is no need to constrain it for reasons of performance? If so, then you only need consider the opportunity cost of other optimizations.

Yeah, I might have overstated this. My gut intuition is that performance shouldn't be a great reason for deciding PL features to begin with. But in the cases where you have the urge to add or remove a feature solely because of performance, I think that a sufficient counterargument is to show that there exists some sensible optimization strategy that obviates the feature (or its removal). And yes, opportunity cost ought to be considered. If you can make such a counterargument then probably the feature should be judged on its software engineering merits and not on performance merits.

My experience implementing free-if-unused custom properties on typed arrays was that it was a small hack with limited cost - I just relied on a lot of preexisting nastiness that we have to have, that was already there to deal with DOM semantics and to enable similar "free-if-unused" features for other objects.

# Andreas Rossberg (11 years ago)

On 5 September 2013 03:11, Niko Matsakis <niko at alum.mit.edu> wrote:

On Wed, Sep 04, 2013 at 02:41:24PM -0700, Brendan Eich wrote:

Both adds up to two, not one, so the counterargument is odd beats odder, and prohibiting expandos keeps the oddness down to == and nothing else.

I just want to agree with this. I think the behavior of == is unfortunate, but much more of a corner case than expando properties disappearing and reappearing willy nilly as users switch between otherwise identical views on the same data.

I feel like == can be explained by the fact that these are distinct objects pointing at the same underlying buffer. People get aliasing. But the fact that expando properties would live on the wrapper and not on the underlying data is quite surprising and counterintuitive -- some of the data (the elements) are aliased, but others are not.

Maybe it actually is worth considering a different equality semantics for structs and typed arrays. In essence, they are a kind of super-fat pointer, and we could give them the usual notion of (fat) pointer equality. That is, two objects are equal if they are equivalent views to the same backing store. It would make them value types, more or less.
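
One way to state that equality in today's terms is as a helper over the view's coordinates (this is a sketch of the proposed semantics, not a change to == itself):

```js
function sameView(a, b) {
  return a.constructor === b.constructor &&
         a.buffer === b.buffer &&
         a.byteOffset === b.byteOffset &&
         a.length === b.length;
}

var buf = new ArrayBuffer(16);
var v1 = new Uint32Array(buf, 0, 4);
var v2 = new Uint32Array(buf, 0, 4);

v1 === v2;          // false today -- distinct wrapper objects
sameView(v1, v2);   // true -- the "fat pointer" equality sketched above
```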

As an implementor, I don't like this idea too much :), but from a user perspective it would probably be saner.

# Dmitry Lomov (11 years ago)

On Thu, Sep 5, 2013 at 4:29 AM, Filip Pizlo <fpizlo at apple.com> wrote:

This is where I return to the objectyness point: typed arrays are already spec'd to have a bunch of heavy reference-to-object behavior. So making them expandable is no big deal. And making them non-expandable means that we'll now live in a weirdo world where we have four different concepts of what it means to be a value:

A) Full blown reference objects that you can do weird things to, like add properties and change proto, etc. You can also make one non-extensible at your discretion, which fits into the bat-poop crazy "you can do anything" philosophy of full blown objects. And that's great - that's the heart of the language, and I happen to enjoy it.

B) Object types that are always non-extensible but otherwise still objecty - they have a prototype that is observable, they reveal their identity via ==, and you can actually inject stuff into them by modifying the appropriate Object.prototype.

C) Values with whatever value type semantics we come up with in the future.

D) Primitives.

Now, I hope that we could get C and D to be as close as possible to each other. But that still leaves three different behaviors. This introduces a learning curve. That's why (B) offends me. It's subtly different from (A) and clearly different from either (C) or (D).

Now, we actually also have a totally alternate behavior, used by binary data. And my argument there is that I wouldn't get too offended by binary data acting weird, because the very notion of exposing binary data is weird to begin with. I expect it to be used only for special graphicsy stuff and not for general-purpose "value types" for normal JS programs. So it's OK to me if binary data is both weird and inconsistent with everything else. And no, I still don't view "typed arrays" as being part of binary data - it already appears to be the case that typed arrays have different buffer behavior to the struct types. So they're just different. And that's fine.

You are underestimating the diversity of species in (A). You can create "full blown reference objects" that are non-extensible in plain JavaScript today: just call Object.preventExtensions(this) in the constructor! So (B) as you define it is a subset of (A).
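
For example (Pixel is just an illustrative name):

```js
function Pixel(r, g, b) {
  this.r = r;
  this.g = g;
  this.b = b;
  Object.preventExtensions(this);  // instances are non-extensible from birth
}

var p = new Pixel(255, 0, 0);
Object.isExtensible(p);  // false
p.alpha = 1;             // ignored (TypeError in strict mode)
```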

Note however that typed arrays as implemented by many vendors today are neither in (A) nor in (B) - you can extend a typed array with named properties, but you cannot extend it with indexed properties. There is no "Object.preventIndexedExtensions", so this sort of (non-)extensibility is indeed a weird case. If the goal is to reduce the zoo of object kinds, making typed arrays completely non-extensible puts them firmly in (A): they are just full-blown objects that happen to be born non-extensible - something that is already totally possible in the language today.

As to binary data acting "weird": it might appear weird, but this is again the kind of "weirdness" that is possible in JavaScript today, as evidenced by Dave Herman's and my polyfill: dherman/structs.js. (almost; there are subtle differences re whether struct fields are data properties or getters/setters, but the key behaviors of assignments and equality are modelled accurately).

Dmitry

# Niko Matsakis (11 years ago)

On Thu, Sep 05, 2013 at 09:15:11AM +0200, Andreas Rossberg wrote:

Maybe it actually is worth considering a different equality semantics for structs and typed arrays. In essence, they are a kind of super-fat pointer, and we could give them the usual notion of (fat) pointer equality. That is, two objects are equal if they are equivalent views to the same backing store. It would make them value types, more or less.

As an implementor, I don't like this idea too much :), but from a user perspective it would probably be saner.

Perhaps. Note that arrays can still point at overlapping memory without being equal. So the same basic guarantees hold as today:

==  =>  aliasing
!=  =>  nothing in particular
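
For example, two views can overlap without being ==, so != really does tell you nothing about aliasing:

```js
var buf = new ArrayBuffer(8);
var a = new Uint8Array(buf, 0, 6);  // bytes 0..5
var b = new Uint8Array(buf, 2, 6);  // bytes 2..7

a === b;   // false -- yet the views overlap
a[2] = 7;
b[0];      // 7 -- aliasing is visible even though the views are not equal
```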

Niko

# Andreas Rossberg (11 years ago)

On 5 September 2013 12:24, Niko Matsakis <niko at alum.mit.edu> wrote:

Perhaps. Note that arrays can still point at overlapping memory without being equal. So the same basic guarantees hold as today:

==  =>  aliasing
!=  =>  nothing in particular

Yes, sure. The same holds for plain pointer equality in C, though (thanks to primitive types of different size, unions, array semantics, and other stuff). Pointer comparison is not for detecting aliasing, unless you know what you are doing.