Proposal: Abstract References

# Kevin Smith (10 years ago)

This proposal is intended to supersede both relationships and "function bind".

zenparsing/es-abstract-refs

Enjoy!

# Domenic Denicola (10 years ago)

This is really cool, Kevin. Thanks for writing it up in more detail. I hope it gets a lot of attention.

My initial worries are largely around the ergonomics---both for authors and implementers---if this is our solution for private state. In particular, I don't think having to create a new WeakMap for every private "member" is very author-friendly. And in general, implementers have voiced (weak) objections to using weak maps for private state in saying that they're not designed to work efficiently in that manner. (And that was with the pattern of one WeakMap per class, not multiple!)

That said, this is really elegant and if nothing else it seems like good underpinnings for some slightly-higher-level private state abstraction. And of course you know I <3 the "bind operator".

Hope to hear other TC39ers' thoughts!

# Kevin Smith (10 years ago)

That said, this is really elegant and if nothing else it seems like good underpinnings for some slightly-higher-level private state abstraction.

Right - the idea is that sugar could be introduced over this for declaring private fields. Also, this proposal doesn't depend upon WeakMaps for private fields, per se. We could imagine some new, opaque thing that uses the same "referenceGet/Set/Delete" mechanism.

# Mark S. Miller (10 years ago)

On Tue, Oct 21, 2014 at 1:12 PM, Domenic Denicola < domenic at domenicdenicola.com> wrote:

This is really cool, Kevin. Thanks for writing it up in more detail. I hope it gets a lot of attention.

First, I agree this is cool. I am very positive on this direction. Kudos!

My initial worries are largely around the ergonomics---both for authors and implementers---if this is our solution for private state. In particular, I don't think having to create a new WeakMap for every private "member" is very author-friendly. And in general, implementers have voiced (weak) objections to using weak maps for private state

We've realized a long time ago that WeakMaps should be implemented using the so-called "transposed" representation. All the realistic use cases either strongly prefer the transposed representation, or are neutral between the two representations. With the transposed representation, the storage logic is essentially the same as in the once-imagined private symbol proposals.

in saying that they're not designed to work efficiently in that manner.

Actually, they are. The problem is that initial implementations date from before we appreciated the need for the transposed representation. The best thing TC39 can do for the future of WeakMaps is to proceed assuming the transposed representation.

(And that was with the pattern of one WeakMap per class, not multiple!)

Actually, that leads to better efficiency, as each WeakMap identity plays the role that private symbols would have played -- naming an individual private field. Using only one WeakMap identity per class would require some other means of naming the individual fields within that class' private state, necessitating a costly additional level of indirection.

The relationship proposal this is based on indeed uses one WeakMap per private field.

# Mark S. Miller (10 years ago)

Let's not invent a new mechanism when the existing WeakMaps already have exactly the right core semantics, and (but for a temporary implementation dead end) already admit of an efficient implementation.

# Kevin Smith (10 years ago)

Let's not invent a new mechanism when the existing WeakMaps already have exactly the right core semantics, and (but for a temporary implementation dead end) already admit of an efficient implementation.

yep - sounds good to me : )

# Domenic Denicola (10 years ago)

From: Kevin Smith [mailto:zenparsing at gmail.com]

Also, this proposal doesn't depend upon WeakMaps for private fields, per se.  We could imagine some new, opaque thing that uses the same "referenceGet/Set/Delete" mechanism.

That's a good point. Just to prove it, here's a hypothetical PrivateSymbol version:

PrivateSymbol.prototype[Symbol.referenceGet] = function (obj) {
  return obj[this];
};

PrivateSymbol.prototype[Symbol.referenceSet] = function (obj, value) {
  obj[this] = value;
};

PrivateSymbol.prototype[Symbol.referenceDelete] = function (obj) {
  delete obj[this];
};

const _x = new PrivateSymbol("x");

// ... inside a class
this::_x = x;

Note that this is pretty much exactly the same as a WeakMap in terms of author-facing ergonomics, but for implementers it gives them a separate PrivateSymbol type that they can optimize with a use case focused toward private state and object property access, instead of as an ephemeron table.

# Domenic Denicola (10 years ago)

Sorry Mark; didn't see your replies before pressing "Send." If we can 'fix' weak map implementations in this way (and nobody has said anything otherwise to my knowledge), then indeed we're good.

# Andreas Rossberg (10 years ago)

On 21 October 2014 22:31, Mark S. Miller <erights at google.com> wrote:

in saying that [weak maps] are not designed to work efficiently in that manner.

Actually, they are. The problem is that initial implementations date from before we appreciated the need for the transposed representation. The best thing TC39 can do for the future of WeakMaps is to proceed assuming the transposed representation.

While I sympathise, let me clarify that this still remains a conjecture. AFAIK, nobody has proved it empirically in a JS implementation yet, we don't know in detail how complex such an implementation would be, and what side effects it might have on general performance (e.g., via added polymorphism). It's most likely not as easy or as clear a win as you may think it is.

# Andrea Giammarchi (10 years ago)

FWIW the Multiple WeakMaps approach seems to be 2X ( Chrome ) up to 3X ( FF Nightly ) faster than single WeakMap approach but I don't know how much WeakMap instances cost in terms of GC operations so this bench might be very pointless ( and surely it's premature )

Weirdly enough, Chrome Canary seems to be able to optimize the single WeakMap approach pretty well, with a gap of 1.25X faster perfs on multiple WMs.

the bench: jsperf.com/abstract-reference

the multiple WeakMaps approach:

var PrivateCoordinatesMWM = (function () {

  var
    privates = {
      x: new WeakMap,
      y: new WeakMap
    }
  ;

  function __(self, key, value) {
    return arguments.length === 3 ?
      (privates[key].set(self, value), value) :
      privates[key].get(self);
  }

  function PrivateCoordinatesMWM(x, y) {
    __(this, 'x', x);
    __(this, 'y', y);
  }

  PrivateCoordinatesMWM.prototype.toString = function toString() {
    return ''.concat(
      '[', __(this, 'x'), 'x', __(this, 'y'), ']'
    );
  };

  return PrivateCoordinatesMWM;

}());

and the single WeakMap approach:

var PrivateCoordinatesSWM = (function () {

  var privates = new WeakMap;

  function __(self, key, value) {
    var pvt = privates.get(self);
    if (arguments.length === 3) {
      if (!pvt) {
        privates.set(self, pvt = {});
      }
      return (pvt[key] = value);
    }
    return pvt && pvt[key];
  }

  function PrivateCoordinatesSWM(x, y) {
    __(this, 'x', x);
    __(this, 'y', y);
  }

  PrivateCoordinatesSWM.prototype.toString = function toString() {
    return ''.concat(
      '[', __(this, 'x'), 'x', __(this, 'y'), ']'
    );
  };

  return PrivateCoordinatesSWM;

}());

Best

# Andreas Rossberg (10 years ago)

On 22 October 2014 14:18, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:

FWIW the Multiple WeakMaps approach seems to be 2X ( Chrome ) up to 3X ( FF Nightly ) faster than single WeakMap approach but I don't know how much WeakMap instances cost in terms of GC operations so this bench might be very pointless ( and surely it's premature )

Weirdly enough, Chrome Canary seems to be able to optimize the single WeakMap approach pretty well, with a gap of 1.25X faster perfs on multiple WMs.

If you really want to micro-benchmark, then perhaps at least use more reasonable code that avoids unnecessary indirections distorting the results. That is:

var PrivateCoordinatesMWM = (function () {
  var x_ = new WeakMap
  var y_ = new WeakMap

  function PrivateCoordinatesMWM(x, y) {
    x_.set(this, x)
    y_.set(this, y)
  }

  PrivateCoordinatesMWM.prototype.toString = function toString() {
    return ''.concat('[', x_.get(this), y_.get(this), ']')
  };

  return PrivateCoordinatesMWM
}())

This also happens to be more readable. ;)

# Andrea Giammarchi (10 years ago)

I was trying to actually use same amount of indirections, since that's probably more likely how real code might look like but you are right, hand crafting per each case the pattern we have a different winner, which is the single WeakMap instead of the multiple one.

I guess the reason is the double x/y.set and x/y.get VS only one, here the new bench: jsperf.com/abstract-reference/2

and here the hand crafted code:

var result = [];

// Multiple WeakMaps approach
var PrivateCoordinatesMWM = (function () {

  var
    _x = new WeakMap,
    _y = new WeakMap
  ;

  function PrivateCoordinatesMWM(x, y) {
    _x.set(this, x);
    _y.set(this, y);
  }

  PrivateCoordinatesMWM.prototype.toString = function toString() {
    return ''.concat(
      '[', _x.get(this), 'x', _y.get(this), ']'
    );
  };

  return PrivateCoordinatesMWM;

}());

// Single WeakMap approach
var PrivateCoordinatesSWM = (function () {

  var _wm = new WeakMap;

  function PrivateCoordinatesSWM(x, y) {
    _wm.set(this, {x: x, y: y});
  }

  PrivateCoordinatesSWM.prototype.toString = function toString() {
    var _pvt = _wm.get(this);
    return ''.concat(
      '[', _pvt.x, 'x', _pvt.y, ']'
    );
  };

  return PrivateCoordinatesSWM;

}());
# Mark S. Miller (10 years ago)

On Tue, Oct 21, 2014 at 11:59 PM, Andreas Rossberg <rossberg at google.com> wrote:

While I sympathise, let me clarify that this still remains a conjecture. AFAIK, nobody has proved it empirically in a JS implementation yet, we don't know in detail how complex such an implementation would be, and what side effects it might have on general performance (e.g., via added polymorphism). It's most likely not as easy or as clear a win as you may think it is.

The following code is an existence proof of sorts that, given only the WeakMap mechanisms you've already built + one bit of magic whose feasibility I hope we don't need to debate. I use ES5 here to make clear that no other magic or unengineered features are assumed.

var WeakMap;

(function() {
   'use strict';

   var SlowWeakMap = WeakMap;

   function FastWeakMap() {
     var token = Object.freeze(Object.create(null));
     return Object.freeze({
       get: function(key) {
         var shadow = key.[[Shadow]];
         return shadow ? shadow.get(token) : void 0;
       },
       set: function(key, value) {
         var shadow = key.[[Transposer]]
         if (!shadow) {
           shadow = new SlowWeakMap();
           key.[[Transposer]] = shadow;
         }
         shadow.set(token, value);
       },
       clear: function() {
         token = Object.freeze(Object.create({}));
       }
     });
   }

   // Don't do this until it is a complete shim
   // WeakMap = FastWeakMap;

 }());

The magic is that each object, whether frozen or not, would need in effect one extra internal mutable [[Shadow]] property.

Clearly, this expository implementation is suboptimal in many ways. But it demonstrates the following:

  • It provides the full complexity measure gains that a realistic implementation would have.

  • For each of these objects, an extra SlowWeakMap instance is allocated as its shadow.

  • For each access, an extra indirection through this SlowWeakMap is therefore needed.

  • Only objects that have been used as keys in FastWeakMaps would ever have their [[Shadow]] set, so this could also be allocated on demand, given only a bit saying whether it is present. Besides this storage of this bit, there is no other effect or cost on any non-weakmap objects.

  • Since non-weakmap code doesn't need to test this bit, there is zero runtime cost on non-weakmap code.

  • Whether an object has been used as a key or not (and therefore whether an extra shadow has been allocated or not), normal non-weak property lookup on the object is unaffected, and pays no additional cost.

A realistic implementation should seek to avoid allocating the extra shadow objects. However, even if not, we are much better off with the above scheme than we are with the current slow WeakMap.

Of course, we should proceed towards realistic implementations asap and get actual empirical data. But the above demonstration establishes that the issue in this thread should be considered settled.

# Mark S. Miller (10 years ago)

Should be

var WeakMap;

(function() {
   'use strict';

   var SlowWeakMap = WeakMap;

   function FastWeakMap() {
     var token = Object.freeze(Object.create(null));
     return Object.freeze({
       get: function(key) {
         var shadow = key.[[Shadow]];
         return shadow ? shadow.get(token) : void 0;
       },
       set: function(key, value) {
         var shadow = key.[[Shadow]]
         if (!shadow) {
           shadow = new SlowWeakMap();
           key.[[Shadow]] = shadow;
         }
         shadow.set(token, value);
       },
       clear: function() {
         token = Object.freeze(Object.create({}));
       }
     });
   }

   // Don't do this until it is a complete shim
   // WeakMap = FastWeakMap;

 }());
# Isiah Meadows (10 years ago)

I know that this could clearly work for implementing private arrays, etc. as well, but what about private integer or booleans?

let _x = 0;
let _y = false;

class Foo {
  constructor(x, y) {
    this::_x = x;
    this::_y = y;
  }

  // ...
}
# Steve Fink (10 years ago)

On 10/22/2014 07:45 AM, Mark S. Miller wrote:

  • Only objects that have been used as keys in FastWeakMaps would ever have their [[Shadow]] set, so this could also be allocated on demand, given only a bit saying whether it is present. Besides this storage of this bit, there is no other effect or cost on any non-weakmap objects.

  • Since non-weakmap code doesn't need to test this bit, there is zero runtime cost on non-weakmap code.

  • Whether an object has been used as a key or not (and therefore whether an extra shadow has been allocated or not), normal non-weak property lookup on the object is unaffected, and pays no additional cost.

Maybe it's because I work on a garbage collector, but I always think of the primary cost of WeakMaps as being the GC. The above analysis doesn't take GC into account.

In the straightforward iterative implementation, you record all of the live WeakMaps found while scanning through the heap. Then you go through them, checking each key to see if it is live. For each such key, you recursively mark the value. This marking can discover new live WeakMaps, so you iterate to a fixed point.

In the current web, this implementation seems to work fine. The worst case is O(n^2) in the size of the heap, which is pretty much fatal if you ever hit it. But that requires lots of paths through multiple WeakMaps, and in practice, it seems WeakMaps aren't being used much. I've never seen our WeakMap marking phase show up as a significant cost.

For an algorithmically more robust solution, you could add a check whenever marking an object. The check would test whether the object is (or might be) used as a WeakMap key. This would slow down marking all objects, so in practice you want to be clever about avoiding the test.

Anyway, my point is that WeakMaps have serious GC ramifications, possibly extending to non-key objects, and any performance impact analysis of using WeakMaps more extensively is incomplete without considering GC costs.

A realistic implementation should seek to avoid allocating the extra shadow objects. However, even if not, we are much better off with the above scheme than we are with the current slow WeakMap.

Perhaps. But multiple WeakMaps introduce the potential for many more cycles than a single WeakMap. So I think a final conclusion is premature.

# Mark Miller (10 years ago)

On Wed, Oct 22, 2014 at 1:44 PM, Steve Fink <sphink at gmail.com> wrote:

Maybe it's because I work on a garbage collector, but I always think of the primary cost of WeakMaps as being the GC. The above analysis doesn't take GC into account.

I should have been more explicit, but GC costs are almost my entire point. These costs aside, my FastWeakMaps are more expensive in all ways than SlowWeakMaps, though only by a constant factor, since each FastWeakMap operation must also perform the corresponding SlowWeakMap operation.

In the straightforward iterative implementation, you record all of the live WeakMaps found while scanning through the heap. Then you go through them, checking each key to see if it is live. For each such key, you recursively mark the value. This marking can discover new live WeakMaps, so you iterate to a fixed point.

That is when you find yourself doing an ephemeron collection. The point of the transposed representation is to collect most ephemeron garbage using conventional collection. Consider

var fastField = new FastWeakMap();
var slowField = new SlowWeakMap();

var transientKey = {};

var fastValue = {};
var slowValue = {};

fastField.set(key, fastValue);
slowField.set(key, slowValue);

transientKey = void 0;
fastValue = void 0;
slowValue = void 0;

At this assume that the old value of transientKey is really garbage, but that fastField and slowField are still reachable. Clearly, both the old value of fastValue and the old value of slowValue are now garbage and may be collected. Let's see what work the collector need to do to collect these.

First comes the conventional mark phase: fastField and slowField both get marked. slowField is itself a non-transposed weakmap, so when we mark it we also put it on the queue of weakmaps to be ephemeron collected. fastField internally is not a weakmap, nor does it point at one. It's only per-instance state is the token, so this gets marked as well.

Now comes the ephemeron mark phase. The only non-transposed weakmap to be considered is slowField. The non-transposed weakmap serving as transientKey's shadow never got marked because transientKey never got marked. fastValue is only reachable from transientKey's shadow, so it also never gets marked.

Here's the key important thing: In a generational collector, at this point we'd typically postpone ephemeron collection. To do so, we would complete the mark phase conventionally, by simply marking the values held by slowField. This marks slowValue, causing it to get promoted to the next older generation. THIS IS EXPENSIVE.

Finally, when we can no longer postpone ephemeron collection, when ephemeron-marking slowField, we'd notice that transientKey didn't get marked, so we wouldn't mark slowValue.

The counterpoint is the shadow weakmaps, when engaged in ephemeron collection must still check whether their keys -- to tokens within the FastWeakMaps -- are still reachable. Typically they will be. This has two counter-counterpoints:

  • For all practical use patterns we've identified, the WeakMap either lives longer than its keys, or their lifespan is comparable. When the WeakMap is known to necessarily live longer than its keys, as for class-private state, then those shadow properties can even be exempted from the key-mark-check.

  • Prior to emphemeron collection, the shadow weakmaps (and the values they carry) get promoted to older generations only along with the object they shadow.

In the current web, this implementation seems to work fine. The worst case is O(n^2) in the size of the heap, which is pretty much fatal if you ever hit it. But that requires lots of paths through multiple WeakMaps, and in practice, it seems WeakMaps aren't being used much. I've never seen our WeakMap marking phase show up as a significant cost.

Chicken and egg. If WeakMaps are used for private state (and trademarks and...), they will be used a lot. But they will only be used for those things if it isn't fatally slow to do so.

For an algorithmically more robust solution, you could add a check whenever marking an object. The check would test whether the object is (or might be) used as a WeakMap key. This would slow down marking all objects, so in practice you want to be clever about avoiding the test.

Yeah, I'm very curious about whether this can be made cheap enough that implementations would be willing to do it. If so, then everything is much better, whether we transpose the representation or not.

Anyway, my point is that WeakMaps have serious GC ramifications, possibly extending to non-key objects, and any performance impact analysis of using WeakMaps more extensively is incomplete without considering GC costs.

Exactly! I should have been clearer that these were the only costs I am concerned about here. Regarding all other costs, my example code only adds expense.

Perhaps. But multiple WeakMaps introduce the potential for many more cycles than a single WeakMap. So I think a final conclusion is premature.

I think this is a good point that I missed in my analysis. We should indeed think hard about this. Given that, in the end I must agree that a conclusion is premature. Please let's proceed to build some experiments!

# Rick Waldron (10 years ago)

On Wed, Oct 22, 2014 at 5:26 PM, Mark Miller <erights at gmail.com> wrote:

fastField.set(key, fastValue);
slowField.set(key, slowValue);

I don't mean to nit-pick, but "key" is "transientKey", right?

# Mark Miller (10 years ago)

Please do nitpick. I wrote this in too much of a hurry and it is something that needs care.

In any case, yes, transientKey.

# Steve Fink (10 years ago)

On 10/22/2014 02:26 PM, Mark Miller wrote:

I should have been more explicit, but GC costs are almost my entire point. These costs aside, my FastWeakMaps are more expensive in all ways than SlowWeakMaps, though only by a constant factor, since each FastWeakMap operation must also perform the corresponding SlowWeakMap operation.

Ah, sorry, I totally missed your point.

That is when you find yourself doing an ephemeron collection. The point of the transposed representation is to collect most ephemeron garbage using conventional collection. Consider

Ok, I get it now, and completely agree with your analysis, with the caveat that supporting [[Shadow]] gives me the heebie-jeebies. It turns a read into a write, for one thing. (The read of the key, I mean.) Could the extra shadow table be kept separate from the key object? I know! Let's use a WeakMap! :-)

Here's the key important thing: In a generational collector, at this point we'd typically postpone ephemeron collection. To do so, we would complete the mark phase conventionally, by simply marking the values held by slowField. This marks slowValue, causing it to get promoted to the next older generation. THIS IS EXPENSIVE.

Yes, this is a big deal.

Chicken and egg. If WeakMaps are used for private state (and trademarks and...), they will be used a lot. But they will only be used for those things if it isn't fatally slow to do so.

Yes, I fully expect WeakMaps to start mattering soon-ish, though I'm still procrastinating on doing anything about our current implementation.

Yeah, I'm very curious about whether this can be made cheap enough that implementations would be willing to do it. If so, then everything is much better, whether we transpose the representation or not.

We'll probably all end up at some messy point in the middle. Maybe a fast initial pass without the checks. It'll be something that depends on a bunch of assumptions for normal-case performance, but doesn't completely break down in the pathological cases.

Exactly! I should have been clearer that these were the only costs I am concerned about here. Regarding all other costs, my example code only adds expense.

If I had read more closely, I probably would have noticed that...

# Tab Atkins Jr. (10 years ago)

On Wed, Oct 22, 2014 at 1:12 PM, Isiah Meadows <impinball at gmail.com> wrote:

I know that this could clearly work for implementing private arrays, etc. as well, but what about private integer or booleans?

You misunderstand the proposal a bit. In x::y, the y is an object that is asked to respond in a special way. When y is a WeakMap, it responds to x::y = z; by calling y.set(x, z) on itself.

You can store whatever you want in there; the z value can be anything, including numbers or booleans. But the y object needs to be something that knows how to respond to the bind operator.

(Similarly, if y is a function, by default it responds to x::y(z) by calling itself with its this set to x. This makes it act similarly to x.y(z), but without having to actually define y as a property on x.)

# Kevin Smith (10 years ago)

My initial worries are largely around the ergonomics---both for authors and implementers---if this is our solution for private state. In particular, I don't think having to create a new WeakMap for every private "member" is very author-friendly.

Even without "private" sugar, I think we could use some Proxy-foo to make this more pleasant:

const Private = new Proxy({}, get() { return new WeakMap });

const { _x, _y } = Private;
# Isiah Meadows (10 years ago)

Oh. I see now. The function calling part I understand already.

For the private state ideas, I can see the GC speed nightmare, but for just function binding, it is extraordinarily similar to LiveScript's |> operator (which has always been useful) in usage, which I'm +1,000 for that.

# Kevin Smith (10 years ago)

For the private state ideas, I can see the GC speed nightmare,

See Mark's posts regarding GC.

# Isiah Meadows (10 years ago)

That was an indirect reference. Sorry for the obscurity.

For the private state ideas, I can see the GC speed nightmare,

See Mark's posts regarding GC.

# Andreas Rossberg (10 years ago)

On 22 October 2014 16:45, Mark S. Miller <erights at google.com> wrote:

Clearly, this expository implementation is suboptimal in many ways. But it demonstrates the following:

  • It provides the full complexity measure gains that a realistic implementation would have.

  • For each of these objects, an extra SlowWeakMap instance is allocated as its shadow.

  • For each access, an extra indirection through this SlowWeakMap is therefore needed.

  • Only objects that have been used as keys in FastWeakMaps would ever have their [[Shadow]] set, so this could also be allocated on demand, given only a bit saying whether it is present. Besides this storage of this bit, there is no other effect or cost on any non-weakmap objects.

  • Since non-weakmap code doesn't need to test this bit, there is zero runtime cost on non-weakmap code.

  • Whether an object has been used as a key or not (and therefore whether an extra shadow has been allocated or not), normal non-weak property lookup on the object is unaffected, and pays no additional cost.

A realistic implementation should seek to avoid allocating the extra shadow objects. However, even if not, we are much better off with the above scheme than we are with the current slow WeakMap.

Of course, we should proceed towards realistic implementations asap and get actual empirical data. But the above demonstration establishes that the issue in this thread should be considered settled.

I appreciate your analysis, but I respectfully disagree with your conclusion.

  • The extra slot and indirection for the [[Shadow]] property is an extra cost (V8 already has a very similar mechanism to implement "hidden properties" as provided by its API (and in fact, hash codes), but it is known to be (too) slow).

  • Optimising away this slot is difficult -- for starters, because it will have various secondary effects (e.g., every add/remove of an object to/from a weak map will potentially result in a layout change for that object, increasing spurious polymorphism, and potentially invalidating/deoptimising existing code).

  • Worse, when you flatten weak properties into objects, then even GC could cause object layouts to change, which is a whole new dimension of complexity.

  • On the other hand, if you do not do this flattening, performance is unlikely to ever be competitive with true private properties (we internally introduced private symbols in V8, exactly because the extra indirection for hidden properties was too costly).

I'm still sceptical that we can actually resolve this dilemma, especially when this is supposed to be a solution for private state. It seems to me that the performance implications of weakness are fundamentally at odds with the desire to have efficient private state.

# Mark S. Miller (10 years ago)

On Fri, Oct 24, 2014 at 4:19 AM, Andreas Rossberg <rossberg at google.com> wrote:

I appreciate your analysis, but I respectfully disagree with your conclusion.

  • The extra slot and indirection for the [[Shadow]] property is an extra cost (V8 already has a very similar mechanism to implement "hidden properties" as provided by its API (and in fact, hash codes), but it is known to be (too) slow).

  • Optimising away this slot is difficult -- for starters, because it will have various secondary effects (e.g., every add/remove of an object to/from a weak map will potentially result in a layout change for that object, increasing spurious polymorphism, and potentially invalidating/deoptimising existing code).

  • Worse, when you flatten weak properties into objects, then even GC could cause object layouts to change, which is a whole new dimension of complexity.

Already answered:

On Wed, Oct 22, 2014 at 2:26 PM, Mark Miller <erights at gmail.com> wrote:

[...] When the WeakMap is known to necessarily live longer than its keys, as for class-private state, then those shadow properties can even be exempted from the [ephemeron] key-mark-check.

More to the point, when the instance (transientKey) retains the WeakMap (via instance retains class retains methods retains field designators), then GC will never cause this layout change.

Caveat: If the field (WeakMap) objects in question are available to user code, then they might be .clear()ed, in which case GC would indeed cause this layout change. This is one of many reasons why I still think .clear() should not be provided by WeakMap and WeakSet. The inclusion of Weak* .clear() never achieved consensus, so I believe it violates our process for these to be included in the spec.

  • On the other hand, if you do not do this flattening, performance is unlikely to ever be competitive with true private properties (we internally introduced private symbols in V8, exactly because the extra indirection for hidden properties was too costly).

Exactly. For WeakMaps representing private state, they should engage exactly the kind of runtime machinery once associated with private symbols, but without breaking membrane transparency.

# Andreas Rossberg (9 years ago)

On 24 October 2014 22:13, Mark S. Miller <erights at google.com> wrote:

Already answered:

On Wed, Oct 22, 2014 at 2:26 PM, Mark Miller <erights at gmail.com> wrote:

[...] When the WeakMap is known to necessarily live longer than its keys, as for class-private state, then those shadow properties can even be exempted from the [ephemeron] key-mark-check.

How would that generally be "known" up-front, though? Even if you were allowed to dynamically analyse the whole heap and code, you couldn't easily predict relative life times.

More to the point, when the instance (transientKey) retains the WeakMap (via instance retains class retains methods retains field designators), then GC will never cause this layout change.

With everything being mutable in JavaScript, none of these retains-relations are necessarily invariant, as far as I can see. And even if they were, this would already be a very expensive check to make for every private field you add. It is a pretty limited scenario, too. It would not remove the general problem of the GC having to be able to rewrite object layouts, nor other use cases being unexpectedly slow. (VMs are already terrible at handling explicit property deletion efficiently.)

Exactly. For WeakMaps representing private state, they should engage exactly the kind of runtime machinery once associated with private symbols,

Private symbols do not require weakness, however. That makes them both much cheaper and much easier to implement.

In summary, there are two basic use cases for 'relations':

  • permanent (key outlives object)
  • temporary (object outlives key)

These two use cases come with different performance implications. Transient properties seem suboptimal in either case:

  • They are less efficient than private symbols for the permanent case, even if you manage to flatten (which is complicated).

  • They are less efficient than normal weak maps for the temporary case, even if you do not flatten (far worse if you flatten).

Unless we can come up with some real clever heuristics for -- cheaply and reliably! -- distinguishing the two use cases at creation time of each relation, I expect that transient properties are going to be significantly inferior to the alternatives in terms of performance. And that is still ignoring the implementation complexity (and potential space overhead that I did not mention in my bullet list).

but without breaking membrane transparency.

I'm still not sure I understand how this affects membranes specifically. A membrane would never pass a proxied object to a non-proxied function from the same side. So functions accessing (or testing for) private state on an object are a non-issue in that scenario, because this access never crosses the membrane, does it? After all, membranes work just fine with builtins or host objects (otherwise they would be useless).

# Tom Van Cutsem (9 years ago)

2014-10-27 15:00 GMT+01:00 Andreas Rossberg <rossberg at google.com>:

but without breaking membrane transparency.

I'm still not sure I understand how this affects membranes specifically. A membrane would never pass a proxied object to a non-proxied function from the same side. So functions accessing (or testing for) private state on an object are a non-issue in that scenario, because this access never crosses the membrane, does it? After all, membranes work just fine with builtins or host objects (otherwise they would be useless).

I believe what Mark was referring to is the fact that if a WeakMap crosses the membrane, it gets proxied. Manipulating the WeakMap using WeakMap.prototype.{get,set} subsequently allows the membrane to intercept and wrap those operations.

With (private) symbols, we couldn't come up with a satisfactory way to combine them with membranes. Treating the symbol as data and passing it straight through the membrane creates an obvious leak. Treating it as an opaque identity and proxying that identity results in a useless proxied identity on the other side of the membrane. W.r.t. membranes, the key point of using WeakMaps for private state is that the membrane can properly intercept the 'get' and 'set' operations.

# Mark S. Miller (9 years ago)

On Mon, Oct 27, 2014 at 7:00 AM, Andreas Rossberg <rossberg at google.com>

wrote:

In summary, there are two basic use cases for 'relations':

  • permanent (key outlives object)
  • temporary (object outlives key)

It took me a little digging, but I found an old proposal and the -- in retrospect unfortunate -- reasons for its rejection:

At < doku.php?id=strawman:weak_references&rev=1251679346>,

the original EphemeronTable constructor included an optional opt_useKeyLifetime flag.

"The opt_useKeyLifetime and opt_expectedSize parameters have no semantics but provide valuable advice to an implementation. When the underlying platform does not support accurate ephemeron collection, if the opt_useKeyLifetime flag is truthy, that suggests that the lifetime of the value should be more tied to the lifetime of the key. If opt_useKeyLifetime is falsy or absent, then the lifetime of the value should be more tied to the lifetime of the table."

The argument against, which was persuasive at the time, is at: < strawman:allen_wirfs-brock_s_comments_on_ephemeron_table_proposal

:

"2) I would drop the whole opt_useKeyLifetime business [...]. The basic problems with this sort of exposed tweak is the conceptual complexity burden it creates for user. Whether they need to really know about it or not, they will worry about trying to understand it, just because it is there. And unfortunately, in most cases they won’t understand it, but since it is there they will think that it is something important and will guess at the value they should specify or just copy some other usage. Understandability of the EphemeronTable proposal is just a whole lot better if this concept is dropped. [...]"

The two settings of the opt_useKeyLifetime flag correspond exactly to your two use cases. The assumption at the time is that implementations would do a good enough job at this optimization issue without this hint from the user -- which is the conversation we're now having. Had we kept opt_useKeyLifetime, we would not be having this discussion. However, we didn't, so need to find a solution starting from where we are.

On Wed, Oct 22, 2014 at 2:26 PM, Mark Miller <erights at gmail.com> wrote:

[...] When the WeakMap is known to necessarily live longer than its keys,

as for class-private state, then those shadow properties can even be exempted from the [ephemeron] key-mark-check.

How would that generally be "known" up-front, though? Even if you were allowed to dynamically analyse the whole heap and code, you couldn't easily predict relative life times.

Good question. First, our common assumption:

An object layout would need to (expensively) change when a. The object is a key in a WeakMap b. Ephemeron collection happens c. At the moment of ephemeron collection, the object is not deemed garbage but the WeakMap is

The counter-question is:

To avoid this expense for a given WeakMap and its keys, is it adequate to avoid the above from actually occurring, or would an implementation also need to accurately predict that it will not occur. Put another way, if we don't know ahead of time that the above will not occur, how cheap can we make the case when it does not actually occur?

If the answer is "cheap enough" and it is easy enough to make it cheap enough, then my proposal stands.

Otherwise, we need to somehow recover the functionality we lost when we gave up the opt_useKeyLifetime flag. Given where we are, I propose another pair of weak-map & weak-set collections, with exactly the same contract as the current ones, except:

  • The new weak collections would have no .clear method.
  • The hidden fields generated on the keys by the new weak collections would be expected to have their full key's lifetime, whether on not the key outlives the weak collection.

If adding a new pair of weak collections is too expensive/ugly, is there somewhere in our existing WeakMap/WeakSet API where we can find room to restore the opt_useKeyLifetime flag or its equivalent?

but without breaking membrane transparency.

I'm still not sure I understand how this affects membranes specifically.

What Tom said.

# Kevin Smith (9 years ago)

Given where we are, I propose another pair of weak-map & weak-set collections

Is a pair necessary? Do we need or want a WeakSet-ish thing for this use case?

# Mark S. Miller (9 years ago)

Yeah, for trademarking. We can always use a weakmapish mapping to null, but that applies to all the setish collections.

# Claude Pache (9 years ago)

Le 27 oct. 2014 à 17:45, Mark S. Miller <erights at google.com> a écrit :

On Mon, Oct 27, 2014 at 7:00 AM, Andreas Rossberg <rossberg at google.com> wrote:

Otherwise, we need to somehow recover the functionality we lost when we gave up the opt_useKeyLifetime flag. Given where we are, I propose another pair of weak-map & weak-set collections, with exactly the same contract as the current ones, except:

  • The new weak collections would have no .clear method.
  • The hidden fields generated on the keys by the new weak collections would be expected to have their full key's lifetime, whether on not the key outlives the weak collection.

I propose to define something:

  • with a name that corresponds to the purpose; for example "HiddenField" instead of "WeakMap". Or even "PrivateField", which is in fact more restrictive than what you can get with a plain WeakMap (more below);

  • with method names that agree with the transposed representation; maybe: .getFrom(), .setFrom(), .deleteFrom(). Compare:

    weakmap.get(key)

vs

privatefield.getFrom(object)
object::privatefield

Also, we must be careful not to consider WeakMaps as a way to achieve privacy (at least without wrapping the WeakMap). A map that holds weakly its key/value association without making the GC observable, must provide hidenness (you couldn't get a key/value pair if you don't have a prior reference to the key), but not privacy, as you can imagine methods on a weak map, such as .clear() or .fill(), that alter key/value pairs without mandating prior reference to keys and without making the GC observable.

# Domenic Denicola (9 years ago)

Still rooting for the EphemeronTable name. I'm sure it'll make a comeback someday ;)

# Andreas Rossberg (9 years ago)

On 27 October 2014 16:50, Tom Van Cutsem <tomvc.be at gmail.com> wrote:

I believe what Mark was referring to is the fact that if a WeakMap crosses the membrane, it gets proxied. Manipulating the WeakMap using WeakMap.prototype.{get,set} subsequently allows the membrane to intercept and wrap those operations.

Sure, I understand. However, my point was that in the usual private state scenario, the weak map / private symbol itself would never actually cross the membrane. All its uses are encapsulated on one side. So at least ordinary private state via private properties (e.g. as part of a class abstraction) is not actually an issue for membranes. Or am I missing something?

I'm not sure what the scenarios would be where a private symbol itself would be desired to pass a membrane in a first class manner. It seems like in all such scenarios, ordinary weak maps or other means can be used -- while still allowing more efficient private properties in the others.

With (private) symbols, we couldn't come up with a satisfactory way to combine them with membranes. Treating the symbol as data and passing it straight through the membrane creates an obvious leak. Treating it as an opaque identity and proxying that identity results in a useless proxied identity on the other side of the membrane. W.r.t. membranes, the key point of using WeakMaps for private state is that the membrane can properly intercept the 'get' and 'set' operations.

To avoid leaks, private symbols could be treated essentially like internal [[_]] properties. That is, they are simply deemed absent on a proxy, no matter what the target. Get/Has unconditionally return undefined/false, Set/Define unconditionally reject. AFAICS, that does not affect membranes handling objects with private properties correctly, see above.

Clearly, it's not a perfect solution. One question would remain: what to do about passing a private symbol itself through a membrane, in a first-class manner? One idea: the membrane could map private symbols to public symbols, and make sure that all object proxies it creates treat the set of these symbols as if they were private, i.e., filter them in reflection traps (after all, reflection on object properties is the only place where public and private symbols really differ). The only case that would not work is when the private symbols from one side of the membrane are supposed to be used to create private properties on objects from the other side. But is that a real scenario? That can't be dealt with pragmatically by just using weak maps in that case?

# Andreas Rossberg (9 years ago)

On 27 October 2014 17:45, Mark S. Miller <erights at google.com> wrote:

Good question. First, our common assumption:

An object layout would need to (expensively) change when a. The object is a key in a WeakMap b. Ephemeron collection happens c. At the moment of ephemeron collection, the object is not deemed garbage but the WeakMap is

The counter-question is:

To avoid this expense for a given WeakMap and its keys, is it adequate to avoid the above from actually occurring, or would an implementation also need to accurately predict that it will not occur. Put another way, if we don't know ahead of time that the above will not occur, how cheap can we make the case when it does not actually occur?

If the answer is "cheap enough" and it is easy enough to make it cheap enough, then my proposal stands.

I don't see what you could do if you couldn't predict it ahead of time.

But I'm afraid the more fundamental question is: Is it adequate that this case can occur at all? To be honest, I don't think it is. It's not even close to "cheap enough" in terms of implementation complexity budget.

Otherwise, we need to somehow recover the functionality we lost when we gave up the opt_useKeyLifetime flag. Given where we are, I propose another pair of weak-map & weak-set collections, with exactly the same contract as the current ones, except:

  • The new weak collections would have no .clear method.
  • The hidden fields generated on the keys by the new weak collections would be expected to have their full key's lifetime, whether on not the key outlives the weak collection.

If adding a new pair of weak collections is too expensive/ugly, is there somewhere in our existing WeakMap/WeakSet API where we can find room to restore the opt_useKeyLifetime flag or its equivalent?

Although not pretty, I would probably be fine with that, but I still feel like it is inferior to real private properties. Mostly in terms of ergonomics and in terms (misleading) performance expectations it might set (programmers assuming a "map" should not affect the objects themselves).

There are more implementation problems with transposed representations, btw: Proxies currently have no ability for storing properties, but they'd need to have that to store transposed properties. Similarly, it affects all other objects with a specialised representation (e.g. typed objects?). Transposition might also prevent certain optimisations on otherwise immutable objects, because their representation might still be mutated in non-trivial ways.

# Mark S. Miller (9 years ago)

On Wed, Oct 29, 2014 at 8:16 AM, Andreas Rossberg <rossberg at google.com> wrote:

Sure, I understand. However, my point was that in the usual private state scenario, the weak map / private symbol itself would never actually cross the membrane. All its uses are encapsulated on one side. So at least ordinary private state via private properties (e.g. as part of a class abstraction) is not actually an issue for membranes. Or am I missing something?

Yes. We're talking about class-private instance state, not instance-private instance state, so one instance can be asked about another alleged instance.

# Andreas Rossberg (9 years ago)

On 29 October 2014 21:59, Mark S. Miller <erights at google.com> wrote:

Yes. We're talking about class-private instance state, not instance-private instance state, so one instance can be asked about another alleged instance.

Yes, that's what I am talking about as well. Maybe I'm being dense, but again, where does class-private instance state require a private name / weak map to go through a membrane? Any legal instance would always originate from the same side, i.e., would be unproxied when encountered by a method. So accessing the private state works where it should work. Symmetrically, as long as proxies simply signal absence of private fields (without leaking private symbols to any trap), it also works correctly (i.e., fails) to ask a non-instance about private fields.

As I said, this would really be the exact same model as we already have for built-in / host internal state.