Thoughts on WeakMaps

# David Bruant (14 years ago)

I'm currently working on the WeakMap documentation [1] and I have thought of two points:

  1. myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything?

  2. The notion of weak reference as used in current WeakMap seems to be assuming that the garbage collector will work on whether objects are reachable or not. I have read (I thought it was the wikipedia page, but it apparently wasn't) that there is another notion for garbage collection which is whether an object will be used or not in the future. Of course, this notion is far more difficult to determine than reachability, but this is not my point. Let imagine for a minute that a lot of improvements is done in the field of object non-future-use. Will WeakMap be any different than a regular Map? If an engine is able to tell that an object will not be reachable, does it matter if there are remaining (soft or strong) references?

The consequence of this second point is wondering whether it's a good idea to standardize WeakMap (instead of Map) at all.

David

[1] developer.mozilla.org/en/JavaScript/Reference/Global_Objects/WeakMap

# Mike Samuel (14 years ago)

2011/6/6 David Bruant <david.bruant at labri.fr>:

  1. The notion of weak reference as used in current WeakMap seems to be assuming that the garbage collector will work on whether objects are reachable or not. I have read (I thought it was the wikipedia page, but it apparently wasn't) that there is another notion for garbage collection which is whether an object will be used or not in the future. Of course, this

I don't know what you read, but there's one way of thinking about GC is that an ideal garbage collector would collects all objects that will not be used. All real garbage collectors have to conservatively approximate that ideal. This is similar to thinking about LRU and similar cache algorithms as approximating (non conservatively) the next n to be requested.

notion is far more difficult to determine than reachability, but this is not my point.

Let imagine for a minute that a lot of improvements is done in the field of object non-future-use. Will WeakMap be any different than a regular Map? If an engine is able to tell that an object will not be reachable, does it matter if there are remaining (soft or strong) references?

The consequence of this second point is wondering whether it's a good idea to standardize WeakMap (instead of Map) at all.

Besides a lack of out-of-memory errors and performance, a program using an object key map that doesn't use ephemeron pairs shouldn't behave differently than one that does. But developers need to have some idea of memory performance when choosing an appropriate collection. If you're documenting, I would document the behavior around GC upon which devs can rely.

# David Bruant (14 years ago)

Le 06/06/2011 17:41, Mike Samuel a écrit :

2011/6/6 David Bruant <david.bruant at labri.fr>:

The consequence of this second point is wondering whether it's a good idea to standardize WeakMap (instead of Map) at all. Besides a lack of out-of-memory errors and performance, a program using an object key map that doesn't use ephemeron pairs shouldn't behave differently than one that does. But developers need to have some idea of memory performance when choosing an appropriate collection. If you're documenting, I would document the behavior around GC upon which devs can rely.

Not only there is a memory performance difference, but also a key enumerability difference. With WeakMaps, keys cannot be enumerated in a determinist manner while they can with Map.

# Mike Samuel (14 years ago)

2011/6/6 David Bruant <david.bruant at labri.fr>:

Le 06/06/2011 17:41, Mike Samuel a écrit :

2011/6/6 David Bruant <david.bruant at labri.fr>:

The consequence of this second point is wondering whether it's a good idea to standardize WeakMap (instead of Map) at all. Besides a lack of out-of-memory errors and performance, a program using an object key map that doesn't use ephemeron pairs shouldn't behave differently than one that does.  But developers need to have some idea of memory performance when choosing an appropriate collection.  If you're documenting, I would document the behavior around GC upon which devs can rely. Not only there is a memory performance difference, but also a key enumerability difference. With WeakMaps, keys cannot be enumerated in a determinist manner while they can with Map.

Please ignore my last email. I forgot about Map as a distinct type and thought you were proposing renaming WeakMap to Map.

# Mark S. Miller (14 years ago)

On Mon, Jun 6, 2011 at 2:46 PM, David Bruant <david.bruant at labri.fr> wrote:

Le 06/06/2011 17:41, Mike Samuel a écrit :

2011/6/6 David Bruant <david.bruant at labri.fr>:

The consequence of this second point is wondering whether it's a good idea

to standardize WeakMap (instead of Map) at all. Besides a lack of out-of-memory errors and performance, a program using an object key map that doesn't use ephemeron pairs shouldn't behave differently than one that does. But developers need to have some idea of memory performance when choosing an appropriate collection. If you're documenting, I would document the behavior around GC upon which devs can rely. Not only there is a memory performance difference, but also a key enumerability difference. With WeakMaps, keys cannot be enumerated in a determinist manner while they can with Map.

This enumerability is crucial to the difference in how conservative an approximation to "will not be used" needs to be to be safe. If we got rid of WeakMap and only had enumerable Map, the GC would need to determine that a given map would never be enumerated in order to gc it as aggressively. Put another way, resource issues aside, WeakMap's contract is merely the non-enumerable subset of Map's contract. The GC knows that a WeakMap won't be enumerated because it can't be.

# David Bruant (14 years ago)

Le 06/06/2011 18:30, Mark S. Miller a écrit :

On Mon, Jun 6, 2011 at 2:46 PM, David Bruant <david.bruant at labri.fr <mailto:david.bruant at labri.fr>> wrote:

Le 06/06/2011 17:41, Mike Samuel a écrit :
> 2011/6/6 David Bruant <david.bruant at labri.fr
<mailto:david.bruant at labri.fr>>:
>> The consequence of this second point is wondering whether it's
a good idea
>> to standardize WeakMap (instead of Map) at all.
> Besides a lack of out-of-memory errors and performance, a program
> using an object key map that doesn't use ephemeron pairs shouldn't
> behave differently than one that does.  But developers need to have
> some idea of memory performance when choosing an appropriate
> collection.  If you're documenting, I would document the behavior
> around GC upon which devs can rely.
Not only there is a memory performance difference, but also a key
enumerability difference. With WeakMaps, keys cannot be enumerated
in a
determinist manner while they can with Map.

This enumerability is crucial to the difference in how conservative an approximation to "will not be used" needs to be to be safe. If we got rid of WeakMap and only had enumerable Map, the GC would need to determine that a given map would never be enumerated in order to gc it as aggressively. Put another way, resource issues aside, WeakMap's contract is merely the non-enumerable subset of Map's contract. The GC knows that a WeakMap won't be enumerated because it can't be.

So maybe that the harmony:simple_maps_and_sets proposal should do the "todo" and add the iteration (or an enumeration) method. Otherwise, the Map proposal has no more value than the WeakMap proposal ;-)

# Mark S. Miller (14 years ago)

On Tue, Jun 7, 2011 at 6:54 AM, David Bruant <david.bruant at labri.fr> wrote:

Le 06/06/2011 18:30, Mark S. Miller a écrit :

On Mon, Jun 6, 2011 at 2:46 PM, David Bruant <david.bruant at labri.fr>wrote:

Le 06/06/2011 17:41, Mike Samuel a écrit :

2011/6/6 David Bruant <david.bruant at labri.fr>:

The consequence of this second point is wondering whether it's a good idea

to standardize WeakMap (instead of Map) at all. Besides a lack of out-of-memory errors and performance, a program using an object key map that doesn't use ephemeron pairs shouldn't behave differently than one that does. But developers need to have some idea of memory performance when choosing an appropriate collection. If you're documenting, I would document the behavior around GC upon which devs can rely. Not only there is a memory performance difference, but also a key enumerability difference. With WeakMaps, keys cannot be enumerated in a determinist manner while they can with Map.

This enumerability is crucial to the difference in how conservative an approximation to "will not be used" needs to be to be safe. If we got rid of WeakMap and only had enumerable Map, the GC would need to determine that a given map would never be enumerated in order to gc it as aggressively. Put another way, resource issues aside, WeakMap's contract is merely the non-enumerable subset of Map's contract. The GC knows that a WeakMap won't be enumerated because it can't be.

So maybe that the harmony:simple_maps_and_sets proposal should do the "todo" and add the iteration (or an enumeration) method.

That's the plan.

Otherwise, the Map proposal has no more value than the WeakMap proposal ;-)

Correct.

# David Bruant (14 years ago)

Le 06/06/2011 17:31, David Bruant a écrit :

myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything?

Anyone has thoughts on this point?

# Mark S. Miller (14 years ago)

On Tue, Jun 7, 2011 at 7:41 AM, David Bruant <david.bruant at labri.fr> wrote:

Le 06/06/2011 17:31, David Bruant a écrit :

myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything?

Anyone has thoughts on this point?

I prefer the present behavior because

  1. I'll be proposing that WeakMaps have "has" and "delete" methods so they better match the Map proposal and what Mozilla has already implemented. Given this, the four methods seem even more field-like. Setters don't return anything. [[Put]] doesn't return anything. And "foo.bar = baz" returns baz, not the previous value of foo.bar.

  2. The current API makes it easier to reason about when you're using read-rights vs write-rights. A bound set method (weakMap.set.bind(weakMap)) can be handed out to give rights to store into a given weak map without giving out rights to retrieve from it. And vice versa for a bound get method.

  3. If we do this for "set", we'll be tempted to do it for "delete". Let's keep things simple.

# Juan Ignacio Dopazo (14 years ago)

On Tue, Jun 7, 2011 at 11:41 AM, David Bruant <david.bruant at labri.fr> wrote:

Le 06/06/2011 17:31, David Bruant a écrit :

myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything?

Anyone has thoughts on this point?

Set and delete could return the weakmap.

Juan

# Allen Wirfs-Brock (14 years ago)

On Jun 7, 2011, at 8:02 AM, Mark S. Miller wrote:

On Tue, Jun 7, 2011 at 7:41 AM, David Bruant <david.bruant at labri.fr> wrote: Le 06/06/2011 17:31, David Bruant a écrit :

myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything? Anyone has thoughts on this point?

I prefer the present behavior because

  1. I'll be proposing that WeakMaps have "has" and "delete" methods so they better match the Map proposal and what Mozilla has already implemented. Given this, the four methods seem even more field-like. Setters don't return anything. [[Put]] doesn't return anything. And "foo.bar = baz" returns baz, not the previous value of foo.bar.

but weakmap set can't be used in a LHS context, so it won't ever look like a "field" access

  1. The current API makes it easier to reason about when you're using read-rights vs write-rights. A bound set method (weakMap.set.bind(weakMap)) can be handed out to give rights to store into a given weak map without giving out rights to retrieve from it. And vice versa for a bound get method.
function (k,v) {weakMap.set(k,v)}

isn't much longer for creating such a function to hand out. ((k,v)->void weakMap.set(k,v)) is actually shorter

  1. If we do this for "set", we'll be tempted to do it for "delete". Let's keep things simple

In such situations, I would prefer that set return the map. Then set can be "cascaded":

getMyMap().set(key1,value1).set(key2,value2).set(key3,value3);

or if you prefer vertical:

getMyMap() .set(key1,value1) .set(key2,value2) .set(key3,value3);

Of course, if we do this we should apply the same pattern to all similar methods we are adding...

# Brendan Eich (14 years ago)

On Jun 7, 2011, at 7:41 AM, David Bruant wrote:

Le 06/06/2011 17:31, David Bruant a écrit :

myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything? Anyone has thoughts on this point?

I've used that "return the old value" pattern with success before. Mark may have a reason for not using it here, though.

# David Herman (14 years ago)

[A propos of nothing, can I ask that you either change your font or use plain-text email? Your font shows up almost unreadably small in my mail client.]

I'm currently working on the WeakMap documentation [1] and I have thought of two points:

  1. myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything?

I don't have strong feelings about this, but I guess I have a mild preference for the way it's currently specified. I think it's useful as a rule of thumb to separate imperative actions from operations that are performed to compute a result. JS already violates this in a bunch of places, but I don't think consistency is sacrosanct here. OTOH, I don't think this is all that big of a deal.

  1. The notion of weak reference as used in current WeakMap seems to be assuming that the garbage collector will work on whether objects are reachable or not. I have read (I thought it was the wikipedia page, but it apparently wasn't) that there is another notion for garbage collection which is whether an object will be used or not in the future. Of course, this notion is far more difficult to determine than reachability, but this is not my point. Let imagine for a minute that a lot of improvements is done in the field of object non-future-use. Will WeakMap be any different than a regular Map? If an engine is able to tell that an object will not be reachable, does it matter if there are remaining (soft or strong) references?

Correct me if I'm wrong, but I don't see how this would be observable. If your miracle-GC can predict that e.g. the key will never be used again, even though it's actually reachable, then it's able to predict that you're never going to look it up in the table. So even though the spec describes it in terms of reachability, your miracle-GC is not observably violating the behavior of the spec.

The consequence of this second point is wondering whether it's a good idea to standardize WeakMap (instead of Map) at all.

I think this was already said in this thread, but just to be clear: WeakMap comes with different space, performance, and membership behavior than Map, and Map also exposes more operations (namely, enumeration) than WeakMap -- by design. WeakMap allows non-deterministic deletion of elements, so its operations are restricted to avoid leaking this non-determinism to programs. This is important both for portability and security.

IOW, WeakMap and Map are both there to serve different purposes and they carry different guarantees. We've discussed this in committee meetings before, but I want to make sure this is captured in public. We should also add verbiage to the proposals to make this clear.

# Allen Wirfs-Brock (14 years ago)

On Jun 7, 2011, at 9:02 AM, Brendan Eich wrote:

On Jun 7, 2011, at 7:41 AM, David Bruant wrote:

Le 06/06/2011 17:31, David Bruant a écrit :

myWeakMap.set(key, value) doesn't return anything. It could return the previous value for the key (if such a thing exists). Is it intentional that the set function doesn't return anything? Anyone has thoughts on this point?

I've used that "return the old value" pattern with success before. Mark may have a reason for not using it here, though.

The "return the old value" pattern is no doubt more efficient in situations where you need to use the old value after replacing it. I don't know how common that really is. If we wanted to avoid an extra hash lookkup for read-modify-write sequences we might provide a method whose semantics were: Map.prototype.update = function(key,mutator) { //the follow get and set calls may be internally optimized to perform only a single hash lookup let old = this.get(key); let new = mutator(old); this.set(key,new); return new; }

Of course, it's questionable whether by the time you create and pass the mutator you have actually saved the cost of the extra lookup. Also, the implementation would probably have to fall back to doing a second lookup if the mutator did anything that changed the internal structure of the map.

# Mark S. Miller (14 years ago)

On Tue, Jun 7, 2011 at 9:02 AM, Brendan Eich <brendan at mozilla.com> wrote:

I've used that "return the old value" pattern with success before. Mark may have a reason for not using it here, though.

Yup. Such chaining is a dangerous emulation of the Smalltalk cascade that violates "say what you mean". Contrast:

function cascade(a) { a.bar(); a.baz(); a.bax(); }

vs

function chain(a) { a = a.bar(); a = a.baz(); a.bax(); }

These mean different things. The first is asking the same value to bar(), baz(), and bax(). The second is asking baz() of a value determined by the original a, which may not be the original a.

This all aligns well with Dave's

On Tue, Jun 7, 2011 at 9:47 AM, David Herman <dherman at mozilla.com> wrote: [...]

I think it's useful as a rule of thumb to separate imperative actions from operations that are performed to compute a result.

Also

JS already violates this in a bunch of places, but I don't think consistency is sacrosanct here.

There's no where in the standard API of the ES5 built-ins that does this return-self pattern for purposes of chaining. There are of course JS libraries, like jQuery, that make pervasive use of chaining. However, ES-next built-ins should first respect the precedent of the general style of other built-ins, in order to be least surprising.

OTOH, I don't think this is all that big of a deal.

Even from a security perspective, I don't think the leakier choice here is fatal. But it does make it yet harder to avoid leakage accidents. How big a deal this is depends on what you care about.

# Brendan Eich (14 years ago)

On Jun 7, 2011, at 1:59 PM, Mark S. Miller wrote:

Even from a security perspective, I don't think the leakier choice here is fatal. But it does make it yet harder to avoid leakage accidents. How big a deal this is depends on what you care about.

I think you've made a good case for the design as proposed. I'm happy to leave set returning undefined.

# Isaac Schlueter (14 years ago)

On Tue, Jun 7, 2011 at 13:59, Mark S. Miller <erights at google.com> wrote:

There's no where in the standard API of the ES5 built-ins that does this return-self pattern for purposes of chaining. There are of course JS libraries, like jQuery, that make pervasive use of chaining. However, ES-next built-ins should first respect the precedent of the general style of other built-ins, in order to be least surprising.

+1

IMO, the "return the set value" approach seems to make the most sense, since map.set(k, v) is conceptually akin to obj[k] = v. I've always liked that about the HTMLElement.appendChild() function.

They're all easy enough to monkey-patch, so this seems a little bikesheddy to me.