Argument in favor of adding "has" in the WeakMap API

# David Bruant (14 years ago)

I've recently been thinking about the case of memoizing. Let's assume that a function f takes an object o as a parameter and that f is a pure function (results only depends on argument).

function f(o){ /* perform a computation, return the result */ }

A memoizer could be written to improve f time performance (by the cost of memory, of course).

function memoizer(f){ var cache = WeakMap(); return function(o){ var res; if(WeakMap.has(o)) return cache.get(o); res = f.apply(this, o);

cache.set(o, res);

}; }

But as you can see, this implementation requires a .has method. harmony:weak_maps shows how to implement has (and delete) on top of "get" and "set", but this comes to the cost of an object per WeakMap (well, actually, the same mascot could be factorized out and used for all ExplicitSoftOwnField) and the cost of being a non-native implementation.

As an argument to add delete to the API, I'd say that it will be an encouragement for people to remove keys from the weak map. In the design of WeakMap is written: "for each mapping k ? v in a reachable weak map m, if k is reachable then v is reachable". If people see that the API is "get+set", they won't think of memory at all, while if they see a "delete" method in the API, they are more likely to wonder why it's here and understand that they should use it to free memory up. This is, of course, nothing else than a guess.

# felix (14 years ago)

On Wed, May 11, 2011 at 11:16, David Bruant <david.bruant at labri.fr> wrote:

A memoizer could be written to improve f time performance (by the cost of memory, of course).

function memoizer(f){   var cache = WeakMap();   return function(o){     var res;     if(WeakMap.has(o))       return cache.get(o);     res = f.apply(this, o);

cache.set(o, res);   }; }

I'm not sure I understand your code.

it's not clear to me why you need .has at all, since you can check the return value of .get instead. in the case where 'undefined' is a valid value that you would want to memoize, you can store a unique object to represent 'undefined'.

also, memoization is usually caching on argument values rather than argument identity, but weakmap lookups are by object identity not object value. You can't use non-object keys in a weekmap, so in order to gain the benefits of the memoization above, a caller would have to be careful to avoid constructing new objects with the same values, and somehow keep re-using the same argument objects, which is pretty inconvenient for typical uses of memoization.

# Mark S. Miller (14 years ago)

On Wed, May 11, 2011 at 11:16 AM, David Bruant <david.bruant at labri.fr>wrote:

Hi,

I've recently been thinking about the case of memoizing. Let's assume that a function f takes an object o as a parameter and that f is a pure function (results only depends on argument).

function f(o){ /* perform a computation, return the result */ }

A memoizer could be written to improve f time performance (by the cost of memory, of course).

function memoizer(f){ var cache = WeakMap(); return function(o){ var res; if(WeakMap.has(o)) return cache.get(o); res = f.apply(this, o);

cache.set(o, res);

}; }

But as you can see, this implementation requires a .has method. harmony:weak_maps shows how to implement has (and delete) on top of "get" and "set", but this comes to the cost of an object per WeakMap (well, actually, the same mascot could be factorized out and used for all ExplicitSoftOwnField) and the cost of being a non-native implementation.

As an argument to add delete to the API, I'd say that it will be an encouragement for people to remove keys from the weak map. In the design of WeakMap is written: "for each mapping k => v in a reachable weak map m, if k is reachable then v is reachable". If people see that the API is "get+set", they won't think of memory at all, while if they see a "delete" method in the API, they are more likely to wonder why it's here and understand that they should use it to free memory up. This is, of course, nothing else than a guess.

After offline conversations with Andreas Gal and Dave Herman, we've changed the WeakMaps API we'd like to propose for Harmony to match approximately what Andreas has already implemented for FF6.0a1 (Nightly). This moves the methods up to WeakMap.prototype approximately as Arv suggested in his comments on the current proposal page. It hangs the WeakMap state off of internal [[???]] properties of WeakMap instances which only these methods can access. And it adds "has" and "delete" methods, making the base abstraction more like the pattern explained at < harmony:weak_maps#explicit_soft_own_fields>.

Since WeakMaps are already accepted for ES-next, and since the upcoming May meeting is already overloaded with strawmen under consideration, this discussion about WeakMap API changes can wait till the July mtg. Of course, we can begin discussion on the list now.

More later.

# David Bruant (14 years ago)

Le 11/05/2011 20:38, felix a écrit :

On Wed, May 11, 2011 at 11:16, David Bruant <david.bruant at labri.fr> wrote:

A memoizer could be written to improve f time performance (by the cost of memory, of course).

function memoizer(f){ var cache = WeakMap(); return function(o){ var res; if(WeakMap.has(o)) return cache.get(o); res = f.apply(this, o);

cache.set(o, res);

}; }

I'm not sure I understand your code.

it's not clear to me why you need .has at all, since you can check the return value of .get instead. in the case where 'undefined' is a valid value that you would want to memoize, you can store a unique object to represent 'undefined'.

The ambiguity of "get" returning "undefined" when the key is actually undefined or when it is defined but its value is "undefined" already happened with objects. The solution to that was the "in" operator (or Object.prototype.hasOwnProperty depending on cases). I'm just asking an equivalent for weak maps. I agree its not necessary as proved in harmony:weak_maps#explicit_soft_own_fields . However, a native implementation would be a good addition.

also, memoization is usually caching on argument values rather than argument identity,

I consider that the most important word in your sentence is "usually". Just to clarify, the argument is either a primitive value (es5.github.com/#x4.3.2) or an object. Memoization is usually caching on primitive value. Caching on objects could have been an idea, but it would have created a (potentially big) memory leak and lookup would have been linear. Now, with WeakMap, memoizing on objects doesn't leak memory (the gc comes to clean the weak map for you!) and lookup is handled by the engine which means it is at worst O(log(n)) and certainly constant and small in practice.

but weakmap lookups are by object identity not object value. You can't use non-object keys in a weekmap, so in order to gain the benefits of the memoization above, a caller would have to be careful to avoid constructing new objects with the same values, and somehow keep re-using the same argument objects, which is pretty inconvenient for typical uses of memoization.

So you say. I'd like a proof of that statement. Anyway, I am not saying that object-based memoization will solve all problems. But if people have a use case for it, why not. Once again, above you said "usually"; we are so used to not have these kinds of things (WeakMaps, maybe other features) that we've found workaround over the years. It may be time to question these choices in our library/applications and see if weak maps aren't a good fit now that we have them (or are about to).

Regardless, if people want to use object-identity-based memoization and it forces them to think of their object instanciations more carefully than creating new objects all the time, then I see this as a benefit, not a problem.

"Reusing when possible instead of creating new things to serve the same purpose". Sounds like I want to save the planet from consumerism :-)

# Oliver Hunt (14 years ago)

On May 11, 2011, at 12:38 PM, David Bruant wrote:

Le 11/05/2011 20:38, felix a écrit :

On Wed, May 11, 2011 at 11:16, David Bruant <david.bruant at labri.fr> wrote:

A memoizer could be written to improve f time performance (by the cost of memory, of course).

function memoizer(f){ var cache = WeakMap(); return function(o){ var res; if(WeakMap.has(o)) return cache.get(o); res = f.apply(this, o);

cache.set(o, res); }; }

I'm not sure I understand your code.

it's not clear to me why you need .has at all, since you can check the return value of .get instead. in the case where 'undefined' is a valid value that you would want to memoize, you can store a unique object to represent 'undefined'. The ambiguity of "get" returning "undefined" when the key is actually undefined or when it is defined but its value is "undefined" already happened with objects. The solution to that was the "in" operator (or Object.prototype.hasOwnProperty depending on cases). I'm just asking an equivalent for weak maps. I agree its not necessary as proved in harmony:weak_maps#explicit_soft_own_fields . However, a native implementation would be a good addition.

So you want to do if (map.has(bar)) wiffle = map.get(bar)

or some such?

The problem here is that you can't guarantee that GC won't happen between those two calls, and therefore you could still end up getting undefined in response to the get.

GC is (from the language's perspective) non-deterministic, so any non-atomic operation with a weak map leads to "non-sensical" behaviour.

# Brendan Eich (14 years ago)

On May 11, 2011, at 12:44 PM, Oliver Hunt wrote:

So you want to do if (map.has(bar)) wiffle = map.get(bar)

or some such?

The problem here is that you can't guarantee that GC won't happen between those two calls, and therefore you could still end up getting undefined in response to the get.

Not if bar refers to the key object, since it must be a strong ref.

If you do

if (map.has(bar)) { bar = null; // might GC here, or do stuff that might wiffle = map.get(bar); . . . }

then you get what you deserve.

GC is (from the language's perspective) non-deterministic, so any non-atomic operation with a weak map leads to "non-sensical" behaviour.

True, but this does not apply to

if (map.has(bar)) wiffle = map.get(bar);

as written.

# Oliver Hunt (14 years ago)

On May 11, 2011, at 12:54 PM, Brendan Eich wrote:

On May 11, 2011, at 12:44 PM, Oliver Hunt wrote:

So you want to do if (map.has(bar)) wiffle = map.get(bar)

or some such?

The problem here is that you can't guarantee that GC won't happen between those two calls, and therefore you could still end up getting undefined in response to the get.

Not if bar refers to the key object, since it must be a strong ref.

Re-reading the weakmap definition makes weakmaps seem fundamentally different to any other weakmap API i've ever encountered. The most common weakmap use cases i am aware of are all essentially caches, and this definition doesn't support that use case at all.

It seems completely bizarre to me that the key should act as a root for the value it is associated with.

# David Bruant (14 years ago)

Le 11/05/2011 21:54, Brendan Eich a écrit :

On May 11, 2011, at 12:44 PM, Oliver Hunt wrote:

So you want to do if (map.has(bar)) wiffle = map.get(bar)

or some such?

The problem here is that you can't guarantee that GC won't happen between those two calls, and therefore you could still end up getting undefined in response to the get. Not if bar refers to the key object, since it must be a strong ref.

If you do

if (map.has(bar)) { bar = null; // might GC here, or do stuff that might wiffle = map.get(bar); . . . }

then you get what you deserve.

GC is (from the language's perspective) non-deterministic, so any non-atomic operation with a weak map leads to "non-sensical" behaviour. True, but this does not apply to

if (map.has(bar)) wiffle = map.get(bar);

as written.

Exactly. Said in another way, if you are able to call map.has(bar), then, you hold a reference to bar, so the garbage collector cannot consider collecting the object which bar references to (by definition of "holding a reference").

As a side note, I didn't receive initial Olivier Hunt's e-mail. It appears on es-discuss archive, but I didn't receive it. I am under the impression that it's not the first time this happens to me. Am I the only one in that case?

# Brendan Eich (14 years ago)

On May 11, 2011, at 1:04 PM, Oliver Hunt wrote:

On May 11, 2011, at 12:54 PM, Brendan Eich wrote:

On May 11, 2011, at 12:44 PM, Oliver Hunt wrote:

So you want to do if (map.has(bar)) wiffle = map.get(bar)

or some such?

The problem here is that you can't guarantee that GC won't happen between those two calls, and therefore you could still end up getting undefined in response to the get.

Not if bar refers to the key object, since it must be a strong ref.

Re-reading the weakmap definition makes weakmaps seem fundamentally different to any other weakmap API i've ever encountered. The most common weakmap use cases i am aware of are all essentially caches, and this definition doesn't support that use case at all.

WeakMap is useful for caching. You just have to kill your strong refs if you don't want to hit the cache.

It seems completely bizarre to me that the key should act as a root for the value it is associated with.

You must be assuming key is a "weak pointer" that magically gets nulled when the GC runs and finds only the variable named key and the map entry referencing (weakly) the key object. But there is no such notion, because we do not want to leak the non-determistic GC schedule by allowing users to poll for null.

# David Bruant (14 years ago)

False alarm. I have finally (just) received the e-mail.

David

Le 11/05/2011 21:44, Oliver Hunt a écrit :

# Mike Samuel (14 years ago)

2011/5/11 David Bruant <david.bruant at labri.fr>:

Hi,

I've recently been thinking about the case of memoizing. Let's assume that a function f takes an object o as a parameter and that f is a pure function (results only depends on argument).

function f(o){ /* perform a computation, return the result */ }

A memoizer could be written to improve f time performance (by the cost of memory, of course).

function memoizer(f){   var cache = WeakMap();   return function(o){     var res;     if(WeakMap.has(o))       return cache.get(o);     res = f.apply(this, o);

cache.set(o, res);   }; }

Does this do what you were trying to accomplish with has, but without requiring has?

function memoizer(f){   var cache = WeakMap(); // Used in cache to distinguish undefined value but known from // undefined. Does not escape, so cannot be returned by f. var undefSubstitute = {};   return function(o){     var res = cache.get(o); if (res === void 0) { res = f.apply(this, o); cache.set(o, res === void 0 ? undefSubstitute : res); } else if (res === undefSubstitute) { res = void 0; } return res;   }; }

# David Bruant (14 years ago)

Le 13/05/2011 01:19, Mike Samuel a écrit :

2011/5/11 David Bruant <david.bruant at labri.fr>:

Hi,

I've recently been thinking about the case of memoizing. Let's assume that a function f takes an object o as a parameter and that f is a pure function (results only depends on argument).

function f(o){ /* perform a computation, return the result */ }

A memoizer could be written to improve f time performance (by the cost of memory, of course).

function memoizer(f){ var cache = WeakMap(); return function(o){ var res; if(WeakMap.has(o)) return cache.get(o); res = f.apply(this, o);

cache.set(o, res);

}; } Does this do what you were trying to accomplish with has, but without requiring has?

Yes, I'm sorry, I shouldn't have used the word "require". Your example does work. It is some derivation of what can be found at doku.php?id=harmony:weak_maps&s=weakmap#explicit_soft_own_fields I get that neither "has" nor "delete" are required, as proves the above link and your example. Still, I'm confident that the engine would be a better place to handle it and my point was that there are some use cases that may do "heavy" usage of a ".has" method on weak maps which in my opinion is an argument in favor of adding "has" in the WeakMap API. It turns out, it seems to already be the plan as wrote Mark Miller in a previous e-mail. The wiki just cannot be updated yet, before consensus is reached during a TC-39 meeting.

# Mark S. Miller (14 years ago)

On Thu, May 12, 2011 at 4:29 PM, David Bruant <david.bruant at labri.fr> wrote:

Le 13/05/2011 01:19, Mike Samuel a écrit :

2011/5/11 David Bruant <david.bruant at labri.fr>:

Hi,

I've recently been thinking about the case of memoizing. Let's assume that a

function f takes an object o as a parameter and that f is a pure function

(results only depends on argument).

function f(o){ /* perform a computation, return the result */ }

A memoizer could be written to improve f time performance (by the cost of

memory, of course).

function memoizer(f){ var cache = WeakMap(); return function(o){ var res; if(WeakMap.has(o)) return cache.get(o); res = f.apply(this, o);

cache.set(o, res);

}; } Does this do what you were trying to accomplish with has, but without requiring has? Yes, I'm sorry, I shouldn't have used the word "require". Your example does work. It is some derivation of what can be found at

doku.php?id=harmony:weak_maps&s=weakmap#explicit_soft_own_fields I get that neither "has" nor "delete" are required, as proves the above link and your example. Still, I'm confident that the engine would be a better place to handle it and my point was that there are some use cases that may do "heavy" usage of a ".has" method on weak maps which in my opinion is an argument in favor of adding "has" in the WeakMap API. It turns out, it seems to already be the plan as wrote Mark Miller in a previous e-mail. The wiki just cannot be updated yet, before consensus is reached during a TC-39 meeting.

I am planning to update the wiki to reflect what I expect to propose in July[1], which includes "has", but I won't get to it till well after the May meeting. When I do get to it, I will revisit this thread, so don't wait on my updates to discuss this here.

[1] Which is approx what Andreas Gal already implemented and is shipping in FF6.0a1 Nightly.