WeakMap, clear, and, inverted implementations

# Allen Wirfs-Brock (11 years ago)

forked from: "typeof extensibility, building on my Value Objects..."

On Aug 2, 2013, at 4:57 PM, Mark S. Miller wrote:

As long as we're on this, I'd like to reiterate that we never achieved consensus on adding .clear() to WeakMap, so its inclusion in the draft spec is a bug. Further, as explained at meetings:relationships.pdf, when we go through the use cases, it becomes clear that an implementation of WeakMap should always keep the associations on the key object, rather than in the WeakMap, so that the typical case collects garbage without needing to resort to the ephemeron collection algorithm. Adding .clear() to WeakMap makes this implementation strategy substantially more expensive.

Leveraging Tab's point, adding .clear() to WeakMap is not pragmatics. It fundamentally changes the nature of the abstraction.

To me, at least, WeakMaps were motivate by two distinct use cases. Soft fields and object-keyed caches. While we have had many discussions about soft-field related use cases, as far as I can tell, we have only had one extended discussion that touches upon the cache use case for WeakMap: esdiscuss/2013-January/028349

A "soft field" is a mechanism for adding additional instance state to an object, after the object has already been allocated. It is an association between one referencing object and one target object. The life-time of the association is the life-time of the referencing object. Arguably, ES properties already provide a soft field mechanism that is adequate for many use cases.

A Map-based cache is a mechanism for dynamically forming semantic associations between an open-end set of key objects and their associated value objects. The life-time of each associations is life-tme of the cache. For some use cases, the value associate with a key object will have a relatively large memory footprint relative to the key object. Using a WeakMap-based cache changes the life-time to each association to the smaller of the life-time of the cache and the life-fime of the association's key object. This may be significant when the cache is long lived, the values have a large footprint, and the key objects, and at least some key objects have lifetimes that are shorter than that of the cache.

Many cache use cases have a need to "flush the cache", in other words to discard all of the associations in a specific cache. This might be for memory resource management purposes (the cache is taking up too much space) or for semantic purposes (the information in the cache is no longer valid). The 'clear' method's primarily use case. It allows the manager of a cache to directly express that the cache contents is now longer needed and that key/value references from the map can be immediately ignored.

It might be argued that an alternative to 'clear' is for the cache manager to carefully maintain only a single reference to the WeakMap that currently hold the associations of the cache and to null-out that reference and/or replace it with a new empty WeakMap whenever the cache needs to be flushed. However, this approach is unfriendly to generational garbage collectors and is likely to be in conflict with the goals of the resource management cache use case. The problem is that with a generational collector, it is fairly common that keys and/or value objects referenced by a cache WeakMap will exist in a younger generation than the WeakMap object. Even if the WeakMap logically becomes garbage this fact won't be known when younger generations are collected. So, the internal key/value references in the older map may still cause key and value objects in younger generations to be retained hence defeating the goal of clearing a cache to free up memory.

.clear() on a map is an operation that can be synthesized (perhaps inefficiently) if .delete() and a key enumeration method is available. We forgot about this when we agree that WeakMaps would have no enumeration protocol (to hide observable GC effects). Rather than saying that adding clear() to WeakMap is a fundamental change of the abstraction, I'd argue that not having key enumeration (and hence a way to do clear) for WeakMaps fundamentally limited the utility of the WeakMap abstraction for one of its primary use cases. Adding clear() restored that utility.

Regarding, "it comes clear that an implementation of WeakMap should always keep associations on the key object...". I don't agree with that assertion at all as it ignore the caching use caseS. Instead, what Mark is describing is a native implementation of "soft fields" as an extension of the ES object model. There is no particular reason to wrap a weak map facade around that soft field extension. To me, it just creates confusion.

# Brendan Eich (11 years ago)

Allen Wirfs-Brock wrote:

Regarding, "it comes clear that an implementation of WeakMap should always keep associations on the key object...". I don't agree with that assertion at all as it ignore the caching use case.

Yup. Indeed one of those caching use-cases is for membranes, where soft-fields do not work.

Mark, I'm thinking of not only Mozilla and TomVC code, but this:

doku.php?id=harmony:proxies&s=proxy#garbage_collection_behavior

# Mark S. Miller (11 years ago)

On Mon, Aug 5, 2013 at 11:45 AM, Brendan Eich <brendan at mozilla.com> wrote:

Allen Wirfs-Brock wrote:

Regarding, "it comes clear that an implementation of WeakMap should always keep associations on the key object...". I don't agree with that assertion at all as it ignore the caching use case.

Yup. Indeed one of those caching use-cases is for membranes, where soft-fields do not work.

WeakMaps mean that after b at r = v (or r.set(b, v), that v is reachable if both b and r are reachable. If the association is stored on b and the collector first notices that b is unreachable, then it can collect v conventionally. If the collector first notices that r is unreachable, then to collect v while b is still live it would need to resort to ephemeron collection. And vice versa. In the absence of WeakMaps, all joins in the reference graph are GC disjunctions. WeakMaps introduce conjunctions. Conjunctions are symmetric, so whereever you put the association, you make one relative lifetime relationship cheap at the cost of making the other one expensive. Which choice then depends on use cases and expectations. That's what the gc analysis in meetings:relationships.pdf is about.

For membranes, the WeakMaps last as long as the membrane in unrevoked. During this time, references to many transient objects pass through the membrane. Storing the associations in the membrane's WeakMaps makes revocation cheap but normal operation expensive. Storing the associations in the key objects makes normal operation cheap but revocation expensive which is what I've wanted every time I've used a membrane.

For use for private fields of classes, the instances cannot outlive the relationship (WeakMap), and so, if we store the relationship in the keys, ephemeron collection will never be needed for that use case.

Let's look at some caching examples. I bet that for the typical WeakMap-based cache, the cache will outlive many of its keys, and so we'll do better by keeping the associations in the key objects.