Mark Miller (2014-10-22T21:26:43.000Z)
d at domenic.me (2014-11-18T22:39:21.164Z)
On Wed, Oct 22, 2014 at 1:44 PM, Steve Fink <sphink at gmail.com> wrote: > Maybe it's because I work on a garbage collector, but I always think of > the primary cost of WeakMaps as being the GC. The above analysis doesn't > take GC into account. I should have been more explicit, but GC costs are almost my entire point. These costs aside, my FastWeakMaps are more expensive in all ways than SlowWeakMaps, though only by a constant factor, since each FastWeakMap operation must also perform the corresponding SlowWeakMap operation. > In the straightforward iterative implementation, you record all of the > live WeakMaps found while scanning through the heap. Then you go through > them, checking each key to see if it is live. For each such key, you > recursively mark the value. This marking can discover new live WeakMaps, > so you iterate to a fixed point. That is when you find yourself doing an ephemeron collection. The point of the transposed representation is to collect most ephemeron garbage using conventional collection. Consider ```js var fastField = new FastWeakMap(); var slowField = new SlowWeakMap(); var transientKey = {}; var fastValue = {}; var slowValue = {}; fastField.set(key, fastValue); slowField.set(key, slowValue); transientKey = void 0; fastValue = void 0; slowValue = void 0; ``` At this assume that the old value of transientKey is really garbage, but that fastField and slowField are still reachable. Clearly, both the old value of fastValue and the old value of slowValue are now garbage and may be collected. Let's see what work the collector need to do to collect these. First comes the conventional mark phase: fastField and slowField both get marked. slowField is itself a non-transposed weakmap, so when we mark it we also put it on the queue of weakmaps to be ephemeron collected. fastField internally is not a weakmap, nor does it point at one. It's only per-instance state is the token, so this gets marked as well. Now comes the ephemeron mark phase. The only non-transposed weakmap to be considered is slowField. The non-transposed weakmap serving as transientKey's shadow never got marked because transientKey never got marked. fastValue is only reachable from transientKey's shadow, so it also never gets marked. Here's the key important thing: In a generational collector, at this point we'd typically postpone ephemeron collection. To do so, we would complete the mark phase conventionally, by simply marking the values held by slowField. This marks slowValue, causing it to get promoted to the next older generation. THIS IS EXPENSIVE. Finally, when we can no longer postpone ephemeron collection, when ephemeron-marking slowField, we'd notice that transientKey didn't get marked, so we wouldn't mark slowValue. The counterpoint is the shadow weakmaps, when engaged in ephemeron collection must still check whether their keys -- to tokens within the FastWeakMaps -- are still reachable. Typically they will be. This has two counter-counterpoints: * For all practical use patterns we've identified, the WeakMap either lives longer than its keys, or their lifespan is comparable. When the WeakMap is known to necessarily live longer than its keys, as for class-private state, then those shadow properties can even be exempted from the key-mark-check. * Prior to emphemeron collection, the shadow weakmaps (and the values they carry) get promoted to older generations only along with the object they shadow. > In the current web, this implementation seems to work fine. The worst > case is O(n^2) in the size of the heap, which is pretty much fatal if > you ever hit it. But that requires lots of paths through multiple > WeakMaps, and in practice, it seems WeakMaps aren't being used much. > I've never seen our WeakMap marking phase show up as a significant cost. Chicken and egg. If WeakMaps are used for private state (and trademarks and...), they will be used a lot. But they will only be used for those things if it isn't fatally slow to do so. > For an algorithmically more robust solution, you could add a check > whenever marking an object. The check would test whether the object is > (or might be) used as a WeakMap key. This would slow down marking all > objects, so in practice you want to be clever about avoiding the test. Yeah, I'm very curious about whether this can be made cheap enough that implementations would be willing to do it. If so, then everything is much better, whether we transpose the representation or not. > Anyway, my point is that WeakMaps have serious GC ramifications, > possibly extending to non-key objects, and any performance impact > analysis of using WeakMaps more extensively is incomplete without > considering GC costs. Exactly! I should have been clearer that these were the only costs I am concerned about here. Regarding all other costs, my example code only adds expense. > Perhaps. But multiple WeakMaps introduce the potential for many more > cycles than a single WeakMap. So I think a final conclusion is premature. I think this is a good point that I missed in my analysis. We should indeed think hard about this. Given that, in the end I must agree that a conclusion is premature. Please let's proceed to build some experiments!