WeakMap.prototype.clear performance (was: Questioning WeakMap.prototype.clear)
On Tue, Jan 22, 2013 at 1:55 PM, David Bruant <bruant.d at gmail.com> wrote:
For reference, quote from Allen:
generational collectors can have large latencies between the time the last reference to an object is destroyed and when the GC actually notices. Many GC cycles may occur during that period, and if a populated but unneeded large WeakMap is one of these zombie objects, then it can have perf impacts.
Jason:
Maybe even if we just have two generations. (To some extent, long-lived ordinary Maps and Arrays also do this in a generational GC; but WeakMaps have much, much worse worst-case GC performance.)
Indeed, a long-lived object being the only root of a big graph is a problem unrelated to WeakMaps. If that were the only reason to add a clear method on WeakMaps, an Object.clear should be considered too.
Not quite unrelated to WeakMaps, since this issue combines with the O(n^2) WeakMap GC algorithm to cause the performance problem. But otherwise, yes.
I don't understand the point about the worst-case GC performance. It may be related to Allen's point about ephemeron algorithms, which I don't know enough about. I would be interested in knowing more if that's relevant. I'm not entirely sure it's relevant, since the difference between .clear and dropping a WeakMap is about the delta during which the storage is considered collectable.
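For concreteness, a minimal JavaScript sketch of the two patterns being compared (illustrative only; .clear() is the proposed method under debate, and the comments just restate the timing point above rather than anything measured):

    // Pattern A: explicit clear. The entries become droppable immediately,
    // no matter who still holds a reference to the map itself.
    let registryA = new WeakMap();
    // ... populate with many key/value pairs ...
    registryA.clear(); // assumes the proposed WeakMap.prototype.clear exists

    // Pattern B: drop the reference. The entries only become collectable
    // once the GC notices the old map is garbage, which in a generational
    // collector may take many GC cycles (Allen's "zombie" case above).
    let registryB = new WeakMap();
    // ... populate ...
    registryB = new WeakMap(); // old map is garbage, but only eventually collected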
OK, deep dive follows...
Think of our GC as a simple stop-the-world mark-and-sweep algorithm. (It's incremental, to reduce pause times, but that doesn't come into this conversation.)
- During marking, when a WeakMap is marked, the keys and values are not marked. Instead the WeakMap is added to a list of marked (i.e. already-known-reachable) WeakMaps.
- Once ordinary marking is done, we do a linear pass through all marked WeakMaps. For each WeakMap, for each entry, if the key is marked, we mark the value. If this pass does not find any previously-unmarked objects, we're done. But if it did, we must then resume ordinary marking in order to mark everything (transitively) reachable from those values. (Of course this step may mark additional WeakMaps that were not marked before.) Then repeat this pass.
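To make that loop concrete, here is a toy JavaScript model of it. The heap is simulated with plain objects (edges holds strong references, entries holds a WeakMap's key/value pairs); every name is made up for the illustration and nothing here is actual engine code:

    // Toy model of the marking scheme described above.
    // roots: array of objects; allWeakMaps: Set of the objects acting as WeakMaps.
    function mark(roots, allWeakMaps) {
      const marked = new Set();
      const markedWeakMaps = new Set();
      const stack = [...roots];

      // Ordinary marking: trace strong edges. A WeakMap's entries are NOT
      // traced here; the map is only remembered as reachable.
      function drain() {
        while (stack.length > 0) {
          const obj = stack.pop();
          if (marked.has(obj)) continue;
          marked.add(obj);
          if (allWeakMaps.has(obj)) markedWeakMaps.add(obj);
          for (const edge of obj.edges ?? []) stack.push(edge);
        }
      }
      drain();

      // The WeakMap pass, repeated until it finds nothing new.
      let foundNew = true;
      while (foundNew) {
        foundNew = false;
        for (const wm of markedWeakMaps) {
          for (const { key, value } of wm.entries) {
            if (marked.has(key) && !marked.has(value)) {
              stack.push(value);
              foundNew = true;
            }
          }
        }
        drain(); // resume ordinary marking from the newly reachable values
      }
      return marked;
    }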
The maximum number of times the above loop may execute is the total number of WeakMap entries, so it is an O(n^2) algorithm. The worst case occurs when WeakMap values lead to other WeakMaps or WeakMap keys, which lead to still more... I hope it's clear how this worst-case behavior can be triggered when, instead of being cleared, an older WeakMap is dropped for collection while still pointing to existing objects from which a newer WeakMap is reachable. If this happens N times, you could get a chain of N WeakMaps. Chaining is the problem. Note that in a generational GC, you would expect the oldest WeakMaps in the chain to be treated as roots.
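Using the toy mark() sketch above, a structure with that worst-case shape can be built directly; each map's value is the only path to the next map and its key, so every pass over the marked WeakMaps discovers exactly one new link:

    // Build a chain of N "WeakMaps": value_i strongly references map_{i+1}
    // and key_{i+1}, and nothing else does.
    const N = 1000;
    const allWeakMaps = new Set();
    let nextMap = null, nextKey = null;
    for (let i = N - 1; i >= 0; i--) {
      const key = { edges: [] };
      const value = { edges: nextMap ? [nextMap, nextKey] : [] };
      const wm = { entries: [{ key, value }] };
      allWeakMaps.add(wm);
      nextMap = wm;
      nextKey = key;
    }

    // Only the head of the chain is a root; each later map is reachable only
    // through the previous value, so marking needs N WeakMap passes and
    // inspects O(N^2) entries in total.
    const reachable = mark([nextMap, nextKey], allWeakMaps);
    console.log(reachable.size); // 3 * N: every key, value and map survives

Doubling N roughly quadruples the total number of entry inspections performed by mark(), which is the quadratic behavior described above.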
Easiest possible hack: detect this worst-case behavior, and flip a bit so that the system does only full GCs. It essentially stops acting like a generational GC. This is not a great hack, though. GC is still O(N^2); but N is bounded by the number of garbage WeakMaps you can end up with between GC cycles.
I think the algorithm's asymptotic efficiency could be improved at the cost of adding a branch in the marking code (or, more likely—since branches in tight loops are expensive—compiling two versions of all the marking code, one for pre-WeakMap marking and one for post-WeakMap marking). That is a bigger implementation investment.
Having said all that, I bet we could hack around the worst-case GC performance. It'll be a pain, but GC is like that sometimes.
What you said above about the current GC setup yielding performance equivalent to .clear is interesting. In a nutshell, moving to a (naive?) generational GC means that you're losing something you had before. I feel there is a middle ground to be found.
What you're losing when you switch to a generational GC is precision. The tradeoff is: you do a lot less GC work, but you collect objects that survive a generation much less aggressively.
What about the following: WeakMaps are allocated in their own area which is manually GC'ed with today's algorithm (which is probably implemented for the last generation?). This way, you'll know as soon as possible (next GC) if one is dead.
I don't understand. How do you know if one is dead, short of marking the entire heap?
I wouldn't consider what I suggested as a hack around worst-case GC performance, but rather as WeakMap special-casing. Given WeakMaps' special properties regarding memory management, it doesn't sound that crazy to special-case how they're handled in GC algorithms. Maybe the ideas I suggested above in 5 minutes are not perfect, but I feel special-casing WeakMaps is a direction to explore regardless of the debate we're having about .clear, since developers won't necessarily always use it and the GC needs to be fast in those cases too.
As I said earlier, I basically agree with that.
Just to be sure I understand generational GC: old generations are considered as roots when doing most GC traversals, right? That's why it may take time to realize they're actually dead?
Uh… I am going to screw this up, but I think basically if you're collecting generation k, all objects in generations ≤k that are referenced from generations >k are treated as roots.
So, yeah, basically.
On Tue, Jan 22, 2013 at 1:46 PM, Jason Orendorff <jason.orendorff at gmail.com> wrote:
I think the algorithm's asymptotic efficiency could be improved at the cost of adding a branch in the marking code (or, more likely—since branches in tight loops are expensive—compiling two versions of all the marking code, one for pre-WeakMap marking and one for post-WeakMap marking). That is a bigger implementation investment.
Yes. With this extra branch, you can use Felix's algorithm at esdiscuss/2011-March/013241.
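For readers who don't want to chase the link: the general shape of a linear-time approach (my paraphrase, not necessarily the linked algorithm verbatim) is to park entries whose key isn't marked yet in a side table keyed by that key, and have the marking loop check that table (the extra branch) whenever it marks an object. In the same toy model as above:

    // Linear-time variant of the WeakMap handling, same toy heap model as
    // above; names are made up. Each entry is examined once and each object
    // is marked once, so total work is linear in objects + edges + entries.
    function markLinear(roots, allWeakMaps) {
      const marked = new Set();
      const pendingByKey = new Map(); // not-yet-marked key -> values waiting on it
      const stack = [...roots];

      while (stack.length > 0) {
        const obj = stack.pop();
        if (marked.has(obj)) continue;
        marked.add(obj);

        // The extra branch: were any WeakMap values waiting on this object
        // becoming marked?
        const waiting = pendingByKey.get(obj);
        if (waiting) {
          pendingByKey.delete(obj);
          for (const value of waiting) stack.push(value);
        }

        if (allWeakMaps.has(obj)) {
          for (const { key, value } of obj.entries) {
            if (marked.has(key)) {
              stack.push(value);
            } else {
              if (!pendingByKey.has(key)) pendingByKey.set(key, []);
              pendingByKey.get(key).push(value);
            }
          }
        }
        for (const edge of obj.edges ?? []) stack.push(edge);
      }
      // Whatever is left in pendingByKey was keyed by dead objects.
      return marked;
    }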
Uh… I am going to screw this up, but I think basically if you're collecting generation k, all objects in generations ≤k that are referenced from generations >k are treated as roots.
Which way are you numbering your generations? Are generations >k younger or older than k? (I ask because, if your answer is flipped from what I expect, you are correctly describing "remembered sets".)
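As a toy illustration of the remembered-set arrangement (made-up names, not how any particular engine spells it): a write barrier records old-generation objects that store pointers into the nursery, and a minor collection treats those recorded objects as extra roots; that is also why an old, no-longer-referenced WeakMap can keep things alive until a full collection finally looks at it.

    const nursery = new Set();       // young objects
    const tenured = new Set();       // old objects
    const rememberedSet = new Set(); // old objects known to point at young ones

    // Run on every pointer store.
    function writeBarrier(holder, target) {
      holder.edges.push(target);
      if (tenured.has(holder) && nursery.has(target)) rememberedSet.add(holder);
    }

    // A minor GC traces only the nursery; old objects are never examined,
    // so anything they (appear to) reference must be assumed live.
    function minorGC(roots) {
      const live = new Set();
      const work = [...roots];
      for (const old of rememberedSet) work.push(...old.edges);
      while (work.length > 0) {
        const obj = work.pop();
        if (!nursery.has(obj) || live.has(obj)) continue;
        live.add(obj);
        work.push(...obj.edges);
      }
      return live; // nursery objects not in here would be collected
    }

    // Even if oldMap itself is already garbage, a minor GC cannot tell:
    const oldMap = { edges: [] };
    tenured.add(oldMap);
    const youngValue = { edges: [] };
    nursery.add(youngValue);
    writeBarrier(oldMap, youngValue);
    console.log(minorGC([]).has(youngValue)); // true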