WeakMap.prototype.clear performance

# David Bruant (13 years ago)

Thanks a lot for these explanations! (Answer below)

Le 22/01/2013 22:46, Jason Orendorff a écrit :

Having said all that, I bet we could hack around the worst-case
GC performance. It'll be a pain, but GC is like that sometimes.
What you said above about the current GC setup that yields
equivalence performance to .clear is interesting. In a nutshell,
moving to a (naive?) generational GC means that you're losing
something you had before. I feel there is a middleground to be found.
What you're losing when you switch to a generational GC is precision. The tradeoff is: you do a lot less GC work, but you collect objects that survive a generation much less aggressively.
What about the following:
WeakMaps are allocated in their own area which is manually GC'ed
with today's algorithm (which is probably implemented for the last
generation?). This way, you'll know as soon as possible (next GC)
if one is dead.
I don't understand. How do you know if one is dead, short of marking the entire heap?

I understand the problem now. An "old" object may hold a reference to a weakmap and you can't know until the entire heap has been marked which happens less often. Specifically, in Mark's pattern, the WeakMapWithClear object outlives the weakmaps it encapsulates and until you've found out that this object is dead, the formerly encapsulated weakmap can't be declared as dead.

So, to find out if a weakmap is dead, it has to come from another source than the mark-and-sweep algorithm (since it losts its precision)... Given the additional prohibitive cost weakmaps seem to have on the GC, maybe things that would otherwise be considered too costly could make sense to be applied specifically to WeakMaps. For instance, would the cost of reference-counting only weakmaps be worth the benefit from knowing early that the weakmap is dead? (I have no idea how much each costs, so it's hard for me to compare the costs) For WeakMapWithClear, reference counting would declare the weakmap dead as soon as the new weakmap is assigned to the private property so that's good. It wouldn't work if some weakmaps are part of a cycle of course... but maybe that it's such an edge case that it's acceptable to ask users doing that to break their weakmaps cycle manually if they don't want the GC not to be too mad at them.

Thanks again for the clarifications!

Thanks a lot for these explanations! (Answer below)

Le 22/01/2013 22:46, Jason Orendorff a écrit :
>
>>     Having said all that, I bet we could hack around the worst-case
>>     GC performance. It'll be a pain, but GC is like that sometimes.
>     What you said above about the current GC setup that yields
>     equivalence performance to .clear is interesting. In a nutshell,
>     moving to a (naive?) generational GC means that you're losing
>     something you had before. I feel there is a middleground to be found.
>
>
> What you're losing when you switch to a generational GC is precision. 
> The tradeoff is: you do a lot less GC work, but you collect objects 
> that survive a generation much less aggressively.
>
>     What about the following:
>     WeakMaps are allocated in their own area which is manually GC'ed
>     with today's algorithm (which is probably implemented for the last
>     generation?). This way, you'll know as soon as possible (next GC)
>     if one is dead.
>
>
> I don't understand. How do you know if one is dead, short of marking 
> the entire heap?
I understand the problem now. An "old" object may hold a reference to a 
weakmap and you can't know until the entire heap has been marked which 
happens less often.
Specifically, in Mark's pattern, the WeakMapWithClear object outlives 
the weakmaps it encapsulates and until you've found out that this object 
is dead, the formerly encapsulated weakmap can't be declared as dead.

So, to find out if a weakmap is dead, it has to come from another source 
than the mark-and-sweep algorithm (since it losts its precision)...
Given the additional prohibitive cost weakmaps seem to have on the GC, 
maybe things that would otherwise be considered too costly could make 
sense to be applied specifically to WeakMaps. For instance, would the 
cost of reference-counting only weakmaps be worth the benefit from 
knowing early that the weakmap is dead? (I have no idea how much each 
costs, so it's hard for me to compare the costs)
For WeakMapWithClear, reference counting would declare the weakmap dead 
as soon as the new weakmap is assigned to the private property so that's 
good. It wouldn't work if some weakmaps are part of a cycle of course... 
but maybe that it's such an edge case that it's acceptable to ask users 
doing that to break their weakmaps cycle manually if they don't want the 
GC not to be too mad at them.

Thanks again for the clarifications!

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20130122/89d21e03/attachment.html>

# Allen Wirfs-Brock (13 years ago)

On Jan 22, 2013, at 2:35 PM, David Bruant wrote:

So, to find out if a weakmap is dead, it has to come from another source than the mark-and-sweep algorithm (since it losts its precision)... Given the additional prohibitive cost weakmaps seem to have on the GC, maybe things that would otherwise be considered too costly could make sense to be applied specifically to WeakMaps. For instance, would the cost of reference-counting only weakmaps be worth the benefit from knowing early that the weakmap is dead? (I have no idea how much each costs, so it's hard for me to compare the costs) For WeakMapWithClear, reference counting would declare the weakmap dead as soon as the new weakmap is assigned to the private property so that's good. It wouldn't work if some weakmaps are part of a cycle of course... but maybe that it's such an edge case that it's acceptable to ask users doing that to break their weakmaps cycle manually if they don't want the GC not to be too mad at them.

You know, as much as Jason and I enjoy talking about garbage collectors, this probably isn't the place to revisit the last 40 years of a highly developed area of specialized CS technology.

We can understand the value of providing a clear method without talking about GC at all. Map and WeakMap are examples of object abstractions that encapsulate a probably complex implementation level data structure. As ES programmers we aren't allowed to know very much about the specifics of those data structures other than that the spec. says they must provide average access times that are sublinear with respect to the number of entries in the collection. The implementation is probably some sort of hash table, but it could be based upon b-trees or some other appropriate data structure. An implementation might even dynamically change data structures and algorithms as elements are added or removed.

Regardless, as ES programmers that is all forbidden knowledge. All we can know and express are things that are exposed through the abstraction's API. An operation like clear, even those it might be possible to achieve the same abstract end result using other operators like delete, allows us to express an intent that tunnels through the abstraction boundary. For a large number of elements, it is quiet likely that the underlying data structure can be cleared or reset to no entries far more efficiently than performing an sequence of isolated deletes for the total number of elements. But if the clear method does it exist we can't express that intent and hence the optimization is impossible.

On Jan 22, 2013, at 2:35 PM, David Bruant wrote:

> 
> So, to find out if a weakmap is dead, it has to come from another source than the mark-and-sweep algorithm (since it losts its precision)...
> Given the additional prohibitive cost weakmaps seem to have on the GC, maybe things that would otherwise be considered too costly could make sense to be applied specifically to WeakMaps. For instance, would the cost of reference-counting only weakmaps be worth the benefit from knowing early that the weakmap is dead? (I have no idea how much each costs, so it's hard for me to compare the costs)
> For WeakMapWithClear, reference counting would declare the weakmap dead as soon as the new weakmap is assigned to the private property so that's good. It wouldn't work if some weakmaps are part of a cycle of course... but maybe that it's such an edge case that it's acceptable to ask users doing that to break their weakmaps cycle manually if they don't want the GC not to be too mad at them.
> 

You know, as much as Jason and I enjoy talking about garbage collectors, this probably isn't the place to revisit the last 40 years of a highly developed area of specialized CS technology.  

We can understand the value of providing a clear method without talking about GC at all.  Map and WeakMap are examples of object abstractions that encapsulate a probably complex implementation level data structure.  As ES programmers we aren't allowed to know very much about the specifics of those data structures other than that the spec. says they must provide average access times that are sublinear with respect to the number of entries in the collection.  The implementation is probably some sort of hash table, but it could be based upon b-trees or some other appropriate data structure.  An implementation might even dynamically change data structures and algorithms as elements are added or removed.

Regardless, as ES programmers that is all forbidden knowledge.  All we  can know and express are things that are exposed through the abstraction's API. An operation like clear, even those it might be possible to achieve the same abstract end result using other operators like delete, allows us to express an intent that tunnels through the abstraction boundary.  For a large number of elements, it is quiet likely that the underlying data structure can be cleared or reset to no entries far more efficiently than  performing an sequence of isolated deletes for the total number of elements.  But if the clear method does it exist we can't express that intent and hence the optimization is impossible.

Allen

# Mark S. Miller (13 years ago)

Should Symbols (whether unique or private) also have a .clear method, in order to delete all properties named by that symbol?

Should Symbols (whether unique or private) also have a .clear method,
in order to delete all properties named by that symbol?

On Tue, Jan 22, 2013 at 3:36 PM, Allen Wirfs-Brock
<allen at wirfs-brock.com> wrote:
>
> On Jan 22, 2013, at 2:35 PM, David Bruant wrote:
>
>>
>> So, to find out if a weakmap is dead, it has to come from another source than the mark-and-sweep algorithm (since it losts its precision)...
>> Given the additional prohibitive cost weakmaps seem to have on the GC, maybe things that would otherwise be considered too costly could make sense to be applied specifically to WeakMaps. For instance, would the cost of reference-counting only weakmaps be worth the benefit from knowing early that the weakmap is dead? (I have no idea how much each costs, so it's hard for me to compare the costs)
>> For WeakMapWithClear, reference counting would declare the weakmap dead as soon as the new weakmap is assigned to the private property so that's good. It wouldn't work if some weakmaps are part of a cycle of course... but maybe that it's such an edge case that it's acceptable to ask users doing that to break their weakmaps cycle manually if they don't want the GC not to be too mad at them.
>>
>
> You know, as much as Jason and I enjoy talking about garbage collectors, this probably isn't the place to revisit the last 40 years of a highly developed area of specialized CS technology.
>
> We can understand the value of providing a clear method without talking about GC at all.  Map and WeakMap are examples of object abstractions that encapsulate a probably complex implementation level data structure.  As ES programmers we aren't allowed to know very much about the specifics of those data structures other than that the spec. says they must provide average access times that are sublinear with respect to the number of entries in the collection.  The implementation is probably some sort of hash table, but it could be based upon b-trees or some other appropriate data structure.  An implementation might even dynamically change data structures and algorithms as elements are added or removed.
>
> Regardless, as ES programmers that is all forbidden knowledge.  All we  can know and express are things that are exposed through the abstraction's API. An operation like clear, even those it might be possible to achieve the same abstract end result using other operators like delete, allows us to express an intent that tunnels through the abstraction boundary.  For a large number of elements, it is quiet likely that the underlying data structure can be cleared or reset to no entries far more efficiently than  performing an sequence of isolated deletes for the total number of elements.  But if the clear method does it exist we can't express that intent and hence the optimization is impossible.
>
> Allen
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss



-- 
    Cheers,
    --MarkM

# Allen Wirfs-Brock (13 years ago)

On Jan 22, 2013, at 4:00 PM, Mark S. Miller wrote:

Should Symbols (whether unique or private) also have a .clear method, in order to delete all properties named by that symbol?

Of course not. A Symbol is basically a simple immutable value disguised as an object. The is no interesting abstraction layer that to communicate across. A more appropriate question might be should there be an Object.deleteAll(obj) function that tries to delete all own properties on an object. As far as I know there is no usage experience that suggests that this is needed In contrast, there is plenty of experience that suggests that "flushing a cache" is a common operations and building caches is certainly a use case for WeakMap.

On Jan 22, 2013, at 4:00 PM, Mark S. Miller wrote:

> Should Symbols (whether unique or private) also have a .clear method,
> in order to delete all properties named by that symbol?

Of course not.  A Symbol is basically a simple immutable value disguised as an object.  The is no interesting abstraction layer that to communicate across. A more appropriate question might be should there be an Object.deleteAll(obj) function that tries to delete all own properties on an object.  As far as I know there is no usage experience that suggests that this is  needed  In contrast, there is plenty of experience that suggests that "flushing a cache" is a common operations and building caches is certainly a use case for WeakMap.

Allen



> 
> On Tue, Jan 22, 2013 at 3:36 PM, Allen Wirfs-Brock
> <allen at wirfs-brock.com> wrote:
>> 
>> On Jan 22, 2013, at 2:35 PM, David Bruant wrote:
>> 
>>> 
>>> So, to find out if a weakmap is dead, it has to come from another source than the mark-and-sweep algorithm (since it losts its precision)...
>>> Given the additional prohibitive cost weakmaps seem to have on the GC, maybe things that would otherwise be considered too costly could make sense to be applied specifically to WeakMaps. For instance, would the cost of reference-counting only weakmaps be worth the benefit from knowing early that the weakmap is dead? (I have no idea how much each costs, so it's hard for me to compare the costs)
>>> For WeakMapWithClear, reference counting would declare the weakmap dead as soon as the new weakmap is assigned to the private property so that's good. It wouldn't work if some weakmaps are part of a cycle of course... but maybe that it's such an edge case that it's acceptable to ask users doing that to break their weakmaps cycle manually if they don't want the GC not to be too mad at them.
>>> 
>> 
>> You know, as much as Jason and I enjoy talking about garbage collectors, this probably isn't the place to revisit the last 40 years of a highly developed area of specialized CS technology.
>> 
>> We can understand the value of providing a clear method without talking about GC at all.  Map and WeakMap are examples of object abstractions that encapsulate a probably complex implementation level data structure.  As ES programmers we aren't allowed to know very much about the specifics of those data structures other than that the spec. says they must provide average access times that are sublinear with respect to the number of entries in the collection.  The implementation is probably some sort of hash table, but it could be based upon b-trees or some other appropriate data structure.  An implementation might even dynamically change data structures and algorithms as elements are added or removed.
>> 
>> Regardless, as ES programmers that is all forbidden knowledge.  All we  can know and express are things that are exposed through the abstraction's API. An operation like clear, even those it might be possible to achieve the same abstract end result using other operators like delete, allows us to express an intent that tunnels through the abstraction boundary.  For a large number of elements, it is quiet likely that the underlying data structure can be cleared or reset to no entries far more efficiently than  performing an sequence of isolated deletes for the total number of elements.  But if the clear method does it exist we can't express that intent and hence the optimization is impossible.
>> 
>> Allen
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss at mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
> 
> 
> 
> -- 
>    Cheers,
>    --MarkM
>

# Herby Vojčík (13 years ago)

Allen Wirfs-Brock wrote:

On Jan 22, 2013, at 4:00 PM, Mark S. Miller wrote:

Should Symbols (whether unique or private) also have a .clear method, in order to delete all properties named by that symbol?

Of course not. A Symbol is basically a simple immutable value disguised as an object. The is no interesting abstraction layer that

But there is a case for this. Very similar to your cache flushing, I admit. Call it unmarking, for example. Symbols (not only private ones) are used to monkey-patch object with some "alternative-layer" of properties, which some side-process uses. This side-process may use it only for some time, afterwards it may want to just clean after itself. This is exactly where I want symbols.clear().

Yes, you can say it used bad tool and should have used WeakMap instead...

Allen Wirfs-Brock wrote:
> On Jan 22, 2013, at 4:00 PM, Mark S. Miller wrote:
>
>> Should Symbols (whether unique or private) also have a .clear
>> method, in order to delete all properties named by that symbol?
>
> Of course not.  A Symbol is basically a simple immutable value
> disguised as an object.  The is no interesting abstraction layer that

But there is a case for this. Very similar to your cache flushing, I 
admit. Call it unmarking, for example. Symbols (not only private ones) 
are used to monkey-patch object with some "alternative-layer" of 
properties, which some side-process uses. This side-process may use it 
only for some time, afterwards it may want to just clean after itself. 
This is exactly where I want `symbols.clear()`.

Yes, you can say it used bad tool and should have used WeakMap instead...

Herby

> to communicate across. A more appropriate question might be should
> there be an Object.deleteAll(obj) function that tries to delete all
> own properties on an object.  As far as I know there is no usage
> experience that suggests that this is  needed  In contrast, there is
> plenty of experience that suggests that "flushing a cache" is a
> common operations and building caches is certainly a use case for
> WeakMap.
>
> Allen
>
>
>
>> On Tue, Jan 22, 2013 at 3:36 PM, Allen Wirfs-Brock
>> <allen at wirfs-brock.com>  wrote:
>>> On Jan 22, 2013, at 2:35 PM, David Bruant wrote:
>>>
>>>> So, to find out if a weakmap is dead, it has to come from
>>>> another source than the mark-and-sweep algorithm (since it
>>>> losts its precision)... Given the additional prohibitive cost
>>>> weakmaps seem to have on the GC, maybe things that would
>>>> otherwise be considered too costly could make sense to be
>>>> applied specifically to WeakMaps. For instance, would the cost
>>>> of reference-counting only weakmaps be worth the benefit from
>>>> knowing early that the weakmap is dead? (I have no idea how
>>>> much each costs, so it's hard for me to compare the costs) For
>>>> WeakMapWithClear, reference counting would declare the weakmap
>>>> dead as soon as the new weakmap is assigned to the private
>>>> property so that's good. It wouldn't work if some weakmaps are
>>>> part of a cycle of course... but maybe that it's such an edge
>>>> case that it's acceptable to ask users doing that to break
>>>> their weakmaps cycle manually if they don't want the GC not to
>>>> be too mad at them.
>>>>
>>> You know, as much as Jason and I enjoy talking about garbage
>>> collectors, this probably isn't the place to revisit the last 40
>>> years of a highly developed area of specialized CS technology.
>>>
>>> We can understand the value of providing a clear method without
>>> talking about GC at all.  Map and WeakMap are examples of object
>>> abstractions that encapsulate a probably complex implementation
>>> level data structure.  As ES programmers we aren't allowed to
>>> know very much about the specifics of those data structures other
>>> than that the spec. says they must provide average access times
>>> that are sublinear with respect to the number of entries in the
>>> collection.  The implementation is probably some sort of hash
>>> table, but it could be based upon b-trees or some other
>>> appropriate data structure.  An implementation might even
>>> dynamically change data structures and algorithms as elements are
>>> added or removed.
>>>
>>> Regardless, as ES programmers that is all forbidden knowledge.
>>> All we  can know and express are things that are exposed through
>>> the abstraction's API. An operation like clear, even those it
>>> might be possible to achieve the same abstract end result using
>>> other operators like delete, allows us to express an intent that
>>> tunnels through the abstraction boundary.  For a large number of
>>> elements, it is quiet likely that the underlying data structure
>>> can be cleared or reset to no entries far more efficiently than
>>> performing an sequence of isolated deletes for the total number
>>> of elements.  But if the clear method does it exist we can't
>>> express that intent and hence the optimization is impossible.
>>>
>>> Allen
>> -- Cheers, --MarkM