WeakMap not the weak needed for zombie views

# Peter Michaux (11 years ago)

I've been reading about WeakMap in the draft. To my surprise, it is not at all what I thought it would be or what I was hoping to use. At least that is my understanding.

My use case is in MV* architectures. With current MV* frameworks, a model holds strong references to the views observing that model. If a view is removed from the DOM, every other reference to that view in the application may be gone, but the view never stopped observing the model object; that lingering strong reference from model to view results in a zombie view. Avoiding this means views need destroy methods that unsubscribe the view from the model. It is easy for the application programmer to forget to call a view's destroy method, and then the application leaks memory. As a result of the leak, the user experience and ultimately the reputation of the Web suffers. If a model could hold weak references to its observers, this would safeguard against accidental, and inevitable, application programmer forgetfulness.
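A minimal sketch of the leak being described (the Model/View shapes are illustrative, not from any particular framework):

```js
// Assumed, minimal MV* shapes -- not from any real framework.
function Model() {
  this.observers = []; // strong references to every subscribed view
}
Model.prototype.subscribe = function (view) {
  this.observers.push(view);
};
Model.prototype.notify = function () {
  this.observers.forEach(function (view) { view.render(); });
};

function View(model, element) {
  this.element = element;
  model.subscribe(this); // the model now strongly references this view
}
View.prototype.render = function () { /* update this.element */ };

var model = new Model();
var view = new View(model, document.createElement('div'));

// The element is gone and our reference dropped, but model.observers
// still pins the view: a zombie that re-renders on every notify() and
// can never be garbage-collected.
view = null;
model.notify();
```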

It appears that WeakMap cannot help solve the current MV* zombie view problem. Or did I miss something?

I was expecting WeakMap to hold its values weakly and set them to undefined or delete the associated key when the value was garbage collected.

Does anything exist, or is anything on the way, to help solve the zombie problem?


Smalltalk Squeak models use a WeakIdentityKeyDictionary, which holds its keys weakly. The difference compared with the ECMAScript WeakMap is that instances of WeakIdentityKeyDictionary have an iterator, so the observers can be stored as the keys and still be discoverable without keeping other strong references. The ECMAScript standard specifically disallows an iterator.
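To illustrate what that would look like, here is a hypothetical sketch; note that none of this works with the actual ECMAScript WeakMap, which deliberately exposes no iteration:

```js
// HYPOTHETICAL: imagine a weak-key map that, unlike the real ES WeakMap,
// could be iterated. Observers stored as keys would stay discoverable
// without the map keeping them alive.
var observers = new WeakMap();

function subscribe(view) {
  observers.set(view, true); // key held weakly (real WeakMap behavior)
}

function notifyAll() {
  // NOT real: the ES WeakMap deliberately exposes no keys()/iteration.
  for (var view of observers.keys()) {
    view.render(); // collected views simply never show up here
  }
}
```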

# Till Schneidereit (11 years ago)

There is an ES7 proposal for weak references that would satisfy your requirements. However, at least at Mozilla there is very strong opposition to this from people working on the memory management subsystems (i.e. the GC and CC). It's not clear to me that their arguments have been defeated and I'm not aware of any more recent discussions about this topic than those on Mozilla's platform development mailing list 2, 3.

While I think that weak references are an important feature, I don't think this particular use case is a good argument for them: in my personal experience working with and implementing systems like the one you describe, weak listeners were eventually deprecated and replaced by forced explicit unsubscription every time. If a view is destroyed, you really don't want it to receive any events anymore, regardless of the GC's timing. Now you could say that the framework's event dispatching or handling mechanism can detect this situation. If so, it can also just unsubscribe a strongly-held event listener at that point.
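A minimal sketch of the explicit-unsubscription pattern described above (the subscribe/dispose names are illustrative, and `model.observers` is assumed to be a plain array of listeners):

```js
// Subscribing returns a disposer; whatever tears the view down calls it.
function subscribe(model, view) {
  model.observers.push(view);
  return function dispose() {
    var i = model.observers.indexOf(view);
    if (i !== -1) model.observers.splice(i, 1);
  };
}

var dispose = subscribe(model, view);
// ...when the framework removes the view from the DOM:
dispose(); // the view stops receiving events immediately, GC or no GC
```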

# Katelyn Gadd (11 years ago)

There are some fairly recent es-discuss threads about weak references. I don't know if there is a consensus yet (it is very hard to tell) but some people like Brendan are on record that there are real use cases that require them. I've been pushing hard for them, along with the author of embind. The question is mostly whether solving those problems is worth the cost of exposing GC to content JS (though, if memory serves, there was a claim in one of the discussion threads that you can implement weakrefs without exposing GC - I'm not sure if that was an 'I've figured it out' statement or just a hypothesis).

At present I don't believe WRs will ever make it onto the open web. It seems like there's a huge amount of resistance to the idea that is never going to go away, so any application that needs them is best served by an emscripten-style heap and a manually implemented collector (Boehm, etc.). You could probably achieve some sort of compromise by writing your own user-space collector that walks the reachable JS heap, if you manage to root your JS objects correctly, but I suspect that would only be viable as a codegen strategy for a compiler targeting JS, not something you'd do by hand when writing JS.

For some WR use cases you can probably do manual refcounting yourself, as long as you do it right - you'd want to replace the actual object references with a lightweight 'handle' object that forwards onto the real instance via a handle lookup, so that you can 'collect' the actual instance without requiring the handles to die.
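A rough sketch of that handle indirection, under the assumption that a side table maps handle ids to real instances:

```js
// Side table of real instances; callers only ever hold handles.
var instances = new Map(); // handle id -> real object
var nextId = 1;

function makeHandle(obj) {
  var id = nextId++;
  instances.set(id, obj);
  return {
    get: function () {
      return instances.get(id) || null; // null once 'collected'
    },
    release: function () {
      instances.delete(id); // manual 'collection' of the real instance
    }
  };
}

var h = makeHandle({ big: new ArrayBuffer(1 << 20) });
h.release();
console.log(h.get()); // null: the handle survives, the instance is gone
```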

# Mark S. Miller (11 years ago)

On Sun, Jul 6, 2014 at 7:47 AM, Katelyn Gadd <kg at luminance.org> wrote:

There are some fairly recent es-discuss threads about weak references. I don't know if there is a consensus yet (it is very hard to tell) but some people like Brendan are on record that there are real use cases that require them. I've been pushing hard for them, along with the author of embind. The question is mostly whether solving those problems is worth the cost of exposing GC to content JS (though, if memory serves, there was a claim in one of the discussion threads that you can implement weakrefs without exposing GC - I'm not sure if that was an 'I've figured it out' statement or just a hypothesis).

I would be very curious. It seems impossible by definition. Could you (or anyone) please try to find this? Thanks.

What I have claimed is that we can isolate the communications channel that this provides in ways that make it a reasonable (IMO) security risk. Perhaps this is what you are thinking of?

# Filip Pizlo (11 years ago)

I've read this exchange and might be missing context. I'm intrigued by it and want to know more.

Is the main opposition to weak references just the security implications of information revealed by GC? Has anyone quantified how much information is leaked, or proved that this information cannot be obtained through already exposed APIs or language features? I presume it has something to do with detecting if anyone else has a reference to an object.

# Russell Leggett (11 years ago)

Sorry to take this on a tangent from the topic of WeakRefs, but the way I've solved the OP's problem in my own code is by tying anything that needs cleanup to element ids. Any time I need to update the HTML, I go through a central method that crawls that part of the DOM and purges it, using the ids as keys in maps of bindings/widgets. It has worked very well for me.
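A sketch of this pattern (the names are illustrative):

```js
// A registry of bindings keyed by element id, and a central update path
// that purges the outgoing subtree before replacing it.
var bindings = new Map(); // element id -> binding/widget with a destroy()

function replaceHtml(container, newHtml) {
  var nodes = container.querySelectorAll('[id]');
  for (var i = 0; i < nodes.length; i++) {
    var binding = bindings.get(nodes[i].id);
    if (binding) {
      binding.destroy(); // unsubscribe from models, clear timers, etc.
      bindings.delete(nodes[i].id);
    }
  }
  container.innerHTML = newHtml; // safe: no zombies left behind
}
```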

# Till Schneidereit (11 years ago)

On Sun, Jul 6, 2014 at 7:45 PM, Filip Pizlo <fpizlo at apple.com> wrote:

Is the main opposition to weak references just the security implications of information revealed by GC? Has anyone quantified how much information is leaked, or proved that this information cannot be obtained through already exposed APIs or language features? I presume it has something to do with detecting if anyone else has a reference to an object.

Security is one concern, but I think that Mark's proposal covers this with the "only collect weakrefs between turns" semantics.

I CC'd a few people who voiced strong opposition on our dev mailing list. Posts containing arguments for their position are:

And an argument for alternative solutions to common weakref use cases:

There's a lot more in that thread, but I think this roughly covers the main arguments against weakrefs.

# Filip Pizlo (11 years ago)

Thanks for gathering those links. I rather like Mark's proposal. Does anyone believe that there are security holes if we do the between-turns semantics?

My reading of the linked Mozilla discussions seems to be that some GC implementors think it's hard to get the feature right and that it pessimises the theoretical performance of some algorithms. JVMs have had this feature for a long time, at least one JS engine (JSC) has had it internally for years, and combining weak refs with all manner of exotic GCs is very well understood in the art.

# Boris Zbarsky (11 years ago)

On 7/6/14, 4:11 PM, Filip Pizlo wrote:

My reading of the linked Mozilla discussions seems to be that some GC implementors think it's hard to get the feature right

I'm not sure how you can possibly read groups.google.com/forum/#!msg/mozilla.dev.tech.js-engine.internals/V__5zqll3zc/hLJiNqd8Xq8J that way. That post isn't even from a GC implementor and says nothing about implementation issues!

I think that post presents the strongest argument I know against the "use GC to reclaim your non-memory resources" approach: while it looks promising at first glance, in practice it leads to resources not being reclaimed when they should be, because the GC is not aiming for whatever sort of resource management those particular resources want.

# Jussi Kalliokoski (11 years ago)

To first address the particular case of using weak maps for custom event listeners via iteration:

I think the only relatively sane approach to iterating a WeakMap would be to force GC whenever the WeakMap is being iterated. This would make sure that you couldn't get references to items that are about to be garbage-collected (and thus wouldn't introduce non-deterministic errors and memory leaks from event listeners firing on disposed views). However, this would make iterating a WeakMap potentially unbearably slow and thus not worth using for this case. The performance hit might be reduced by traversing the reference tree only from the items contained in the WeakMap, but I'm not sure that's feasible, and it would probably make performance worse if the WeakMap is large and holds a lot of live resources. Another drawback is that this would invite abuse: for example, all views could be stored in a WeakMap, and the WeakMap then iterated just to force GC on the views.

The linked discussion thread also proposes using weakrefs for DOM event listeners, but I'm not sure that's a workable solution either. You'd have a weak reference locally, but the DOM event listener would still hold a strong reference to the function. You could of course add a weak addEventListener variant, but soon you'd notice that you also need a weak setTimeout, setInterval, requestAnimationFrame, Object.observe and maybe even weak promises. :/
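A sketch of the problem, using the then-proposed WeakRef API purely for illustration:

```js
// WeakRef was only a proposal when this thread was written; it is used
// here to show why a local weak reference doesn't help.
var button = document.querySelector('button');
var handler = function () { /* closes over some view */ };
var weakHandler = new WeakRef(handler);

button.addEventListener('click', handler); // the DOM holds a STRONG ref
handler = null; // our strong reference is gone...
// ...but the listener (and everything in its closure) stays alive until
// someone calls removeEventListener -- exactly the manual cleanup that
// weakrefs were supposed to make unnecessary.
```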

All in all, I'm doubtful that weak references can solve the use cases presented very well. They would basically encourage people to start building frameworks that use weakrefs instead of lifecycle hooks only to notice that there's some part of the platform where they need manual reference clearing anyway. The solution, I think, is to just use frameworks and libraries like angular and react that provide these lifecycle hooks and take care that these hooks are triggered for you, instead of having to manually call a destroy method.

# Katelyn Gadd (11 years ago)

Jussi, one thing about your (totally correct) statements here is that you're addressing this from the perspective of 'I want to observe GC reliably from user code'. But that's not really what is desired in most cases.

For example, forcing a GC whenever iterating the weakmap would ensure you don't get a reference to an 'effectively dead' object, but nobody is likely to want that in most cases. The point isn't that you don't process objects that are about to die; the point is that weakrefs ensure that the GC can collect these object graphs that otherwise form uncollectable cycles.

Once the GC can collect them, the additional layer you want on top is that collections like a weakmap don't expose dead - already collected - objects to the user. It's fine if iteration yields an object that is about to be collected; in fact, it is probably good if it does. Anyone making use of weak references should, as a cost of entry, expect nondeterminism. The point is specifically that weakrefs ensure the GC can collect your object graphs, and they allow you to respond correctly once a graph is collected.

This is also why the 'use lifecycle hooks and/or manual reference clearing instead' solution isn't an alternative to weakrefs. It solves some use cases that you might otherwise solve with weakrefs, but it does so at the cost of considerable manual effort (and bugs/leaks when manual lifetime management is done incorrectly). For the use cases that can't be reliably solved by manual lifetime management, you still need weakrefs.

Similarly, it's important to realize that while some use cases for weakrefs are about managing native resources or doing other 'automatic cleanup' behaviors, many use cases are simply about ensuring that the GC can free up large graphs of dead objects as soon as memory pressure strikes instead of waiting until the (likely fragile or slow) user-space collector gets around to running and collecting user-space objects. Memory pressure is something the browser and JS VM have knowledge of that userspace doesn't know about - if a graph is effectively dead but can't easily have its references cleaned up automatically (as can happen in complex object layouts, where you would normally use refcounting or some other mechanism), it's possible it could remain 'alive' for a long period of time without weakrefs, eating up valuable heap, moving between GC generations, and slowing GCs.

WRs also enable safe interaction with third-party JS that isn't generally possible otherwise. This occurred to me the other day after I suggested the idea of a user-space JS collector that walks the visible JS heap from roots - you can't walk closures, so any JS object held in a closure would escape the sight of your collector (there are other problems, but this is the most obvious one).

Closures are used heavily in modern JS, and have the ability to retain references to a JS object. It becomes non-trivial to figure out the lifetime of a given closure and know when you need to manually release any resources it relies on, whether a graphics context or a big buffer in an asm.js heap. For a simple use case like a setInterval handler, you can manually clean up when removing the setInterval handler, but what if you have 3 different event listeners that all hold a reference to that resource in their closure? How do you clean those up at the appropriate time? The only vaguely reliable answer here is 'every consumer of my library has to painstakingly increment/decrement reference counts any time they retain a reference to my objects', which is not just tedious but extremely easy to mess up. This is further complicated by the fact that currently V8 and SpiderMonkey closures have the ability to capture references to values that are never actually used within the function, so JS that seems like it shouldn't retain an object actually retains it.
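A sketch of that situation (the names and handlers are illustrative):

```js
// Three listeners each capture `buffer`; its lifetime is the union of
// three listener lifetimes that no single component owns. The draw()
// and resize() helpers are assumed.
var canvas = document.querySelector('canvas');
var buffer = new ArrayBuffer(64 << 20); // e.g. a big asm.js-style heap

canvas.addEventListener('mousedown', function (e) { draw(buffer, e); });
canvas.addEventListener('mousemove', function (e) { draw(buffer, e); });
window.addEventListener('resize', function () { resize(buffer); });

// Freeing `buffer` means removing all three listeners at the right
// moment -- but which owner is responsible, and when?
```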

As before, manual lifetime management - where possible - is king, but there are far too many scenarios where it's either near-impossible or far too difficult to make it your only option. A combination of manual lifetime management + weakrefs for corner cases is the ideal approach here (and is in fact the approach used in some desktop scenarios), in my opinion. If we want to have robust, widespread manual lifetime management, people will either need to adopt non-JS languages that compile to JS (ensuring that all the elaborate lifetime management rules are followed), or JS needs to expose construct(s) to simplify lifetime management (C#-style using, Python-style 'with resource' blocks, C++ scoped RAII). Even then, doing it in user space still requires all the JS running in your application to conform to these rules - once you pull in third-party code, or run user scripts, your lifetime management is vulnerable to leaks if that outsider doesn't carefully follow the rules.
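For illustration, a user-space helper in the spirit of those constructs might look like this; it is a sketch, not a proposal, and openGraphicsContext()/dispose() are assumed names:

```js
// A scoped-lifetime helper in the spirit of C#'s `using` / Python's `with`.
function using(resource, body) {
  try {
    return body(resource);
  } finally {
    resource.dispose(); // released on every exit path, even on throw
  }
}

using(openGraphicsContext(), function (ctx) {
  ctx.drawScene();
}); // ctx has been disposed here -- but only if all code plays along
```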

# Till Schneidereit (11 years ago)

I largely agree with your arguments, but one point is actually more of a counterargument to having weakrefs:

On Mon, Jul 7, 2014 at 10:41 AM, Katelyn Gadd <kg at luminance.org> wrote:

Similarly, it's important to realize that while some use cases for weakrefs are about managing native resources or doing other 'automatic cleanup' behaviors, many use cases are simply about ensuring that the GC can free up large graphs of dead objects as soon as memory pressure strikes instead of waiting until the (likely fragile or slow) user-space collector gets around to running and collecting user-space objects. Memory pressure is something the browser and JS VM have knowledge of that userspace doesn't know about - if a graph is effectively dead but can't easily have its references cleaned up automatically (as can happen in complex object layouts, where you would normally use refcounting or some other mechanism), it's possible it could remain 'alive' for a long period of time without weakrefs, eating up valuable heap, moving between GC generations, and slowing GCs.

While this is true, I think that, as others have argued in the discussion thread I linked to and elsewhere, weakrefs are a bad solution for this. The GC cannot distinguish between different types of resources and their freshness, so they'll just blow away everything they can. In most real-world cases, you'd want to take into account both how frequently and how recently a resource is/was accessed. And, of equal importance, how expensive it is to re-create. You can easily have a large, easily re-created buffer that you'd want to dump at the slightest hint of memory pressure (maybe even to prevent costly GCs from running?), while at the same time you have small-ish objects that are expensive to re-create, so you'd only do so under serious memory pressure.

All in all, I think the platform should expose tiered memory pressure notifications, regardless of whether weakrefs are introduced for other reasons.
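Purely for illustration, a tiered notification might look something like this; the event name and levels are invented for the sketch, as no such API exists:

```js
// INVENTED API: 'memorypressure' and its `level` values are made up
// here to show what "tiered" could mean in practice.
window.addEventListener('memorypressure', function (e) {
  if (e.level === 'moderate') {
    dropCheapCaches();     // large but easily re-created buffers go first
  } else if (e.level === 'critical') {
    dropExpensiveCaches(); // costly-to-rebuild objects only under duress
  }
});
```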

# Katelyn Gadd (11 years ago)

On Mon, Jul 7, 2014 at 2:05 AM, Till Schneidereit <till at tillschneidereit.net> wrote:

While this is true, I think that, as others have argued in the discussion thread I linked to and elsewhere, weakrefs are a bad solution for this. The GC cannot distinguish between different types of resources and their freshness, so they'll just blow away everything they can. In most real-world cases, you'd want to take into account both how frequently and how recently a resource is/was accessed. And, of equal importance, how expensive it is to re-create. You can easily have a large, easily re-created buffer that you'd want to dump at the slightest hint of memory pressure (maybe even to prevent costly GCs from running?), while at the same time you have small-ish objects that are expensive to re-create, so you'd only do so under serious memory pressure.

All in all, I think the platform should expose tiered memory pressure notifications, regardless of whether weakrefs are introduced for other reasons.

Maybe I misunderstand, but you seem to be talking about caching? I'm talking about scenarios where the userspace code can't trivially verify whether an object is dead, and is okay with waiting until the next time the GC collects. Resource freshness and resource type don't matter in this case. The object just needs to be dead. I absolutely agree that weakrefs are not a solution for caching or pooling. My comments are in reference to scenarios where it is non-trivial to identify a point where your object graph is dead so you can go in and break references. In those scenarios you might use something like refcounting in order to ensure that no one component has to be responsible for deciding 'okay, it's dead now', and then you are subject to the types of leaks that occur when using refcounts as your lifetime management strategy (especially since you don't have WRs, which would otherwise mitigate the risk of leaks caused by refcounting+cycles.)
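A minimal sketch of the refcounting failure mode mentioned here: a cycle keeps both counts above zero, so release() never frees anything:

```js
function RefCounted() { this.refs = 1; }
RefCounted.prototype.retain = function () { this.refs++; };
RefCounted.prototype.release = function () {
  if (--this.refs === 0) this.destroy();
};
RefCounted.prototype.destroy = function () { /* free resources */ };

var a = new RefCounted(), b = new RefCounted();
a.other = b; b.retain(); // a -> b
b.other = a; a.retain(); // b -> a: a cycle

a.release(); b.release(); // both counts stay at 1: destroy() never runs
```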

Memory pressure notifications are a neat idea but seem like they expose their own GC visibility and fingerprinting concerns. They would at least provide a good opportunity to trigger your own user-space garbage collections, as long as they can occur during an event loop turn instead of having to wait until the next one. If you can't get a pressure notification while a turn is going (as a result of your allocations, etc), that would hurt pressure notifications' viability as anything other than a way to respond to memory usage changes in other tabs/applications.

# Till Schneidereit (11 years ago)

On Mon, Jul 7, 2014 at 11:21 AM, Katelyn Gadd <kg at luminance.org> wrote:

Maybe I misunderstand, but you seem to be talking about caching? I'm talking about scenarios where the userspace code can't trivially verify whether an object is dead, and is okay with waiting until the next time the GC collects.

Ah, no, you didn't - I misunderstood your argument and did indeed think it was about caching. I'm still hesitant about this particular argument because it seems like your framework would still have issues with delayed cleanup if it relied on GC to do that. I know, however, that in practice it's Hard to ensure that all references in a complex system are properly managed (especially in scenarios involving third-party code as you describe), so I also don't think this can be outright dismissed.

Memory pressure notifications are a neat idea but seem like they expose their own GC visibility and fingerprinting concerns. They would at least provide a good opportunity to trigger your own user-space garbage collections, as long as they can occur during an event loop turn instead of having to wait until the next one. If you can't get a pressure notification while a turn is going (as a result of your allocations, etc), that would hurt pressure notifications' viability as anything other than a way to respond to memory usage changes in other tabs/applications.

That's a good point, and I'm pretty sure that notifications delivered during a turn (job) just won't happen. You're right about the security and privacy concerns, I think. More fundamentally, such notifications would violate run-to-completion semantics, so I don't see how they could even work. I think in-turn collection of weakrefs would be impossible for the same reason, at least if their post-mortem finalizers are also supposed to run in-turn.

# Jussi Kalliokoski (11 years ago)

On Mon, Jul 7, 2014 at 12:44 PM, Till Schneidereit <till at tillschneidereit.net> wrote:

Ah, no, you didn't - I misunderstood your argument and did indeed think it was about caching. I'm still hesitant about this particular argument because it seems like your framework would still have issues with delayed cleanup if it relied on GC to do that. I know, however, that in practice it's Hard to ensure that all references in a complex system are properly managed (especially in scenarios involving third-party code as you describe), so I also don't think this can be outright dismissed.

True. However, I think the non-determinism will not help the situation in a complex system, as it can introduce more leaks. For example, the custom event handler scenario can trigger handlers that would otherwise be dead, and those handlers might cause other things to become active again, so reasoning about this requires an even deeper understanding of the system than manual cleanup does. I find this similar to the null pointer exceptions caused when somebody else cleans up your stuff but forgets to tell you; the way I see it, it's just replacing one class of problems with another.

Still, I also acknowledge that weak references have their place in making systems easier to reason about. WeakMap already solves a lot of the problems caused by not knowing the lifecycle of a (possibly 3rd-party) closure. For example, if the closure holds some state associated with an object provided as an input, it can use the object as the key, and then the GC can just do its job, as the closure holds no strong references to its inputs or outputs. I'm just not very convinced that adding features that make GC observable solves any problems big enough to justify the problems caused.
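A sketch of that WeakMap use (the names are illustrative):

```js
// Per-object state keyed by the object itself: the closure pins nothing.
var stateFor = new WeakMap();

function process(obj) {
  var state = stateFor.get(obj);
  if (!state) {
    state = { calls: 0 };
    stateFor.set(obj, state); // the entry dies with obj, automatically
  }
  state.calls += 1;
  return state.calls;
}
// When the caller drops its last reference to obj, the WeakMap entry
// (and the state) become collectable -- no cleanup protocol needed.
```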

# Allen Wirfs-Brock (11 years ago)

On Jul 6, 2014, at 6:49 PM, Boris Zbarsky wrote:

On 7/6/14, 4:11 PM, Filip Pizlo wrote:

My reading of the linked Mozilla discussions seems to be that some GC implementors think it's hard to get the feature right

I'm not sure how you can possibly read groups.google.com/forum/#!msg/mozilla.dev.tech.js-engine.internals/V__5zqll3zc/hLJiNqd8Xq8J that way. That post isn't even from a GC implementor and says nothing about implementation issues!

Well, I might disagree with part of this characterization ;-) www.wirfs-brock.com/allen/things/smalltalk-things/tektronix-smalltalk-document-archi BTW, George Bosworth, who I mentioned in that message, is the person who originally came up with the Ephemeron idea.

I think that post presents the strongest argument I know against the "use GC to reclaim your non-memory resources" approach: while it looks promising at first glance, in practice it leads to resources not being reclaimed when they should be, because the GC is not aiming for whatever sort of resource management those particular resources want.

My position reflects the experience not only of implementing high-perf GCs that include support for various kinds of weak references, but also of observing and supporting the uses of a full-stack commercial application development environment that exposed them. They just aren't the secret sauce that people expect them to be, and this leads to obscure application-level bugs and memory leaks which often go completely unnoticed.

# Boris Zbarsky (11 years ago)

On 7/9/14, 12:21 PM, Allen Wirfs-Brock wrote:

Well, I might disagree with part of this characterization ;-)

I meant "current JS engine GC implementor", sorry. The important part is that your post is not about implementation difficulties in current JS GC implementations but about something entirely different.