WeakRefs leading to spec-imposed memory leaks?

# David Bruant (12 years ago)

There seems to be a consensus on bringing WeakRefs into the language. I came around to the idea myself, as some use cases seem to require it (as in: some use cases, like distributed acyclic garbage collection, can't be achieved even with a ".dispose" convention). However, I worry.

Recently, some memory leaks were found in Gaia (the FirefoxOS front-end). The investigation led to a subtle condition involving nested function scopes, under which references to some variables are kept alive unnecessarily [1].

One of these days the SpiderMonkey folks will certainly fix this bug, and that will create two worlds: a "before" world where some objects never get collected because of the bug, and an "after" world where some of the same objects do get collected. If all current major JS engines have common limitations ([1] or any other), I worry that code using WeakRefs could implicitly (and mistakenly!) rely on these common limitations and break if one of these limitations is fixed. We know the rest of the story; it involves browser competition and "don't break the web", and this time it would end with the spec requiring some missed opportunities for optimization. Phrased differently, we could end up with memory leaks imposed by the spec...
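To make the worry concrete, here is a minimal sketch of code whose observable behavior could depend on how precisely an engine tracks closure variables. It assumes the `WeakRef` API as it was eventually standardized (ES2021), which postdates this thread; `makeProbe`, `big`, `usesBig`, `escapes` and `bigWasCollected` are hypothetical names invented for the illustration.

```javascript
// Hypothetical illustration: all names are invented for this sketch.
function makeProbe() {
  const big = { payload: new Array(1000).fill(0) };
  const ref = new WeakRef(big);

  // This closure mentions `big` but never escapes makeProbe.
  const usesBig = () => big.payload.length;
  usesBig();

  // This closure escapes but never mentions `big`.
  const escapes = () => "still alive";

  return { ref, escapes };
}

const probe = makeProbe();

// Whether this ever returns true is engine-dependent: an engine that keeps one
// shared environment for both closures may retain `big` for as long as
// `escapes` is reachable, while a more precise engine may collect it.
function bigWasCollected() {
  return probe.ref.deref() === undefined;
}
```

A program that branches on `bigWasCollected()` would behave differently in the "before" and "after" worlds described above, which is the kind of accidental dependency being worried about.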

I understand the necessity of WeakRefs for some use cases, but I worry.

# K. Gadd (12 years ago)

If memory serves, point.davidglasser.net/2013/06/27/surprising-javascript-memory-leak.html was also complaining about a similar closure/scope leak in V8, where locals that you wouldn't expect to be retained are retained by closures in some cases.

Arguably those cases just need to be fixed. Locals that aren't ever reachable from a closure being retained by the closure is definitely non-obvious and it's difficult to even identify or debug these cases with current debugging tools. JSIL has had huge memory leak issues caused by this in a few cases. Of course, I don't know how difficult it actually is to fix this.
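The pattern under discussion can be sketched roughly as follows. This is an illustration in the spirit of the linked post, not a reproduction of it; all names are hypothetical, and whether anything is actually retained depends on the engine.

```javascript
// Hypothetical illustration of a closure/scope leak pattern.
let current = null;

function replaceCurrent() {
  const previous = current;

  // Never called and never escapes, but it mentions `previous`, so an engine
  // that allocates one shared environment per scope may store `previous` there.
  const unused = () => previous && previous.label;

  current = {
    label: "generation",
    payload: new Array(10000).fill("*"),
    // Escapes via `current` and does not mention `previous`. With a shared
    // environment it may still keep `previous` reachable, and through it the
    // whole chain of earlier generations.
    describe() {
      return this.label + ":" + this.payload.length;
    },
  };
}

for (let i = 0; i < 5; i++) replaceCurrent();
```

Nothing in the program can reach the earlier generations through the language itself, which is why this kind of retention is so hard to spot without heap tooling.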

I agree that WeakRefs going in with these sorts of leaks remaining would be a big problem, but it would be worse for developers if these leaks just stuck around and kept WRs from ever getting into the language. Better to put them in and have them serve as a strong motivator for fixing those leaks for good, IMO.

# David Bruant (12 years ago)

On 27/07/2013 18:22, K. Gadd wrote:

Of course, I don't know how difficult it actually is to fix this.

Difficulty is obviously one major concern. If this was easy to fix, I imagine it would have already been done; JS engine maintainers don't keep easy-to-fix leaks for fun. Also, apparently, it was easy in SpiderMonkey to make a tool that does the analysis [1]. A comment by Jeff Walden [2] suggests that the leak may not be fixed in the short term, though. The reason is that the analysis to figure out which variables to keep track of is a bit costly, and doing it upfront would slow down JS runtime performance. Performance is a finite blanket; pulling one way uncovers another part. It takes a massive amount of work to have a slightly wider blanket.

# Brendan Eich (12 years ago)

On the question of optimizing closures to entrain only what's used, I say VMs should get around to it as they feel the competitive need, without worrying about weak refs observing whether it's a supported optimization.

Weak refs are necessary for observer patterns, as we've discussed ad nauseam. I don't think we're going to reject them just because of potential observable closure-optimization interop breaks.

# David Bruant (12 years ago)

On 27/07/2013 20:27, Brendan Eich wrote:

Weak refs are necessary for observer patterns, as we've discussed ad nauseam.

That's not the conclusion I took from these discussions. As I feel words are important, the conclusions I took are: WeakRefs are necessary for distributed acyclic garbage collection. They are only convenient for observer patterns: people can always unsubscribe their listeners or use another equivalent ".dispose() protocol" (assuming the API was designed with that in mind, which is not always the case). While ".dispose() protocols" are manual, explicit and always developer-initiated, WeakRefs are a uniform equivalent mechanism that is most often implicit (it goes with the natural flow of programming, with no need for explicit APIs) and GC-initiated (so half-automated, and automated only in the cases where the GC starts the cascade by breaking references). There is a consensus that the convenience is so important that it is felt as necessary and alone justifies the inclusion of WeakRefs.
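As a minimal sketch of what a manual ".dispose() protocol" for an observer pattern can look like, assuming a hypothetical `Observable` class whose `observe` method returns a dispose function (none of these names come from a real library):

```javascript
// Hypothetical observable with an explicit, developer-initiated dispose step.
class Observable {
  constructor() {
    this.listeners = new Set();
  }

  // Registers a listener and returns the function that unregisters it.
  observe(listener) {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener);
  }

  set(value) {
    for (const listener of this.listeners) listener(value);
  }
}

const observable = new Observable();
const seen = [];
const dispose = observable.observe((v) => seen.push(v));

observable.set(1);
dispose(); // without this call, the observable keeps the listener alive
observable.set(2); // not delivered: the listener was removed
```

The cost is that somebody has to remember to call `dispose()` at the right time; with a weakly held listener the collector would drop the registration instead.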

Do you agree on this conclusion?

Sorry for rehashing the same things over and over, but I feel it's important that people don't start thinking that WeakRefs are a silver bullet that is necessary for memory management (in one-memory-space programs). It's already hard enough to make people stop talking about cycles anytime someone brings up garbage collection! (I'll happily answer questions on this topic off-list to anyone in doubt :-) )

On a related note, I recently had an experience writing an addon and using an API where the listener was held weakly by default [1]. It took me some time to understand why my function (passed as an "inline" function expression) was called only once... (Granted, I should have read the doc more carefully. From the examples I read, I imagined the boolean was like the 3rd addEventListener argument. Another instance of the boolean trap [2], I guess.) Anyway, the JS habit of writing "inline" function expressions as event listeners isn't very compatible with weak listeners.
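The mismatch between inline listeners and weakly held listeners can be sketched like this. `WeakEmitter` and `addWeakListener` are hypothetical names, not the addon API mentioned above, and the sketch assumes the `WeakRef` API as later standardized in ES2021.

```javascript
// Hypothetical emitter that holds its listeners weakly.
class WeakEmitter {
  constructor() {
    this.refs = new Set();
  }

  addWeakListener(fn) {
    this.refs.add(new WeakRef(fn));
  }

  // Calls every listener that is still alive; returns how many were called.
  emit(value) {
    let called = 0;
    for (const ref of this.refs) {
      const fn = ref.deref();
      if (fn === undefined) {
        this.refs.delete(ref); // the listener was collected
        continue;
      }
      fn(value);
      called++;
    }
    return called;
  }
}

const emitter = new WeakEmitter();
const log = [];

// Kept alive by the `onChange` binding, so it keeps being notified.
const onChange = (v) => log.push("named:" + v);
emitter.addWeakListener(onChange);

// Inline function expression: nothing else references it, so after a
// garbage collection it may silently stop being called.
emitter.addWeakListener((v) => log.push("inline:" + v));

emitter.emit(1);
```

When the inline listener disappears is up to the collector, which is what makes this failure mode confusing to debug.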

I don't think we're going to reject them just because of potential observable closure-optimization interop breaks.

This optimization or another one. But fair enough; I just wanted to bring up the "threat" for evaluation.

# Brendan Eich (12 years ago)

David Bruant wrote:

That's not the conclusion I took from these discussions. As I feel words are important, the conclusions I took are: WeakRefs are necessary for distributed acyclic garbage collection. They are only convenient for observer patterns: people can always unsubscribe their listeners or use another equivalent ".dispose() protocol" (assuming the API was designed with that in mind, which is not always the case).

You seemed to concede my point from esdiscuss/2013-February/028572


... Real systems such as COM, XPCOM, Java, and C# support weak references for good reasons. One cannot do "data binding" transparently without either making a leak or requiring manual dispose (or polling hacks), precisely because the lifecycle of the model and view data are not known to one another, and should not be coupled.

See strawman:weak_refs intro, on the observer and publish-subscribe patterns.


and esdiscuss/2013-February/028575


David Bruant wrote:

A view knows its own lifecycle, which involves adding observers in a bunch of places. When the view lifecycle comes to an end for whatever reason, it only makes sense that it removes the observers it added.

The problem is that the notification comes from a model object to the view via the observer. If the view holds the model object strongly, it can entrain the entire model. And if there is an association from model to view somewhere (which is not unreasonable, in a mostly self-hosted system), then....

with confirmation from Rafael Weinstein: esdiscuss/2013-March/028918


This is exactly right.


In the large, there's no single controller who can manually dispose of everything and avoid leaks. This has been rediscovered many times. I don't see where you refuted it, or even how you could via a priori arguments. It's a real problem in multi-maintainer, large-world software networks.

# David Bruant (12 years ago)

On 28/07/2013 01:11, Brendan Eich wrote:

with confirmation from Rafael Weinstein esdiscuss/2013-March/028918


This is exactly right.


Let's quote a bit more from the same message:


Without WeakRefs, observation will require a dispose() step in order to allow garbage collection of observed objects, which is (obviously) very far from ideal.


And this is exactly what I meant by "convenience"; nothing more, nothing less. It's possible to do non-leaky observation without WeakRefs if the APIs of the objects involved are designed with object disposal in mind, but it's just annoying (my interpretation of "very far from ideal").

In the large, there's no single controller who can manually dispose of everything and avoid leaks.

I never said or suggested that such a thing was needed. Where does this idea come from? There is no need for a "master controller" (which doesn't sound very OCap-friendly anyway). Each view and each model has some other object (maybe another view or model, or maybe some other object) that created it and bound it respectively to a model or a view (or several). This creator/binder (which doesn't need to be unique for all objects and most likely isn't) takes care of the lifecycle of the model or view object it's "responsible for" and can unbind when it becomes necessary in the object lifecycle. Or it can dispose of the view/model data-bound pair at once without even needing to unlisten, since they have each other's listener.
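A minimal sketch of this creator/binder idea, with hypothetical `Model`, `View` and `bind` names: whoever creates the link between a model and a view also holds the handle needed to break it.

```javascript
// Hypothetical model/view pair and the binder responsible for their link.
class Model {
  constructor() {
    this.listeners = new Set();
  }
  on(fn) {
    this.listeners.add(fn);
  }
  off(fn) {
    this.listeners.delete(fn);
  }
  update(value) {
    for (const fn of this.listeners) fn(value);
  }
}

class View {
  constructor() {
    this.rendered = [];
  }
  render(value) {
    this.rendered.push(value);
  }
}

// The "creator/binder": it made the link, so it owns the unlink.
function bind(model, view) {
  const listener = (value) => view.render(value);
  model.on(listener);
  return { unbind: () => model.off(listener) };
}

const model = new Model();
const view = new View();
const binding = bind(model, view);

model.update("a");
binding.unbind(); // the view's lifecycle ended; the model lives on
model.update("b"); // no longer reaches the view
```

If the binder instead drops both `model` and `view` at once, no `unbind()` call is needed: the pair only references itself and becomes collectable as a whole.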

This has been rediscovered many times.

Did they try fine-grained lifecycle management? :-) If the first idea for large-scale software is a single master controller, I can imagine why it doesn't work. WeakRefs offer fine-grained distribution of control over who's holding a reference to whom, and that solves the problem of a single master controller being a CPOF. A .dispose convention backed by an API, so that whoever bound two objects unbinds them when one becomes unnecessary, is another form of fine-grained distribution. To be honest, both are pretty much the same thing. The only difference is that in one case the developer triggers the cascade of disposal, and in the other case the GC does it. Granted, GC-triggering is more convenient. It may also be more easily accepted to have the convention at the language level than at the developer level.

I don't see where you refuted it, or even how you could via a priori arguments. It's a real problem in multi-maintainer, large-world software networks.

As I said, WeakRefs are a convenience. I don't mean that negatively. A convenience pays off at large scale.

You seemed to concede my point from esdiscuss/2013-February/028572

and esdiscuss/2013-February/028575

I didn't answer as I didn't find a proper answer at the time. But my answer is above. For each view and model, some other object takes care of the lifecycle. These "some other objects" (plural) are the ones that can decide to unbind when necessary. If one of these "other objects" is responsible for an otherwise independent model/view pair, it can throw them away, without even a need to unbind.

# Brendan Eich (12 years ago)

David Bruant wrote:

Each view and each model has some other object (maybe another view or model or maybe some other object) that created it and bound it respectively to a model or a view (or several). This creator/binder (doesn't need to be unique for all objects and most likely isn't) takes care of the model or view object lifecycle it's "responsible for" and can unbind when it becomes necessary in the object lifecycle. Or can dispose of the view/model data-bound pair at once without even needing to unlisten since they have each other's listener.

We're going in circles. From esdiscuss/2013-February/028575


The problem is that the notification comes from a model object to the view via the observer. If the view holds the model object strongly, it can entrain the entire model. And if there is an association from model to view somewhere (which is not unreasonable, in a mostly self-hosted system), then....


This was the message I was looking for when I first replied -- apologies for not finding it in time. You didn't reply directly.

"Convenience" sounds like some pampering measure that real programmers can do without. I don't think this is a fair assessment. Let's ask Yehuda, because Ember.js would benefit from weakrefs, and its API without them is harder to use and leak-prone.

But let's also assume your words are fair. You still conceded weakrefs are needed for distributed systems per Mark's original strawman. That's indisputable. So how are we going to avoid adding them to ES7 by calling them a convenience for other cases if they're required for the full distributed ocap case?

# David Bruant (12 years ago)

On 28/07/2013 03:15, Brendan Eich wrote:

This was the message I was looking for when I first replied -- apologies for not finding it in time. You didn't reply directly.

I feel I can't reply as I don't have enough information. Who created (and more generally is in charge of the lifecycle of) the model object? the view? the observer? My impression is that the answer can't be found only within the scope of these 3 objects but has to be found at a higher-level (level of whoever created them and/or manipulates them).

In essence, make me a complete list of who created whom and who holds a reference to whom (and there are necessarily other objects than the 3 cited), and I'll tell you who should be doing what when. I intuit (no formal proof) that any combination will have a solution where the memory can be manually disposed. I don't guarantee the solution will be simple to express in code :-) I also don't guarantee it will be as easy as with WeakRefs. But I feel it's possible.

"Convenience" sounds like some pampering measure that real programmers can do without.

I feel you're over-reacting to the word "convenience" (or maybe I've been misusing it?). There are lots of things I consider "conveniences" in ES6, like const, arrow functions or Array.prototype.find. It doesn't mean I don't want them in the language; quite the opposite. But I know people can do without these (and they sure have for a long time).

And don't put "real programmers" in my mouth unless it's to talk about butterflies xkcd.com/378 :-) That really isn't something I would say or even think.

But let's also assume your words are fair. You still conceded weakrefs are needed for distributed systems per Mark's original strawman. That's indisputable. So how are we going to avoid adding them to ES7 by calling them a convenience for other cases if they're required for the full distributed ocap case?

I didn't say I wanted them out (not on this thread and not anymore). I feel there is a mismatch between what I mean by "convenience" and what you understand.

I've gone through different phases about WeakRefs. I initially (at the time of the messages you quote) didn't understand the need. The discussions made me understand that one use case they cover is a language-level "finalizer mechanism". I disagreed with this use case by saying that people could also better design their APIs. Mark Miller then brought up the distributed acyclic GC use case [1] which, at first, was way beyond my understanding. In a later discussion, Mark explained to me the link he had posted on es-discuss, and it became clear that there was no other way than adding WeakRefs to the language [2]. Concurrently, after some maturation on the difference between a manual ".dispose() protocol" and WeakRefs, it has become clear to me why and how they're convenient, and I've come to not be too reluctant about them anymore, even for this use case.

With all that said, as far as data binding or other "finalizer" use cases are concerned, WeakRefs remain a convenience (which to me means: "it's possible to do without"). As with any convenience (and exactly as for class syntax, arrow functions or Number.isNaN), it is important for authors to know what WeakRefs are convenient for, but also what problems they don't solve. I guess it's the same for classes. They are convenient for modeling some sorts of object models, but would be misused in some circumstances.

I feel that describing WeakRefs as necessary (for finalizer use cases) sends the wrong message. It is easily misinterpreted as "we had leaks because we didn't have WeakRefs", while this is never true, I believe (within one memory space). You'll certainly agree that it'd be unfortunate if a word got misinterpreted ;-)

[2] Well... actually there is, but it involves putting a protocol like CapTP in the language, and that would not be desirable. It would probably be a lot of work for TC39 just to agree on the exact protocol. And forcing one particular protocol doesn't sound like a good idea. Imagine if someone comes along with another, better protocol... WeakRefs empower JS devs to experiment with all sorts of protocols. So WeakRefs are necessary at least for that reason.
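As a rough sketch of the kind of experiment this enables, here is the local half of a hypothetical remote-object protocol. It is not CapTP and not taken from the strawman; `ImportTable`, `proxyFor`, `drop` and the `{ type: "drop", id }` message shape are invented for the illustration, and it assumes the `WeakRef` and `FinalizationRegistry` APIs as later standardized in ES2021.

```javascript
// Hypothetical local side of a remote-object protocol. `send` stands in for
// whatever transport a real protocol would use.
class ImportTable {
  constructor(send) {
    this.send = send;
    this.proxies = new Map(); // remote id -> WeakRef(local proxy)
    this.registry = new FinalizationRegistry((id) => this.drop(id));
  }

  // Returns the local proxy for a remote object, creating it on first use.
  proxyFor(id) {
    const existing = this.proxies.get(id)?.deref();
    if (existing !== undefined) return existing;

    const proxy = { remoteId: id };
    this.proxies.set(id, new WeakRef(proxy));
    this.registry.register(proxy, id);
    return proxy;
  }

  // Invoked by the registry after a proxy was collected: tell the remote side
  // it may release the object, unless a live proxy for the same id exists.
  drop(id) {
    const ref = this.proxies.get(id);
    if (ref !== undefined && ref.deref() !== undefined) return;
    this.proxies.delete(id);
    this.send({ type: "drop", id });
  }
}
```

The table holds proxies only weakly, so local code decides their lifetime just by keeping or dropping references, and the "drop" message is the GC-initiated half that a manual protocol would need an explicit call for.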