Garbage collection in generators
On 17 February 2016 at 09:08, Benjamin Gruenbaum <benjamingr at gmail.com>
wrote:
In the following example:
function* foo() {
  try {
    yield 1;
  } finally {
    cleanup();
  }
}
(function() {
  var f = foo();
  f.next();
  // never reference f again
})()
- Is the iterator created by the function foo ever eligible for garbage collection?
The spec does not talk about GC, but in typical implementations you should expect yes.
- If it is - does it run the finally blocks?
No, definitely not. Try-finally has nothing to do with GC, it's just control flow. If you starve a generator it's not going to get completed, just like other control flow won't. (Which is why some of us think that iterator return is a misfeature, because it pretends to provide a guarantee that does not exist, and only encourages bogus forms of resource management.)
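As a minimal sketch of the point about control flow (the `log` array and the tagged `cleanup` helper are hypothetical stand-ins, not part of the original example): a generator that is advanced once and then dropped stays suspended at its `yield`, so its `finally` block never runs, whereas an explicit `return()` resumes it and does run the block.

```javascript
// Hypothetical stand-in for the cleanup() call in the example above.
const log = [];
function cleanup(tag) {
  log.push("cleanup:" + tag);
}

function* foo(tag) {
  try {
    yield 1;
  } finally {
    cleanup(tag);
  }
}

// Case 1: advance once, then abandon the generator. It stays suspended
// at the yield; nothing resumes it, so the finally block never runs.
(function () {
  const f = foo("abandoned");
  f.next();
  // never reference f again
})();

// Case 2: finish the generator explicitly. return() resumes it with a
// return completion, so the finally block runs.
const g = foo("returned");
g.next();
const result = g.return("done");

console.log(log);    // only the "returned" generator ran cleanup
console.log(result); // the value passed to return(), with done: true
```

Whether the abandoned generator is ever collected is up to the engine; either way, collection does not resume it.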
Related resources:
- Python changed the behavior to "run return on gc" in 2.5: docs.python.org/2.5/whatsnew/pep-342.html
- C# doesn't run finalizers, but iterators are disposable and get aborted automatically by foreach (for...of) on break. This is similar to what we do: blogs.msdn.com/b/dancre/archive/2008/03/14/yield-and-usings-your-dispose-may-not-be-called.aspx
- PHP is debating this issue now, I was contacted by PHP internals people about it which is how I came into the problem in the first place: bugs.php.net/bug.php?id=71604
- Related issue I opened on async/await : tc39/ecmascript-asyncawait#89
Even if we want to make GC observable via finalisation, then it should at least be done in a controlled and explicit manner rather than silently tacking it onto an unrelated feature. See the revived weakref proposal. Python's idea is just confused and crazy.
On Wed, Feb 17, 2016 at 10:28 AM, Andreas Rossberg <rossberg at google.com>
wrote:
The spec does not talk about GC, but in typical implementations you should expect yes.
Yes, important point since some ECMAScript implementations don't even have GC and are just run to completion. The spec doesn't require cleanup.
- If it is - does it run the finally blocks?
No, definitely not. Try-finally has nothing to do with GC, it's just control flow.
I'm not sure that "has nothing to do" is how I'd put it. try/finally is commonly used for resource management and garbage collection is a form of automatic resource management. There is also the assumption that finally is always run. Two other languages have opted into the "run finally" behavior so I wouldn't call it crazy although I tend to agree with the conclusion.
If you starve a generator it's not going to get completed, just like other control flow won't.
I'm not sure starving is what I'd use here - I definitely do see users do a pattern similar to:
function* getResults() {
  try {
    var resource = acquire();
    for (const item of resource) yield process(item);
  } finally {
    release(resource);
  }
}
They would then do something like:
var res = getResults();
var onlyCareAbout = Array.from(take(10, res));
// ignore res from this point on.
Now, in a for...of loop with a break, return would be called, freeing the resource. In this case the resource would stay "held up" forever. Users can call .return explicitly on the generator, but as a consumer of such an API I might not be aware that I need to.
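A rough sketch of the difference described above. Everything here is an assumption for illustration: `acquire` and `release` are counters standing in for a real resource, and `take` is a hypothetical helper that pulls values with `next()` directly and never calls `return()`. A `take` written as a generator around for...of could behave differently if it is driven past its limit.

```javascript
// Hypothetical resource bookkeeping: `open` counts resources still held.
let open = 0;
function acquire() {
  open++;
  return [1, 2, 3, 4, 5];
}
function release(resource) {
  open--;
}

function* getResults() {
  const resource = acquire();
  try {
    for (const item of resource) yield item * 2;
  } finally {
    release(resource);
  }
}

// Assumed take(): pulls with next() directly and never calls return().
function take(n, iterator) {
  const out = [];
  for (let i = 0; i < n; i++) {
    const step = iterator.next();
    if (step.done) break;
    out.push(step.value);
  }
  return out;
}

// for...of with break: the loop calls return() on the iterator on exit,
// so the finally block runs and the resource is released.
for (const value of getResults()) {
  if (value >= 4) break;
}
const openAfterBreak = open;

// Manual consumption: the generator is left suspended inside the try
// block, so the resource is still held.
const res = getResults();
const firstTwo = take(2, res);
const openAfterTake = open;

// The consumer has to know to close the generator explicitly.
res.return();
const openAfterReturn = open;

console.log(openAfterBreak, openAfterTake, openAfterReturn);
```

The counter makes the leak visible: it only drops back once something calls `return()` on the suspended generator.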
Even if we want to make GC observable via finalisation, then it should at least be done in a controlled and explicit manner rather than silently tacking it onto an unrelated feature. See the revived weakref proposal.
I tend to agree.
On 17 February 2016 at 09:40, Benjamin Gruenbaum <benjamingr at gmail.com>
wrote:
If you starve a generator it's not going to get completed, just like other
control flow won't.
I'm not sure starving is what I'd use here - I definitely do see users do a pattern similar to:
function* getResults() {
  try {
    var resource = acquire();
    for (const item of resource) yield process(item);
  } finally {
    release(resource);
  }
}
Yes, exactly the kind of pattern I was referring to as "bogus forms of resource management". This is an anti-pattern in ES6. It won't work correctly. We should never have given the illusion that it does.
garbage collection is a form of automatic resource management.
Most GC experts would strongly disagree, if by resource you mean anything else but memory.
On Wed, Feb 17, 2016 at 10:51 AM, Andreas Rossberg <rossberg at google.com>
wrote:
On 17 February 2016 at 09:40, Benjamin Gruenbaum <benjamingr at gmail.com> wrote:
If you starve a generator it's not going to get completed, just like
other control flow won't.
I'm not sure starving is what I'd use here - I definitely do see users do a pattern similar to:
function* getResults() {
  try {
    var resource = acquire();
    for (const item of resource) yield process(item);
  } finally {
    release(resource);
  }
}
Yes, exactly the kind of pattern I was referring to as "bogus forms of resource management". This is an anti-pattern in ES6. It won't work correctly. We should never have given the illusion that it does.
What is or is not an anti-pattern is debatable. Technically, if you call .return it will run the finally block and release the resources (although if the finally block itself contains yield, those will also run).
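A small sketch of that parenthetical, using a hypothetical generator name: when the `finally` block contains a `yield`, calling `return()` does not finish the generator immediately. It pauses at that `yield`, and only the following `next()` completes it with the value originally passed to `return()`.

```javascript
// Hypothetical generator whose finally block itself yields.
function* withYieldInFinally() {
  try {
    yield "working";
  } finally {
    yield "cleaning up"; // return() pauses here instead of finishing
  }
}

const it = withYieldInFinally();
const first = it.next();         // { value: "working", done: false }
const second = it.return("bye"); // { value: "cleaning up", done: false }
const third = it.next();         // { value: "bye", done: true }

console.log(first, second, third);
```

So a consumer that calls `return()` once and walks away can still leave such a generator suspended partway through its cleanup.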
Effectively, this will have the same sort of consequences that "acquire()"
and "release()" had to begin with - so I would not say it makes things
worse but I definitely agree that it creates a form of false expectation.
Still - I'm very curious why languages like Python have chosen to call finally blocks in this case. This was not hindsight: according to the PEP, they debated it and explicitly decided to call release. I'll see if I can email the people involved and ask about it.
garbage collection is a form of automatic resource management.
Most GC experts would strongly disagree, if by resource you mean anything else but memory.
Memory is most certainly a resource. Languages that are not GCd like C++ really don't make the distinction we make :)
Garbage collection can and does in fact manage resources in JavaScript host environments right now. For example, an XMLHttpRequest may abort the underlying HTTP request if the XMLHttpRequest object is not referenced anywhere and gets garbage collected.
Everyone, please keep in mind the following distinctions:
General GC is not prompt or predictable. There is an unspecified and unpredictable delay before anything non-reachable is noticed to be unreachable.
JavaScript GC is not specified to be precise, and so should be assumed conservative. Conservative GC may never notice that any particular unreachable thing is unreachable. The only reliable guarantee is that it will never collect anything which future computation will reach, i.e., it will not cause a spontaneous dangling reference. Beyond this, it provides only unspecified and, at best, probabilistic and partial cleanup.
C++ RAII and Python refcounting are completely different: they are precise, prompt, predictable, and deterministic.
C++ RAII and Python refcounting are completely different: they are
precise, prompt, predictable, and deterministic.
C++ RAII is indeed amazingly deterministic - as are languages with built in reference counters like Swift.
Python refcounting certainly is not, since it also performs cycle detection (mark & sweep, as far as I recall, in CPython). PyPy and other implementations have fuller garbage collection systems: pypy.readthedocs.org/en/release-2.4.x/garbage_collection.html
It appears that Python GC with cycle detection predates the change in generators that gave them "finalization" (cycle detection goes back to at least 2.2, whereas the change in generators came in 2.5).
Python does however have destructors. They are weak (certainly weaker than C++ destructors). Python also however has context management through the with statement, like C# (using) and Java (try-with-resource).
Interestingly - they even have async/await aware disposers (async disposers - added in 3.5).
On 2/17/16 3:59 AM, Benjamin Gruenbaum wrote:
Garbage collection can and does in fact manage resources in JavaScript host environments right now. For example, an XMLHttpRequest may abort the underlying HTTP request if the XMLHttpObject is not referenced anywhere and gets garbage collected.
If an implementation does that, it's clearly buggy. Consider:
function foo() {
  var xhr = new XMLHttpRequest();
  xhr.addEventListener("load", function() {
    alert(this.responseText);
  });
  xhr.open(stuff);
  xhr.send();
}
foo();
The XMLHttpRequest object is not "referenced anywhere" in JS terms between foo() returning and the load event being fired. But the load event really does need to be fired. Can you point me to the spec text that makes you think that in this situation not firing the load event would be an OK thing to do?
In practice what this means is that the UA needs to keep the object alive and prevent it being garbage collected until the end of the HTTP response is received.
-Boris
P.S. groups.google.com/d/msg/mozilla.dev.tech.js-engine.internals/V__5zqll3zc/hLJiNqd8Xq8J has some comments on this exact issue of resource management via GC.