GC/Compile requests, with 3D engine demonstration
On 12.03.2016 at 07:27, Brian Barnes wrote:
Request 1: Add a System module, and the one call I need: System.gc();
Why: My game works in two parts; one part creates all the content, and then the second part plays it. In the second part, I do as much as I can to precreate/re-use any objects or arrays I need, to reduce GC. In the creation part, I don’t do as much. So being able to force a GC at a point helps me, because it stops the creation part from forcing a pause in the game part.
On my user-should-experience-no-pause kind of application I'd also love to have the possibility to tell the engine to do a GC now, as it's a good point in time to do it.
On 12.03.2016 at 07:27, Brian Barnes wrote:
Request 1: Add a System module, and the one call I need: System.gc();
Why: My game works in two parts; one part creates all the content, and then the second part plays it. In the second part, I do as much as I can to precreate/re-use any objects or arrays I need, to reduce GC. In the creation part, I don’t do as much. So being able to force a GC at a point helps me, because it stops the creation part from forcing a pause in the game part.
On 13 March 2016 at 01:05, Christian Mayer <mail at christianmayer.de> wrote:
On my user-should-experience-no-pause kind of application I'd also love to have the possibility to tell the engine to do a GC now, as it's a good point in time to do it.
The problem with that is that you're already making assumptions about the engine that are not guaranteed:
- That it uses a garbage collector. (I believe it's not entirely implausible to compile JS to manually memory-managed engine-internal code, and I know that there are engines that never return any memory while still running, though they might reuse unreachables.)
- That the eventual garbage collector has certain properties such as pausing execution.
- That the eventual garbage collector isn't incremental and running very small slices practically all the time, which might make precise user-code control over it a deoptimisation hazard.
- That dead object detection, garbage collection, memory compaction and resource handling are dealt with in typical conservative mark & sweep fashion.
- That ECMAScript actually specifies any of this. Neither garbage collection nor memory handling strategies are part of the spec. In fact, the word "memory" is mentioned in only three places in the entire ECMAScript 2015 spec, and "garbage collection" only once.
As far as I know, any system that didn’t do GC would need to be compiled; you’d have to track the objects with some kind of reference graph (like ARC does) and deal with it that way. Compiling isn’t something you’d probably want to always do to javascript, especially as so much of it runs once and only once — you’d only compile after that. Which is why a non-GC system seems unlikely at this point.
System APIs are full of potential no-ops; graphics and disk systems have flushes and cache empties which might or might not do anything, based on the architecture of the system. This isn’t really anything programmers haven’t seen before. So a gc() that does nothing on a system with some sort of advanced memory management is fine; the problem is already solved, and the gc is a no-op. But you can’t refuse to solve problems now because of future events.
As for being a de-optimization hazard, that’s certainly a problem, but we shouldn’t disallow the usage of something because programmers are bad. We should teach programmers better.
It could certainly happen on a callback, too, which I think is perfectly fine. As far as I know, almost all GCs I’ve seen have different modes, small GCs and large GCs. A callback would solve any concurrency problems, and probably make a better system. It leaves the implementation details out of the hands of the user code.
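To make this concrete, a minimal sketch of the shape I have in mind (System.gc is the proposed call, not an existing API, and the function names are just illustration):

buildAllContent();                       // creation part: allocates heavily
System.gc(function() {
    // proposed: collect now, call back when the engine is done
    requestAnimationFrame(runGameLoop);  // play part: allocation-light loop
});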
[>] Brian
On 3/12/16 8:08 PM, Brian Barnes wrote:
As for being a de-optimization hazard, that’s certainly a problem, but we shouldn’t disallow the usage of something because programmers are bad.
It's not a matter of badness or goodness. It's a matter of programmer expectations not matching the actual engine behavior. Which is particularly easy to have happen when there are multiple engines that have different behaviors. Not to mention that engines might want to change their behavior.
It could certainly happen on a callback, too, which I think is perfectly fine. As far as I know, almost all GCs I’ve seen have different modes, small GCs and large GCs.
I'd like to understand the proposal here better. What is happening off a callback? The gc, or code that wants to run after the gc?
On another note, the above quoted text is a good example of my claim above about expectations and behaviors. The SpiderMonkey GC (the JS GC I know most about) has 3 modes, I believe: (1) collect just in the nursery, (2) collect in the part of the heap that contains objects from a predetermined set of globals, treating any cross-global references into that part of the heap as roots, (3) collect across the entire JS heap, treating references from C++ to JS as roots. When running in Firefox there is a fourth mode: collect across the JS heap but also run the cycle collector in Gecko to detect reference cycles that go through the C++ heap and collect those too.
Which of those are small and which are large? And note that this set of modes is definitely not fixed in stone, and that for mode #2 the set of globals that constitute a single "zone" (what we call the sets in question) is not fixed in stone either across different Firefox versions...
On Mar 12, 2016, at 8:52 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
On 3/12/16 8:08 PM, Brian Barnes wrote:
As for being a de-optimization hazard, that’s certainly a problem, but we shouldn’t disallow the usage of something because programmers are bad.
It's not a matter of badness or goodness. It's a matter of programmer expectations not matching the actual engine behavior. Which is particularly easy to have happen when there are multiple engines that have different behaviors. Not to mention that engines might want to change their behavior.
You might be thinking of this too technically. The contract would be: if the engine uses GC (which is likely, as I pointed out, without compiling or additions to the language), then do as much cleanup as possible. If the system isn’t GC’d, then the problem you are looking to solve — which is to make GC happen more predictably in special circumstances — is no longer a problem.
For instance, you might have a system that has a flush() on it; this is something you see a lot. Depending on the actual physical storage method it may mean more or less (sometimes an OS might have the file in flash and the flush means next to nothing, or an OS might always write through), but flush is STILL a useful and understood function.
APIs do something; if that something is no longer necessary, then the APIs are no-ops. Flushes would be like that.
It could certainly happen on a callback, too, which I think is perfectly fine. As far as I know, almost all GCs I’ve seen have different modes, small GCs and large GCs.
I'd like to understand the proposal here better. What is happening off a callback? The gc, or code that wants to run after the gc?
On another note, the above quoted text is a good example of my claim above about expectations and behaviors. The SpiderMonkey GC (the JS GC I know most about) has 3 modes, I believe: (1) collect just in the nursery, (2) collect in the part of the heap that contains objects from a predetermined set of globals, treating any cross-global references into that part of the heap as roots, (3) collect across the entire JS heap, treating references from C++ to JS as roots. When running in Firefox there is a fourth mode: collect across the JS heap but also run the cycle collector in Gecko to detect reference cycles that go through the C++ heap and collect those too.
Which of those are small and which are large? And note that this set of modes is definitely not fixed in stone, and that for mode #2 the set of globals that constitute a single "zone" (what we call the sets in question) is not fixed in stone either across different Firefox versions…
Small and large GCs, in my parlance, would in fact be quick and full GCs; if you watch most GC systems, you’ll get tons of little collections, and then, eventually, a large collection. Almost all GCs behave this same way. Contractually, though, this would be “every GC you can do, on any object you can.” In FF, it would go through everything it could.
As for a callback, it would be after the GC has finished, or finished as well as it can. This does NOT count threaded applications; if you are running in web workers, or anything else, then you would understand that all that can be cleaned is all that can be traced from your current execution.
Maybe System.gc() is a bad name, if people are worried about a fundamental change to how an engine does memory management. System.memoryCleanUp() — OK, that’s awful :)
[>] Brian
I don't believe this is really a good idea, but a workable approach could be something like a scope directive which disallows collection, e.g.
function renderParty(scene) {
    "disallow gc";
    // no GC pauses here...
}
but engine heuristics might be better able to choose more appropriate times to collect. And it only really makes sense for synchronous code.
On 3/12/16 9:11 PM, Brian Barnes wrote:
The contract would be: if the engine uses GC (which is likely, as I pointed out, without compiling or additions to the language), then do as much cleanup as possible.
What does that mean, exactly? In the case of SpiderMonkey+Gecko, should that run the cycle collector? Should it run it multiple times (it might take multiple cycle collector runs with GC in between to collect everything that could be collected by a future collection). Should it evict caches that can cause things to become garbage when they get evicted? Should it perform maximal heap compaction (and therefore need slow heap growth on future allocations)?
Is this "as much cleanup as possible" allowed to take tens of seconds of wall-clock time, which is what you might need in some cases to really make sure you've found all the garbage?
And will people who make the call really expect all that to happen?
If the system isn’t GC, then the problem you are looking to solve — which is to make GC happen more predictably in special circumstances
The problem is that "GC" isn't a single thing that can happen. It's multiple different things that can happen, and have vastly different heuristics. It's not even necessarily clear which tasks fall under "GC" and which do not (c.f. cache evictions above).
For instance, you might have a system that has a flush() on it; this is something you see a lot. Depending on the actual physical storage method it may mean more or less (sometimes an OS might have the file in flash and the flush means next to nothing, or an OS might always write through), but flush is STILL a useful and understood function.
That's because the operation it controls is somewhat simpler than "collect all the memory" (though in practice flush() is not actually understood in terms of what it means about the bits actually being in non-volatile storage or not....)
APIs do something
Sure. Whether that something is what you wanted, or even what anyone wants is a separate question.
Which of those are small and which are large? And note that this set of modes is definitely not fixed in stone, and that for mode #2 the set of globals that constitute a single "zone" (what we call the sets in question) is not fixed in stone either across different Firefox versions…
Small and large GCs, in my parlance, would in fact be quick and full GCs
I just pointed out that SpiderMonkey has 3 or 4 different kinds of GCs, which scale from "fast" to "slow". More if you include things like cache eviction. You came back with, again, wanting to categorize things into two buckets. I feel like we're talking past each other.
if you watch most GC systems, you’ll get tons of little collections, and then, eventually, a large collection.
Note that I pointed out a specific production JS implementation, a fairly widely used one, where this is not exactly how things work. As in, I'm not raising a theoretical objection to assuming this model here.
Contractually, though, this would be “every GC you can do, on any object you can.”
See my questions above.
In FF, it would go through everything it could.
I really don't think you want an API that would take tens of seconds to run in some cases, which is what you're asking for here.
As for a callback, it would be after the GC has finished, or finished as well as it can.
Ah, to avoid requiring stopping the world. That's good, sure. Maybe then you really are OK with it taking a long time ...or never completing, if garbage is created between each async gc slice so we never "finish".
Maybe System.gc() is a bad name, if people are worried about a fundamental change to how an engine does memory management. System.memoryCleanUp() — OK, that’s awful :)
Note that Firefox does have some built-in "minimize memory usage" thing (you can get to it from about:memory). This is an async operation that will in fact go through all renderer processes and ask each of them 3 times (3 because 1 is not enough per comments above and we didn't really want an unbounded number) to flush all caches, do collection, etc.
I just tried it; took about 15 seconds with my set of tabs.... And of course it clears all caches (so likely throws out things you might in fact care about).
This doesn’t really solve the problem at hand; my code comes in two parts, the first generates a lot of objects, the second, generates almost none. The second part generates a 1-2 second paused GC once (because of the first part) and never again (because it constructs no more objects or only local primitives which can be collected more easily.)
Also, I don’t think you can ever be able to tell a gc type system to not collect during a certain point, it makes forced collections (because it’s completely out of memory) an error.
So, I don’t want to pause gc; I want to tell it “do one now, please.” The exact same way it might do it if it’s run out of space.
If you look at the application I sent out with this (www.klinksoftware.com/ws) it goes a long way toward showing what I mean by all this.
[>] Brian
On Mar 12, 2016, at 9:31 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
On 3/12/16 9:11 PM, Brian Barnes wrote:
The contract would be: if the engine uses GC (which is likely, as I pointed out, without compiling or additions to the language), then do as much cleanup as possible.
What does that mean, exactly? In the case of SpiderMonkey+Gecko, should that run the cycle collector? Should it run it multiple times (it might take multiple cycle collector runs with GC in between to collect everything that could be collected by a future collection). Should it evict caches that can cause things to become garbage when they get evicted? Should it perform maximal heap compaction (and therefore need slow heap growth on future allocations)?
Is this "as much cleanup as possible" allowed to take tens of seconds of wall-clock time, which is what you might need in some cases to really make sure you've found all the garbage?
And will people who make the call really expect all that to happen?
If the system isn’t GC, then the problem you are looking to solve — which is to make GC happen more predictably in special circumstances
The problem is that "GC" isn't a single thing that can happen. It's multiple different things that can happen, and have vastly different heuristics. It's not even necessarily clear which tasks fall under "GC" and which do not (c.f. cache evictions above).
This is key:
developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management
When the mozilla docs refer to GC, they refer to the mark-and-sweep collection of objects created by the user’s code. I think this is all that needs to be done. No weak references, either.
There’s certain other things that goes on, but they are things out of control of the coder. The objects the code creates are what we are interested in.
For instance, you might have a system that has a flush() on it; this is something you see a lot. Depending on the actual physical storage method it may mean more or less (sometimes an OS might have the file in flash and the flush means next to nothing, or an OS might always write through), but flush is STILL a useful and understood function.
That's because the operation it controls is somewhat simpler than "collect all the memory" (though in practice flush() is not actually understood in terms of what it means about the bits actually being in non-volatile storage or not....)
APIs do something
Sure. Whether that something is what you wanted, or even what anyone wants is a separate question.
Which of those are small and which are large? And note that this set of modes is definitely not fixed in stone, and that for mode #2 the set of globals that constitute a single "zone" (what we call the sets in question) is not fixed in stone either across different Firefox versions…
Small and large GCs, in my parlance, would in fact be quick and full GCs
I just pointed out that SpiderMonkey has 3 or 4 different kinds of GCs, which scale from "fast" to "slow". More if you include things like cache eviction. You came back with, again, wanting to categorize things into two buckets. I feel like we're talking past each other.
if you watch most GC systems, you’ll get tons of little collections, and then, eventually, a large collection.
Note that I pointed out a specific production JS implementation, a fairly widely used one, where this is not exactly how things work. As in, I'm not raising a theoretical objection to assuming this model here.
We’re not talking past each other, we are talking at different levels of grain. I’m talking about what the end user experiences; which is numerous quick GCs usually followed by a large GC. You are talking about the actual mechanics of what each of those GCs actually are, or do, but to the end user, they are quick pauses and large pauses.
But this has devolved; let’s reduce this to the regular mark-and-sweep collection of objects.
Contractually, though, this would be “every GC you can do, on any object you can.”
See my questions above.
In FF, it would go through everything it could.
I really don't think you want an API that would take tens of seconds to run in some cases, which is what you're asking for here.
As for a callback, it would be after the GC has finished, or finished as well as it can.
Ah, to avoid requiring stopping the world. That's good, sure. Maybe then you really are OK with it taking a long time ...or never completing, if garbage is created between each async gc slice so we never "finish".
Maybe System.gc() is a bad name, if people are worried about a fundamental change to how an engine does memory management. System.memoryCleanUp() — OK, that’s awful :)
Note that Firefox does have some built-in "minimize memory usage" thing (you can get to it from about:memory). This is an async operation that will in fact go through all renderer processes and ask each of them 3 times (3 because 1 is not enough per comments above and we didn't really want an unbounded number) to flush all caches, do collection, etc.
I just tried it; took about 15 seconds with my set of tabs.... And of course it clears all caches (so likely throws out things you might in fact care about).
Yes, I have no problem with it taking a long time; that would be part of the contract. It’s possible it could also be updated with the notion of how much recursion it should take, like:
System.gc(recursionCount, callback);
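For example, under that hypothetical signature (the count bounds how many collection passes the engine attempts; startPlayPhase is illustrative):

System.gc(3, function() {
    // by here the engine has collected what it could in up to 3 passes
    startPlayPhase();
});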
[>] Brian
On 3/12/16 9:52 PM, Brian Barnes wrote:
This is key:
developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management
When the mozilla docs refer to GC, they refer to the mark-and-sweep collection of objects created by the user’s code.
User code creates lots of objects in Firefox whose memory management model is not mark-and-sweep at the moment (for example, DOM nodes).
The docs you are looking at there are specifically for SpiderMonkey, not SpiderMonkey in a particular (browser) embedding.
I think this is all that needs to be done.
Not if you're actually trying to avoid memory collection pauses, sorry... This is what makes all this rather annoying.
There’s certain other things that goes on, but they are things out of control of the coder. The objects the code creates are what we are interested in.
OK. Note that you can have some control over things the browser creates too, not just things you create.
But again, the definition of "GC" as described on the page you link to doesn't cover everything that's touched by "GC" as it matters for an actual page in a web browser...
We’re not talking past each other, we are talking at different levels of grain. I’m talking about what the end user experiences; which is numerous quick GCs usually followed by a large GC. You are talking about the actual mechanics of what each of those GCs actually are, or do, but to the end user, they are quick pauses and large pauses.
OK. Say I have some gc pauses: 100us, 1ms, 5ms, 10ms, 20ms, 50ms, 600ms.
Which of those are quick pauses and which are large? As far as I can tell, the answer depends on the application, what sort of real-time guarantees it needs, and how much processing it does itself. About the only thing everyone would agree on is that 100us is small and 600ms is large.
But this has devolved; let’s reduce this to the regular mark-and-sweep collection of objects.
That would correspond to the first three modes I described, but even within that a given mode can in theory lead to either a quick pause or a large one, and you don't necessarily know up front which it will be.
In practice, all the modes are incremental and aim to not have anything resembling 600ms pauses. But if 5ms constitutes a "large" pause for you, then you might end up with "large" pauses from at least two of the modes. Not sure whether a nursery collection can ever end up there...
But my larger point is that if we define this to be "regular mark-and-sweep collection of objects" that might well not be enough to avoid big memory collection pauses in the future anyway. And if we don't define it to be that, then it's not clear what to define it to be because it really depends on what else is going on...
On Mar 12, 2016, at 10:05 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
On 3/12/16 9:52 PM, Brian Barnes wrote:
This is key:
developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management
When the mozilla docs refer to GC, they refer to the mark-and-sweep collection of objects created by the user’s code.
User code creates lots of objects in Firefox whose memory management model is not mark-and-sweep at the moment (for example, DOM nodes).
The docs you are looking at there are specifically for SpiderMonkey, not SpiderMonkey in a particular (browser) embedding.
We’re closing in on something! This could have been clearer on my end; I am looking at only javascript objects. My code, for instance, interacts only though a webgl context; it never creates any nodes or does anything outside of javascript. I think that would be way outside scope and not something you’d want to have in the GC scope, anyway.
I think anybody that called a gc() inside javascript code would assume that was the case.
I think this is all that needs to be done.
Not if you're actually trying to avoid memory collection pauses, sorry... This is what makes all this rather annoying.
There’s certain other things that goes on, but they are things out of control of the coder. The objects the code creates are what we are interested in.
OK. Note that you can have some control over things the browser creates too, not just things you create.
But again, the definition of "GC" as described on the page you link to doesn't cover everything that's touched by "GC" as it matters for an actual page in a web browser…
Yeah, again, I think this is where we are bumping heads. I never meant it to matter at all to what goes on in the browser, just objects created in code.
This, basically:
developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/JSAPI_reference/JS_GC
Having System.gc() call that would solve my problem (I think.)
We’re not talking past each other, we are talking at different levels of grain. I’m talking about what the end user experiences; which is numerous quick GCs usually followed by a large GC. You are talking about the actual mechanics of what each of those GCs actually are, or do, but to the end user, they are quick pauses and large pauses.
OK. Say I have some gc pauses: 100us, 1ms, 5ms, 10ms, 20ms, 50ms, 600ms.
Which of those are quick pauses and which are large? As far as I can tell, the answer depends on the application, what sort of real-time guarantees it needs, and how much processing it does itself. About the only thing everyone would agree on is that 100us is small and 600ms is large.
But this has devolved; let’s reduce this to the regular mark-and-sweep collection of objects.
That would correspond to the first three modes I described, but even within that a given mode can in theory lead to either a quick pause or a large one, and you don't necessarily know up front which it will be.
In practice, all the modes are incremental and aim to not have anything resembling 600ms pauses. But if 5ms constitutes a "large" pause for you, then you might end up with "large" pauses from at least two of the modes. Not sure whether a nursery collection can ever end up there...
But my larger point is that if we define this to be "regular mark-and-sweep collection of objects" that might well not be enough to avoid big memory collection pauses in the future anyway. And if we don't define it to be that, then it's not clear what to define it to be because it really depends on what else is going on…
Honestly, I never thought I’d get into this discussion. I always saw this as something that would be different, from engine to engine, as engine GCs are different. The end user doesn’t have a lot of control or knowledge of the mechanics of such things. System.gc() would be the same thing. I know “nebulous” when it comes to an API is weird, but GC per-engine is different and unpredictable (not really, but to a human, yes.)
It would have to do something for every javascript engine, so it wouldn’t be something you could just spec as x/y/z. Best case: clean up and compact the heap for the code you are currently running in.
[>] Brian
On 3/12/16 10:22 PM, Brian Barnes wrote:
We’re closing in on something! This could have been clearer on my end; I am looking at only javascript objects. My code, for instance, interacts only though a webgl context; it never creates any nodes or does anything outside of javascript.
But it creates WebGL objects like WebGLUniformLocation or WebGLTexture? Because those aren't mark-and-sweep either... They're much more like DOM nodes in terms of their memory management.
But again, the definition of "GC" as described on the page you link to doesn't cover everything that's touched by "GC" as it matters for an actual page in a web browser…
Yeah, again, I think this is where we are bumping heads. I never meant it to matter at all to what goes on in the browser, just objects created in code.
I think we might have different definitions of "objects created in code". Are you including return values from getUniformLocation?
This, basically:
developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/JSAPI_reference/JS_GC
Having System.gc() call that would solve my problem (I think.)
What I'm trying to say is that it may well not, if it just called JS_GC and if the problem to be solved is pauses on the order of several tens to hundreds of milliseconds due to memory collection activity...
Honestly, I never thought I’d get into this discussion. I always saw this as something that would be different, from engine to engine, as engine GCs are different. The end user doesn’t have a lot of control or knowledge of the mechanics of such things.
OK. We agree so far. ;)
System.gc() would be the same thing.
OK, but then we run the very real risk of it not actually doing what people want. Or more precisely browsers implementing it in various different ways as they attempt to map its not-really-defined semantics onto their actual GC systems and then people sprinkling it about based on the GC heuristics and setup in one particular browser they tested with and optimized for, and then other browsers having to reverse engineer those and that one browser being locked into never changing how its GC operates.
And then we all lose. :(
On Mar 12, 2016, at 10:32 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
On 3/12/16 10:22 PM, Brian Barnes wrote:
We’re closing in on something! This could have been clearer on my end; I am looking at only javascript objects. My code, for instance, interacts only though a webgl context; it never creates any nodes or does anything outside of javascript.
But it creates WebGL objects like WebGLUniformLocation or WebGLTexture? Because those aren't mark-and-sweep either... They're much more like DOM nodes in terms of their memory management.
But again, the definition of "GC" as described on the page you link to doesn't cover everything that's touched by "GC" as it matters for an actual page in a web browser…
Yeah, again, I think this is where we are bumping heads. I never meant it to matter at all to what goes on in the browser, just objects created in code.
I think we might have different definitions of "objects created in code". Are you including return values from getUniformLocation?
That’s my problem. All uniforms, all textures, are created once, rooted in my classes, and are never meant to be GC’d. They are not created on the fly. This is something you need to do to reduce GCs, and most game engines work this way, or they re-use these objects. This is a problem solvable at my level and just regular OpenGL programming outside of the language you use.
Remember, I’m not asking for a perfect solution; GC is mysterious in a number of ways, and they might be collectable on different engines, you never know. I’m asking for best practices.
Anything that I think of as a DOM object I understand is something that operates differently. Knowing which is which will probably be an engine detail; we can’t stop engines from being different. Minutiae in an API are something that happens.
What’s a good way to say this? “Javascript objects attached only in your code.” Things like web audio, webgl, DOM elements, etc., don’t count. For example:
this.vertexPositionAttribute=view.gl.getAttribLocation(this.program,'vertexPosition');
For instance, I understand that is attached outside the regular mark-and-sweep as it’s a GL object, attached to a context (maybe; I’m not privy to the implementation details, these are usually just indexes as OpenGL understands them.)
There would be exceptions, like single-fire objects in some of the web audio API.
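To illustrate the pattern (standard WebGL calls; the class and uniform names are just how my code happens to be organized):

class LightShader {
    constructor(gl, program) {
        // looked up once at startup, rooted on the class, never re-created
        this.positionUniform = gl.getUniformLocation(program, 'lightPosition');
        this.colorUniform = gl.getUniformLocation(program, 'lightColor');
    }
    draw(gl, position, color) {
        // per frame: only the rooted objects are re-used, nothing is allocated
        gl.uniform3fv(this.positionUniform, position);
        gl.uniform3fv(this.colorUniform, color);
    }
}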
This, basically:
developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/JSAPI_reference/JS_GC
Having System.gc() call that would solve my problem (I think.)
What I'm trying to say is that it may well not, if it just called JS_GC and if the problem to be solved is pauses on the order of several tens to hundreds of milliseconds due to memory collection activity…
Time is not my concern. I’m trading time now for time when it’s critical.
Honestly, I never thought I’d get into this discussion. I always saw this as something that would be different, from engine to engine, as engine GCs are different. The end user doesn’t have a lot of control or knowledge of the mechanics of such things.
OK. We agree so far. ;)
System.gc() would be the same thing.
OK, but then we run the very real risk of it not actually doing what people want. Or more precisely browsers implementing it in various different ways as they attempt to map its not-really-defined semantics onto their actual GC systems and then people sprinkling it about based on the GC heuristics and setup in one particular browser they tested with and optimized for, and then other browsers having to reverse engineer those and that one browser being locked into never changing how its GC operates.
And then we all lose. :(
That’s modern web development, my code uses classes, and I had to fake them until FF 45 came out, and then realized they failed unless I was operating in strict under chrome; Safari doesn’t have let and nobody has proper implementation of stereo panning in web audio :) Pointerlock? Missing in action in Safari. For some reason, it’s slow to start up in Safari, but fast in FF and Chrome, but runs as fast in all 3, but there’s weird slow downs in Edge. Why? Who knows. That’s the price you pay. We wouldn’t have a single API if we were worried about that.
Not asking for miracles; I’m asking for best practice on an engine. We have an analog here — on FF, a call to JS_GC() is all I’d ask for.
[>] Brian
On 13.03.2016 at 01:45, liorean wrote:
On 13 March 2016 at 01:05, Christian Mayer <mail at christianmayer.de> wrote:
On my user-should-experience-no-pause kind of application I'd also love to have the possibility to tell the engine to do a GC now, as it's a good point in time to do it.
The problem with that is that you're already making assumptions about the engine that are not guaranteed:
- That it uses a garbage collector. [...]
What would work is a sort of hint to the engine that it can do some background work now that could otherwise affect the user experience, as the code thinks the user won't notice it at that point in time.
When the engine is working in such a way that it doesn't need that background work (e.g. it has no GC, ...) then this call would be a NOP. But it would help to create a better user experience on a different engine.
So, over all, calling it GC might be bad, as it is more universal. We could call it DoStutteringBackgroundStuff() ;)
I'd like to chime in here on the GC. I make my living writing WebGL code. JS's stop-the-world, mark-and-sweep, for-however-long-it-takes approach to GC'ing is very troublesome.
A new frame has to be produced every 16.6ms (and in some cases, as in the emerging WebVR implementations, every 11.1ms or even 8.3ms). And if that is delayed in any way, what occurs is jitter. One or several frames are skipped until a new frame can be drawn, and this is a noticeable effect to many users. But it's even worse for VR usage, because jitter is much more readily apparent in the case that your head movement no longer produces a new picture.
But the pernicious effects of GC'ing are already readily apparent even without strict realtime requirements. Pretty much every JS library (like jQuery UI) produces very unsmooth animations, among other things, because of this.
Writing code to get around JS's GC is possible, but it complicates everything quite a lot (effectively your drawing loop cannot allocate anything, ever).
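For example, the kind of contortion required, sketched with the usual scratch-object pattern (the engine function names are illustrative):

// preallocated once, outside the loop
var scratch = { x: 0, y: 0, z: 0 };

function frame(now) {
    // engine code writes into the preallocated object
    // instead of returning a new one on each call
    getCameraDirection(scratch);   // illustrative engine function
    movePlayer(scratch, now);      // ditto; nothing is allocated per frame
    requestAnimationFrame(frame);
}
requestAnimationFrame(frame);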
The GC-needs of different applications might differ a lot. Some might prefer a GC that's using as little time as possible, but might occasionally stop the world for long periods of time. Other applications might be happy to cede as much as 1/4 of their CPU time to the GC at a clip of 60hz, 90hz or 120hz but be guaranteed that the GC is never going to occupy more time than that.
Adding some more flexible ways to deal with GC'ing beyond "just making it better" would be highly welcome. Provided that incremental/realtime GCs are probably never gonna happen for JS, the next best thing would probably be to at least be able to select a GC strategy and set its parameters that suit your use-case.
On Mar 13, 2016, at 6:27 AM, Florian Bösch <pyalot at gmail.com> wrote:
I'd like to chime in here on the GC. I make my living writing WebGL code. JS's stop-the-world, mark-and-sweep, for-however-long-it-takes approach to GC'ing is very troublesome.
A new frame has to be produced every 16.6ms (and in some cases, as in the emerging WebVR implementations, every 11.1ms or even 8.3ms). And if that is delayed in any way, what occurs is jitter. One or several frames are skipped until a new frame can be drawn, and this is a noticeable effect to many users. But it's even worse for VR usage, because jitter is much more readily apparent in the case that your head movement no longer produces a new picture.
But the pernicious effects of GC'ing are already readily apparent even without strict realtime requirements. Pretty much every JS library (like jQuery UI) produces very unsmooth animations, among other things, because of this.
Writing code to get around JS's GC is possible, but it complicates everything quite a lot (effectively your drawing loop cannot allocate anything, ever).
The GC-needs of different applications might differ a lot. Some might prefer a GC that's using as little time as possible, but might occasionally stop the world for long periods of time. Other applications might be happy to cede as much as 1/4 of their CPU time to the GC at a clip of 60hz, 90hz or 120hz but be guaranteed that the GC is never going to occupy more time than that.
Adding some more flexible ways to deal with GC'ing beyond "just making it better" would be highly welcome. Provided that incremental/realtime GCs are probably never gonna happen for JS, the next best thing would probably be to at least be able to select a GC strategy and set its parameters that suit your use-case.
Hey Florian, I feel your pain :)
This is actually a separate problem, and one that probably needs its own solution, but whose solution will probably be related. I actually do what you are talking about (I only allocate locals — still bad, but they GC quickly — in my game loop; everything else is pre-allocated or re-used, sort of how asm.js functions with its giant array for memory) but you are right, that makes for some messy code, and code that’s prime for race conditions.
I think if javascript ever gets native typing (for primitives) there can be a lot more stack-based locals, and that would fix a lot of problems, but that’s WAY out of scope for what I’m talking about.
[>] Brian
This is a good time to bring up the other half of my original email, because a number of other people have chimed in with their experiences with GC when attempting to develop more time-critical applications without stutter.
The second part was a hint to tell the engine to always take the most aggressive route with optimization; for instance, in Safari’s engine, as I remember, there are three levels: interpreted, a half-and-half solution, and an actual full compile of the code. This hint would say “always compile to native” or, if an engine never goes that far, always compile to the VM soup (though I suspect at this point most engines can do a native version).
This would be used, again, for trading time in one place for time in another. A longer start-up time is something you want for a thing that will run continually and will almost always be guaranteed to fall into the compile path eventually. It’s not something you’d want for javascript that runs on a normal button click to post a form.
Being compiled gives you another benefit. Say you have this class:
class …. {
    test() {
        let x, y, z;
        …
        return(x);
    }
}
The locals y and z never move beyond the function scope; in this manner they could be stack-based variables that never touch the heap (yes, I know that’s probably a very difficult implementation.) Or even variables that fall into a special section of the heap that only deals with locals that never scope outside the function and are always cleaned up automatically on function exit (basically, reference counting where you know the reference count is always 0.)
This will basically reduce the number of GCs by a good bit. NOW, you’d also have to track any additional function calls that use these variables (to make sure they aren’t saved out to a global.) So everything underneath would be a force-compile for this to work. Engines might already do this or stuff like it, and if so, great!
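To sketch the check I mean (savedOut is a hypothetical global, purely for illustration):

var savedOut;                   // hypothetical global
function test() {
    let x = { n: 1 }, y = { n: 2 }, z = { n: 3 };
    y.n += z.n;                 // y is used only inside the function: it never
                                // escapes, so it could live on the stack
    savedOut = z;               // z is saved out to a global; it must survive the call
    return x;                   // x escapes through the return value
}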
This is probably something that is much more complicated to implement than most of us might realize, but it’s a thought. Just being able to say “this class is always compiled” would be nice.
The second part of this is native primitive types; having int/etc means they can be passed by value which means these checks are easier, but that’s probably something others have argued back and forth for a long time :)
[>] Brian
On 3/12/16 10:52 PM, Brian Barnes wrote:
What I'm trying to say is that it may well not, if it just called JS_GC and if the problem to be solved is pauses on the order of several tens to hundreds of milliseconds due to memory collection activity…
Time is not my concern. I’m trading time now for time when it’s critical.
I understand that. What I am saying is that if you call JS_GC now, that doesn't mean that a cycle collection won't happen 30 seconds from now. To help prevent that, you'd have to run cycle collection now.
In other words, JS_GC would help somewhat in your situation, but not completely. And maybe tomorrow it would help less than today. Or maybe more...
OK, but then we run the very real risk of it not actually doing what people want. Or more precisely browsers implementing it in various different ways as they attempt to map its not-really-defined semantics onto their actual GC systems and then people sprinkling it about based on the GC heuristics and setup in one particular browser they tested with and optimized for, and then other browsers having to reverse engineer those and that one browser being locked into never changing how its GC operates.
And then we all lose. :(
That’s modern web development
No, I don't think it is. The way we prevent problems like this from arising is by actually creating standards, so that browsers know what behavior they need to have and consumers know what behavior they can depend on.
my code uses classes, and I had to fake them until FF 45 came out, and then realized they failed unless I was operating in strict mode under Chrome; Safari doesn’t have let, and nobody has a proper implementation of stereo panning in web audio :)
Yes, but these are transitional problems. I'm talking about it being a problem when current implementations constrain both each other and future evolution of themselves because someone comes to depend on undocumented details of them.
This also happens in web development, but it's something no one likes to happen....
Not asking for miracles; I’m asking for best practice on an engine.
The problems start when the best practice is different for different engines and people optimize their code based on only one engine.
On 3/13/16 6:27 AM, Florian Bösch wrote:
JS's stop-the-world, mark-and-sweep, for-however-long-it-takes approach to GC'ing is very troublesome.
That's certainly not the way SpiderMonkey's GC is meant to work (it's an incremental GC with a per-slice time budget)...
Adding some more flexible ways to deal with GC'ing beyond "just making it better" would be highly welcome.
Yes, this I agree on.
On 03/13/2016 01:06 PM, Brian Barnes wrote:
This is a good time to bring up the other half of my original email, because a number of other people have chimed in with their experiences with GC when attempting to develop more time-critical applications without stutter.
I really don't think you want a System.gc() call for that. What if you call that after you're placed in a background tab, when you're sharing the JS heap with a latency-sensitive foreground tab? Do you really want to stutter the foreground tab for (up to) a few seconds? Probably not, in which case the name System.gc() would be a lie.
I think the closest you would want to go is to offer hints to the runtime. AppIsIdleStartingNowForAtLeast(500)? IDoNotMindIfYouDontCallMeAgainFor(500)? (units are milliseconds). The runtime can ignore it, or it can schedule an up-to-500ms incremental GC slice, or whatever. That does not negate all of the issues Boris was referring to, but IMHO it's a reasonable middle ground. We could double-check it against pending timers or whatever.
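Something like this, say (both the module and the hint name are hypothetical, straight from the suggestion above):

function frame(now) {
    drawScene();                                // illustrative render work
    // hint: we expect to be idle until the next frame; the runtime may
    // schedule a bounded incremental GC slice, or ignore the hint entirely
    System.appIsIdleStartingNowForAtLeast(10);  // hypothetical hint, in ms
    requestAnimationFrame(frame);
}
requestAnimationFrame(frame);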
The second part was a hint to tell the engine to always take the most aggressive route with optimization; for instance, in Safari’s engine, as I remember, there are three levels: interpreted, a half-and-half solution, and an actual full compile of the code. This hint would say “always compile to native” or, if an engine never goes that far, always compile to the VM soup (though I suspect at this point most engines can do a native version).
That's a nice simple mental model, but it's inaccurate in some important ways.
This would be used, again, for trading time in one place for time in another. A longer start-up time is something you want for a thing that will run continually and will almost always be guaranteed to fall into the compile path eventually. It’s not something you’d want for javascript that runs on a normal button click to post a form.
But you may need to run it in a slower mode 1 or more times in order for that native compilation to be effective. The fastest JIT levels compile the code under certain simplifying assumptions to make the generated code more efficient -- it may assume that you're not going to randomly add or delete a property from an object handled by the compiled code, for example, so it can compile in direct offsets to the object's slots. If you violate one of those assumptions, the compiled code will need to be discarded and re-generated with a new weaker set of assumptions.
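A sketch of the kind of assumption violation meant here (illustrative only; exact deoptimization behavior varies by engine):

function mag(p) {
    return Math.sqrt(p.x * p.x + p.y * p.y);
}
for (let i = 0; i < 100000; i++) {
    mag({ x: i, y: i });  // every argument has the same shape, so the JIT
}                         // can compile in direct offsets to the slots
const odd = { x: 3, y: 4, z: 5 };
delete odd.z;             // mutating the shape of an object that then flows
mag(odd);                 // into the hot code may force a recompile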
So you don't want to go straight to the highest optimization level, as it might then not optimize as highly. You need to first run in a slower mode, perhaps even one that pays some overhead in order to gather information about the types and control paths actually used. And that's the next gotcha -- it's reasonable to skip collecting that info on the first time (or few times?) you run the code, because the majority of code is only run once so any gathered profiling info is useless.
Which is not to say that there's nothing useful you can tell the system. If you hinted to it that something was going to be important and run frequently, then it could choose to gather profiling information earlier and additionally lower the threshold for jumping up optimization levels. As with the GC case, though, you do not want to tell the VM exactly what to do. The best approach might be something like: spawn off a background compile, and in the meantime interpret the code without gathering profiling info, then switch to the compiled profiling code when it's ready, and then do an optimized compile based on that information as soon as you've observed "enough". The user code has no way to know how long that background compilation will take, whether there are spare resources to do it in the background, etc. And it varies by platform. So at best, I think you can drop hints like "this code is going to run a lot, and I'm pretty sure its runtime matters to me." In practice, people will end up cutting & pasting the magic bits that make things fast from stackoverflow and misapplying them all over the place, to the extent that engines might end up just completely ignoring them, but it may also turn out that they're a good enough signal that engines will pay attention. I don't know; I can't predict.
Being compiled gives you another benefit. Say you have this class:
class …. {
    test() {
        let x, y, z;
        …
        return(x);
    }
}
The locals y and z never move beyond the function scope; in this manner they could be stack-based variables that never touch the heap (yes, I know that’s probably a very difficult implementation.) Or even variables that fall into a special section of the heap that only deals with locals that never scope outside the function and are always cleaned up automatically on function exit (basically, reference counting where you know the reference count is always 0.)
I'm a little confused about variables vs values here, but simply storing a value into a local does not guarantee that the value is only referenced by that variable. You can sometimes do an escape analysis to figure this sort of thing out, and I believe many engines do this or are working on doing this. But either way, requesting that something be compiled is absolutely the wrong signal -- analysis is separate from compilation, and in fact information necessary for an analysis is less likely to be collected at the highest optimization levels. (It would either be figured out when initially compiling to bytecode or whatever, or dynamically gathered during early or mid-tier optimization levels.)
This will basically reduce the number of GCs by a good bit. NOW, you’d also have to track any additional function calls that use these variables (to make sure they aren’t saved out to a global.) So everything underneath would be a force-compile for this to work. Engines might already do this or stuff like it, and if so, great!
They do stuff like this, but it's independent of (and complicated by) compilation.
This is probably something that is much more complicated to implement than most of us might realize, but it’s a thought. Just being able to say “this class is always compiled” would be nice.
I don't think so. It's not even really the right level of granularity. Some class methods should be compiled while others are interpreted, for the lowest overall time.
The second part of this is native primitive types; having int/etc means they can be passed by value which means these checks are easier, but that’s probably something others have argued back and forth for a long time :)
The dynamic profiling can figure out whether in practice particular values are always int/etc while certain code executes, and compile with that assumption. Declaring types can give that a head start, but it'll still need to be double-checked when not easily statically provable, and may end up just wasting time prematurely compiling code with incorrect assumptions. JS right now is simply too dynamic to prevent all possibility of strange things happening to seemingly simple values. Besides, just observing the types seems to work pretty well in practice. The main benefit of type declarations would be in preventing errors via type checking, IMO.
On 3/13/16 4:06 PM, Brian Barnes wrote:
The second part was a hint to tell the engine to always take the most aggressive route with optimization
By which I assume you mean still running in the lower interpreter/JIT levels until you have gathered sufficient profiling information, but not gating the higher JIT levels on hotness per se?
I think one problem here is that right now hotness is used as a proxy for profiling information in at least some JITs; once code is hot enough you figure you've got enough information about its types. Not a perfect assumption, of course.
A longer start-up time is something you want for a thing that will run continually and will almost always be guaranteed to fall into the compile path eventually.
Again, the fastest compile path relies on having run on the slower ones for a bit to gather the information needed to produce the fast code.
For example, you can run SpiderMonkey in --ion-eager mode, in which it immediately tries to compile with the last-level JIT. This produces slower code than you get if you run in the first-level JIT for a bit first...
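For instance, in the standalone SpiderMonkey shell (the script name is illustrative):

js --ion-eager game.js    # jump straight to the last-level JIT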
Having explicit types would help with this, of course.
On Mar 13, 2016, at 5:16 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
On 3/12/16 10:52 PM, Brian Barnes wrote:
What I'm trying to say is that it may well not, if it just called JS_GC and if the problem to be solved is pauses on the order of several tens to hundreds of milliseconds due to memory collection activity…
Time is not my concern. I’m trading time now for time when it’s critical.
I understand that. What I am saying is that if you call JS_GC now, that doesn't mean that a cycle collection won't happen 30 seconds from now. To help prevent that, you'd have to run cycle collection now.
In other words, JS_GC would help somewhat in your situation, but not completely. And maybe tomorrow it would help less than today. Or maybe more…
“Perfect is the enemy of the good.”
I’m asking for a specific, useful function, not the perfect solution to my problem. The perfect solution is more complex, and requires things like never allocating any other large objects, or sometimes any objects at all. This I have to do now. There is no way around it. In the future, I might not have to, and it’ll be wasted code, but now, I have to as does anybody else that wants time sensitive code.
OK, but then we run the very real risk of it not actually doing what people want. Or more precisely browsers implementing it in various different ways as they attempt to map its not-really-defined semantics onto their actual GC systems and then people sprinkling it about based on the GC heuristics and setup in one particular browser they tested with and optimized for, and then other browsers having to reverse engineer those and that one browser being locked into never changing how its GC operates.
And then we all lose. :(
That’s modern web development
No, I don't think it is. The way we prevent problems like this from arising is by actually creating standards, so that browsers know what behavior they need to have and consumers know what behavior they can depend on.
my code uses classes, and I had to fake them until FF 45 came out, and then realized they failed unless I was operating in strict mode under Chrome; Safari doesn’t have let, and nobody has a proper implementation of stereo panning in web audio :)
Yes, but these are transitional problems. I'm talking about it being a problem when current implementations constrain both each other and future evolution of themselves because someone comes to depend on undocumented details of them.
This also happens in web development, but it's something no one likes to happen....
Not asking for miracles; I’m asking for best practice on an engine.
The problems start when the best practice is different for different engines and people optimize their code based on only one engine.
That’s not our problem, and that’s not a solvable problem. Every engine will try to get ahead by some manner, and something like compiling, or GC, which isn’t spec nor should it be, will always be evolving. A System.gc() will in no way constrain further development. It’s not a guarantee to do anything but “free up objects that are no longer reachable and compact the heap.” Period. If there is no longer a heap, or no objects to free, it’s a noop.
The contract is not “make my code run faster.” It’s not “don’t do it again in a little bit.” It’s not any of those things.
I think you’re trying to force the API to be a problem solver; my problem is just a reasoning why it’s useful. Small contract: Free up objects that are no longer reachable and compact the heap, the objects being objects that aren’t attached by outside APIs (DOM, webgl, web audio, etc.)”
[>] Brian
Boris wrote:
Having explicit types would help with this, of course.
I think that’s the key. As you and Steve note, the slower paths are the ones that gather information on what the types are retained by the variables; and that helps in the compilation phase. So really, for this to be useful, you’d need types. That could also be part of the “compile” hint.
[>] Brian
On Mar 13, 2016, at 5:22 PM, Steve Fink <sphink at gmail.com> wrote:
This is a good time to bring up the other half of my original email, because a number of other people have chimed in with their experiences with GC when attempting to develop more time-critical applications without stutter.
I really don't think you want a System.gc() call for that. What if you call that after you're placed in a background tab, when you're sharing the JS heap with a latency-sensitive foreground tab? Do you really want to stutter the foreground tab for (up to) a few seconds? Probably not, in which case the name System.gc() would be a lie.
I think the closest you would want to go is to offer hints to the runtime. AppIsIdleStartingNowForAtLeast(500)? IDoNotMindIfYouDontCallMeAgainFor(500)? (units are milliseconds). The runtime can ignore it, or it can schedule an up-to-500ms incremental GC slice, or whatever. That does not negate all of the issues Boris was referring to, but IMHO it's a reasonable middle ground. We could double-check it against pending timers or whatever.
System.gc() would have a callback; it would block until you regained front status. That has some edge cases, but that’s something the programmer would have to be aware of.
// my crazy code that makes a lot of objects that fall out of scope
System.gc(startNonCrazyCode);

// my non-crazy code, which allocates next to nothing to reduce GCs to almost nil
function startNonCrazyCode(…) {
}
The second part was a hint to tell the engine to always take the most aggressive route with optimization. For instance, in Safari’s engine, as I remember, there are three levels: interpreted, a half-and-half solution, and an actual full compile of the code. This hint would say “always compile to native,” or if an engine never goes that far, always compile to the VM soup (though I suspect at this point most engines can do a native version.)
That's a nice simple mental model, but it's inaccurate in some important ways.
This would be used, again, for trading time in one place for time in another. A longer start-up time is something you want for a thing that will run continually and will almost always be guaranteed to fall into the compile path eventually. It’s not something you’d want for javascript that runs on a normal button click to post a form.
But you may need to run it in a slower mode 1 or more times in order for that native compilation to be effective. The fastest JIT levels compile the code under certain simplifying assumptions to make the generated code more efficient -- it may assume that you're not going to randomly add or delete a property from an object handled by the compiled code, for example, so it can compile in direct offsets to the object's slots. If you violate one of those assumptions, the compiled code will need to be discarded and re-generated with a new weaker set of assumptions.
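To make that concrete, here is a sketch of the kind of assumption involved (conceptual only -- the tiering itself isn't observable from script, and engines differ):

function getX(p) { return p.x; }

const pts = [];
for (let i = 0; i < 100000; i++) pts.push({x: i});
for (const p of pts) getX(p);   // hot, and every p has the same shape; a JIT
                                // may compile getX with a fixed offset for .x
getX({x: 1, extra: true});      // a different shape: the optimized code needs
                                // a guard, a deopt, or weaker assumptions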
So you don't want to go straight to the highest optimization level, as it might then not optimize as highly. You need to first run in a slower mode, perhaps even one that pays some overhead in order to gather information about the types and control paths actually used. And that's the next gotcha -- it's reasonable to skip collecting that info on the first time (or few times?) you run the code, because the majority of code is only run once so any gathered profiling info is useless.
Which is not to say that there's nothing useful you can tell the system. If you hinted to it that something was going to be important and run frequently, then it could choose to gather profiling information earlier and additionally lower the threshold for jumping up optimization levels. As with the GC case, though, you do not want to tell the VM exactly what to do. The best approach might be something like: spawn off a background compile, and in the meantime interpret the code without gathering profiling info, then switch to the compiled profiling code when it's ready, and then do an optimized compile based on that information as soon as you've observed "enough". The user code has no way to know how long that background compilation will take, whether there are spare resources to do it in the background, etc. And it varies by platform. So at best, I think you can drop hints like "this code is going to run a lot, and I'm pretty sure its runtime matters to me." In practice, people will end up cutting & pasting the magic bits that make things fast from stackoverflow and misapplying them all of the place, to the extent that engines might end up just completely ignoring them, but it may also turn out that they're a good enough signal that engines will pay attention. I don't know; I can't predict.
Being compiled gives you another benefit. Say you have this class:
class …. {
    test() {
        let x,y,z;
        …
        ….
        return(x);
    }
}
The locals y and z never move beyond the function scope; in this manner they could be stack-based variables that never touch the heap (yes, I know that’s probably a very difficult implementation.) Or even variables that fall into a special section of the heap that only deals with locals that never escape the function and are always cleaned up automatically on function exit (basically, reference counting where you know the reference count is always 0.)

I'm a little confused about variables vs values here, but simply storing a value into a local does not guarantee that the value is only referenced by that variable. You can sometimes do an escape analysis to figure this sort of thing out, and I believe many engines do this or are working on doing this. But either way, requesting that something be compiled is absolutely the wrong signal -- analysis is separate from compilation, and in fact information necessary for an analysis is less likely to be collected at the highest optimization levels. (It would either be figured out when initially compiling to bytecode or whatever, or dynamically gathered during early or mid-tier optimization levels.)
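For what it's worth, a minimal sketch of the distinction escape analysis draws (whether an engine actually stack-allocates here is internal and unobservable):

let saved;

function sum(n) {
    let acc = {total: 0};       // acc never escapes sum(); an engine's escape
                                // analysis could keep it off the GC heap
    for (let i = 0; i < n; i++) acc.total += i;
    let log = {n: n};
    saved = log;                // log escapes via the outer variable, so it
                                // must survive the call on the heap
    return acc.total;
}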
This will basically reduce the number of GCs by a good bit. NOW, you’d also have to track any additional function calls that use these variables (to make sure they aren’t saved out to a global.) So everything underneath would be a force compile for this to work. Engines might already do this or stuff like it, and if so, great!
They do stuff like this, but it's independent of (and complicated by) compilation.
This is probably something that is much more complicated to implement than most of us realize, but it’s a thought. Just being able to say “this class is always compiled” would be nice.
I don't think so. It's not even really the right level of granularity. Some class methods should be compiled while others are interpreted, for the lowest overall time.
The second part of this is native primitive types; having int/etc means they can be passed by value which means these checks are easier, but that’s probably something others have argued back and forth for a long time :)
The dynamic profiling can figure out whether in practice particular values are always int/etc while certain code executes, and compile with that assumption. Declaring types can give that a head start, but it'll still need to be double-checked when not easily statically provable, and may end up just wasting time prematurely compiling code with incorrect assumptions. JS right now is simply too dynamic to prevent all possibility of strange things happening to seemingly simple values. Besides, just observing the types seems to work pretty well in practice. The main benefit of type declarations would be in preventing errors via type checking, IMO.
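A small sketch of the kind of observation that feedback relies on (conceptual; none of this is visible to the script):

function add(a, b) { return a + b; }

add(1, 2);        // observed as int + int; a JIT may specialize on that
add(1.5, 2.5);    // widens the observed types to doubles
add("a", "b");    // now strings too -- any number-specialized code is
                  // invalidated and recompiled with weaker assumptions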
Right, I understand all that, and to me, that would be part of compilation. If a lower-level pass through, or a number of them, is required, then that will have to be done (before types.) If there are levels before full compilation that can be skipped, then that’s all this hint would ask for.

I have to apologize, because I think people keep thinking I’m asking for something that solves a specific problem in a specific way; what I’m asking for is something that is contractually simpler … “X always gets the most aggressive compilation”. If that takes multiple slow passes, then that’s fine. It’s not “NO slow passes,” it’s “always strive to maximize speed over start-up time.”
I can tell you in my case (www.klinksoftware.com/ws) there isn’t anything that won’t benefit from every class (and it’s all classes) being compiled.
[>] Brian
On 3/13/16 5:30 PM, Brian Barnes wrote:
“Perfect is the enemy of the good.”
This is true, but...
I’m asking for a specific, useful function, not the perfect solution to my problem.
The problem is that the function you're asking for can be really really bad in some situations. So if we introduce it without actually defining its behavior we run the very real risk of people using it in ways that work in some current browsers and thus constraining the future development of those browsers and breaking other current browsers.
Note that I'm quite interested in helping solve the actual problem you're having. I just don't think this is the right solution for it. Some of the suggestions Steve Fink made are much better, in my opinion.
The problems start when the best practice is different for different engines and people optimize their code based on only one engine.
That’s not our problem
It's my problem as an engine developer.
and that’s not a solvable problem.
It's not a solvable problem in the abstract, sure. But it's a problem that we can make worse or better (e.g. by not explicitly introducing APIs that make the problem worse, or by introducing ones that make it smaller).
Now I will claim that the perfect is the enemy of the good. ;)
Every engine will try to get ahead in some manner, and something like compiling, or GC, which isn’t in the spec (nor should it be), will always be evolving.
Yes! And we don't want to prevent that.
A System.gc() will in no way constrain further development.
I disagree most vehemently.
It’s not a guarantee to do anything but “free up objects that are no longer reachable and compact the heap.”
Except you want to use it as a guarantee of "no more GC pauses after this point". That's the use case it's being proposed for!
The contract is not “make my code run faster.” It’s not “don’t do it again in a little bit.” It’s not any of those things.
People will 100% surely assume that's the contract. Especially if it happens to be true in one particular case they use it in, in the one browser they test in.
And then they will create things that depend on that behavior, and it will in fact become a de-facto contract. I've seen this play out several times now.
I think you’re trying to force the API to be a problem solver
No, I think the right approach for API design is to figure out what problems you're trying to solve and then create APIs that ideally solve those and don't introduce new problems.
This is as opposed to coming up with an API and then trying to argue that it solves some specific problems therefore is the right API to have. There may be better solutions for those problems!
Small contract: “Free up objects that are no longer reachable and compact the heap,” the objects being ones that aren’t attached to outside APIs (DOM, WebGL, Web Audio, etc.)
That's not a useful contract, because no one actually cares in practice about whether this has happened or not. The real use cases mentioned so far are all about GC pauses, not internal details like whether things have been freed up or whatnot (which aren't even observable!).
On Sun, Mar 13, 2016 at 10:18 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
Adding some more flexible ways to deal with GC'ing beyond "just making it better" would be highly welcome.
Yes, this I agree on.
Maybe some kind of API to 1) inform the GC about what strategy of GC'ing you would prefer and 2) indicate to the GC when's a good time to fulfill that strategy. For instance, for realtime rendering you'll want to spend say a maximum of 4ms/frame on GC'ing, you'd indicate gc.setStrategy('realtime'); and then at the end of a frame gc.now(4);
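As a sketch, with the caveat that gc.setStrategy and gc.now are made-up names for the idea above, not an existing interface (updateWorld/drawWorld stand in for app code):

function updateWorld() { /* game logic */ }
function drawWorld() { /* rendering */ }

gc.setStrategy('realtime');        // "prefer many small pauses"

function frame() {
    updateWorld();
    drawWorld();
    gc.now(4);                     // "you may spend up to 4ms on GC right now"
    requestAnimationFrame(frame);
}
requestAnimationFrame(frame);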
Boris, I think we’re going in a couple of circles here, so let’s not waste any more time (though it’s a fascinating discussion.) I think your complaints come down to:
- Potential mis-use of API could make things slower
- People assuming things could freeze behavior
- People will assume behavior outside of the spec
- What does GC mean? People will assume everything.
I can understand your position, though I disagree; but that’s more philosophical, and not something we can probably debate down. And I apologize if I misrepresent your argument here, but that’s the gist I get.
So let’s move it a little. How about this concept:
A call that hints that the next time control is released from a script, the engine should trigger whatever kind of GC, by whatever method, it would have triggered around this time anyway. So:
// lots of code
System.stageMaybeGC();
setTimeout(blech, X); // not necessarily this, just anything that will release control

function blech() {} // maybe you got a GC
Now, it might be a minor GC, but I can stage these after every complex operation that makes a lot of objects. At this point, I’m not interrupting or pausing or doing GCs the system wouldn’t do in its normal operation. The system can also say “hey, a GC here would be bad, I’m not going to do it” and skip it. Or the system can go, “hey, I do have a lot of stuff, and the programmer knows this is a good time.”
Control would be required to be released. It’s a maybe, and whatever it does is what the engine decides to do. And you can’t misuse it, because it forces you to release control, so nobody can use it in a tight loop (well, they could, but it would be super, duper dumb.)
Thoughts?
[>] Brian
That sounds like a good idea, being able to hint to the browser when is a good point to GC, and also when to avoid GC if possible.
Idea: Maybe we have an animation that lasts 500 ms. Before the animation, we can call System.preventGC(), and the browser can prevent GC for as long as possible, but will still do it if absolutely necessary (i.e. if memory needs to be freed to allocate more). Then, after that animation finishes, a call to System.attemptGC() could signal the browser to go ahead if it hasn't already. This ability to hint could be very handy. It's like requestAnimationFrame: it is used to hint at the browser that some code should be run between frames, but it doesn't guarantee that the code won't cause jitter if the code exceeds time limits, it's just a hint. We could use some way to hint at GC too.
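A sketch of how those two hypothetical hints might bracket an animation (System.preventGC and System.attemptGC are the made-up names from the idea above; box is an assumed element):

System.preventGC();                            // a 500ms animation starts now
box.animate([{opacity: 0}, {opacity: 1}], 500);
setTimeout(() => System.attemptGC(), 500);     // animation over; GC welcome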
Another idea: what about a directive like "use scope-gc" that, when placed in a function, GCs the function's scope as soon as that scope is gone (as soon as the function has returned)? That would guarantee that GC happens after functions complete, on a per-function basis if included in every function, and it would encourage re-use of allocated things like Florian mentioned for things like animation. Also, the function's return value wouldn't be returned until GC of the function's scope is complete. This would allow GC to be timed inside animation loops with performance.now().

It would be a whole new GC implementation, used only in functions that have the directive. It's like telling the browser "hey, I'm done with this scope, GC it right now before returning the function's return value". This would mean that the style of programming used in a function would be directly related to the amount of GC time for that function. I don't know the implementation details of current GC, but I can completely imagine that it's possible, and that it might be useful due to the fact that GC is guaranteed and we know exactly when it happens (at the end of each function execution). The overall effect, if "use scope-gc" is used in every single function of an app, would be that GC is spread evenly across all function calls, instead of happening at arbitrary times between many function calls.
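A sketch of what that hypothetical directive could look like in use (today the string would just be an inert expression statement):

function tallyScores(players) {
    "use scope-gc";                            // hypothetical directive
    const scores = players.map(p => p.score);  // temporaries created here...
    return scores.reduce((a, b) => a + b, 0);  // ...would be collected before
}                                              // the return value is handed back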
Maybe a combination of hinting APIs, and that scope-gc directive, could lead to better overall control for important things like animation. For other things (where the user doesn't notice) it doesn't really matter so much. Something so simple as "jank" is what causes a user to dislike an app compared to another without jank, and hence the "native" vs "web" (with "native" winning) discussions.
TLDR: If there's absolutely anything that we can do to improve animation performance, please, let's do it!
On Mar 13, 2016, at 11:21 PM, /#!/JoePea <joe at trusktr.io> wrote:
That sounds like a good idea, being able to hint to the browser when is a good point to GC, and also when to avoid GC if possible.
Idea: Maybe we have an animation that lasts 500 ms. Before the animation, we can call System.preventGC(), and the browser can prevent GC for as long as possible, but will still do it if absolutely necessary (i.e. if memory needs to be freed to allocate more). Then, after that animation finishes, a call to System.attemptGC() could signal the browser to go ahead if it hasn't already. This ability to hint could be very handy. It's like requestAnimationFrame: it is used to hint at the browser that some code should be run between frames, but it doesn't guarantee that the code won't cause jitter if the code exceeds time limits, it's just a hint. We could use some way to hint at GC too.
I think if this problem is at all solvable, it’s going to require a lot of methods, to cover a lot of different situations, and each one should do something minor, so as not to radically change the engine’s ability to manage its own memory.

I like the above, but I'm worried that things could return without calling attemptGC(); maybe something scoped to a function? Or maybe requestAnimationFrame itself could have some kind of flag that says “try not to GC unless you really have to.” Regardless, something like this is needed too.
The ultimate solution — because it’s a GC system — is to not allocate anything. That makes messy, spaghetti code, though.
[>] Brian
but I'm worried that things could return without calling attemptGC(),
That's what the "use scope-gc" directive would guarantee: it would be like calling attemptGC() (but forcing GC) at the end of each function.
That makes messy, spaghetti code, though.
Ugly indeed, like using globals.
On Sat, Mar 12, 2016 at 6:27 AM, Brian Barnes <ggadwa at charter.net> wrote:
Request 1: Add a System module, and one call I need System.gc();
I do not think it is wise to add such calls, at least not the same way they are implemented in the shell.

Currently the shells have a global gc function, which blocks execution. Blocking execution means that your JavaScript might prevent the browser from rendering the next frame. So if such a global function is added, it would at least have to be asynchronous.

As you mention, this can be misused; a simple example would be:
for (var x of array) System.gc();
I think such a function would work better as a hint for triggering GCs.

I want to note that currently, GCs are triggered based on allocations; this gc() function highlights one kind of information which is usually available in static languages. In static languages, we have a "new" and a "delete" function, so maybe it would make sense to have a similar function as a hint for potentially unused memory.
System.unused(obj);
obj = null;
In which case, this would be a good hint, as the engine would now have the potential for making estimates on the amount of memory which can be reclaimed, and thus deciding whether or not to compact the heap.
I also think that this could be used as an assertion for developers, in which case the engine can later warn the developer that objects which were supposed to be unused are still alive when the GC is executed.
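Putting both uses together, a sketch (System.unused is the hypothetical hint; buildLevel/playLevel stand in for application code):

function buildLevel() { return {geometry: new Array(1000).fill(0)}; }
function playLevel(level) { /* run the level */ }

let level = buildLevel();
playLevel(level);
System.unused(level);    // hint: "I believe this is about to be garbage"
level = null;            // actually drop the reference
// if a later GC still finds the object reachable, the engine could warn,
// per the assertion idea above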
The more we discuss this, the more I think this problem isn't solvable without something radical that makes Javascript more C like. Which, I think, is probably some of the reason for asm.js.
The problem: People want to create realtime games, VR, and animations without stutter. You can get away with this by pre-allocating everything into globals (I do a lot of that and get solid, predictable frame rates: www.klinksoftware.com/ws). GC engines just aren't the place for that.
A real multi-part solution would be:
- Introduce types. Make them pass by value, and "set by value". This allows local variables to be on the stack, and allows for faster compilation steps, as you don't have to run functions to analyze the types.
foo(x) {
    int y;               // local, on stack
    foo2(y);             // the y inside of foo2 is a local in foo2, passed by value
    globalY=y;           // this is also copying the value, y is still local
    y=0.0;               // probably an error, should be forced to convert
    y=Math.trunc(0.0);   // not an error
    return(y);           // also a copying of the value
                         // y is popped from the stack
}
This isn't new, it's how C obviously does it.
- Introduce a new mode that changes the behavior of new in scripts to require a free, say "use free". Heaps are compacted (if required) on new, so realtime code can avoid that by pre-allocating.
{
    let x=new myClass(y);
    free(x);
}
This would probably require a separate heap, away from the GC heap.
I'm just thinking that any time spent on a solution that tries to put some kind of control over GC is destined to meet resistance because of the variety of GC schemes and, worse, to never actually solve the problem, just move it around.
Yes, this is something really radical that would be years in showing up in any browser. I think for now I'll just have to stick to a lot of pre-allocated globals.
[>] Brian
I think introducing such manual memory collection would be much harder than standardizing some kind of foreign function interface and just connecting JavaScript to a language that is better at dealing with the specific task (e.g. WebAssembly).
2016-03-14 14:35 GMT+01:00 Brian Barnes <ggadwa at charter.net>:
On 3/13/16 10:43 PM, Brian Barnes wrote:
- Potential mis-use of API could make things slower
- People assuming things could freeze behavior
- People will assume behavior outside of the spec
- What does GC mean? People will assume everything.
I think #1 there is a mechanism of #2, in that freezing behavior will happen because what used to be good use of the API would become slow misuse and browsers would therefore not make the corresponding change.
But yes, this seems like a reasonable summary.
A call that hints that the next time control is released from a script, the engine should trigger whatever kind of GC, by whatever method, it would have triggered around this time anyway.
I think it's a good idea to not imply this API is specifically about GC. I would much prefer we make it about what we actually care about: telling the engine that you have time right now for it to do some work, but won't have time later. This captures the semantics we really care about, right?
Apart from that nit (which is basically about naming), this seems reasonable to me as a thing to have.
I think that's a good, workable solution, and would help with my specific problem.
[>] Brian
On 03/13/2016 02:50 PM, Brian Barnes wrote:
On Mar 13, 2016, at 5:22 PM, Steve Fink <sphink at gmail.com <mailto:sphink at gmail.com>> wrote:
This is a good time to bring up the other half of my original email because a number of other people have chimed in with their experiences with GC when attempting to develop more time critical applications without stutter.
I really don't think you want a System.gc() call for that. What if you call that after you're placed in a background tab, when you're sharing the JS heap with a latency-sensitive foreground tab? Do you really want to stutter the foreground tab for (up to) a few seconds? Probably not, in which case the name System.gc() would be a lie.
I think the closest you would want to go is to offer hints to the runtime. AppIsIdleStartingNowForAtLeast(500)? IDoNotMindIfYouDontCallMeAgainFor(500)? (units are milliseconds). The runtime can ignore it, or it can schedule an up-to-500ms incremental GC slice, or whatever. That does not negate all of the issues Boris was referring to, but IMHO it's a reasonable middle ground. We could double-check it against pending timers or whatever.
System.gc() would have a callback; it would block until you regained front status. That has some edge cases, but that’s something the programmer would have to be aware of.
That was one example off the top of my head, which as you say can be resolved if specifically addressed, and it still spawns edge cases. There are sure to be many other problematic cases, and if you don't handle all of them and their edge cases, then by implementing this we are likely to paint ourselves into a corner -- you use it, everything works fine until some other engine or user code optimization exposes edge case X, but we can't fix that without breaking other users.
Not to mention that your specific workaround makes GC observable, which is an information leak that makes it possible to depend on GC scheduling, which means making it possible to prevent engines' GC behavior from changing.
In a specific embedding, what you're asking for is reasonable. I would even go so far as saying that it might be good to reserve some chunk of the namespace for hints or whatever (though perhaps Symbols make that unnecessary.) If you are in an embedding that ships with a specific version of a JS engine and doesn't need to share anything with other things running JS, then it's fine to give user control over GC scheduling, manipulate bare pointers, suppress type guards and operate blindly on known types, generate machine code and run it, or whatever else your application desires. But you're not going to get stuff like that added to the spec for a language that needs to work in environments with evolving JS engines running frozen shipped JS code or hostile code sharing the same VM.
For all I know, ES may start carving out parts of the spec that only apply to "privileged contexts". I haven't been following it, but I could imagine such a thing might be needed for SharedArrayBuffers and related things. Heck, browsers have fullscreen modes, where I am free to mock up your bank site's login page. But privileged stuff will need very careful handling to avoid hobbling ourselves with long-term evolution/compatibility hazards and security problems.
The second part of this is native primitive types; having int/etc means they can be passed by value which means these checks are easier, but that’s probably something others have argued back and forth for a long time :)
The dynamic profiling can figure out whether in practice particular values are always int/etc while certain code executes, and compile with that assumption. Declaring types can give that a head start, but it'll still need to be double-checked when not easily statically provable, and may end up just wasting time prematurely compiling code with incorrect assumptions. JS right now is simply too dynamic to prevent all possibility of strange things happening to seemingly simple values. Besides, just observing the types seems to work pretty well in practice. The main benefit of type declarations would be in preventing errors via type checking, IMO.
Right, I understand all that, and to me, that would be part of compilation. If a lower-level pass through, or a number of them, is required, then that will have to be done (before types.) If there are levels before full compilation that can be skipped, then that’s all this hint would ask for.

I have to apologize, because I think people keep thinking I’m asking for something that solves a specific problem in a specific way; what I’m asking for is something that is contractually simpler … “X always gets the most aggressive compilation”. If that takes multiple slow passes, then that’s fine. It’s not “NO slow passes,” it’s “always strive to maximize speed over start-up time.”
Well, "the most aggressive compilation" is undecidable in general, since you can't predict the future to know if some future execution won't invalidate this or that constraint. But sure, I think we're basically agreeing, except that I insist it'll only work if you phrase things as hints about things you know about your code (such as that startup is now done and the honeymoon period for latency is over), not as directives to do this or that. Engines have to be free to ignore anything you say, and to choose the right thing to do in response to a hint (which may be very different from what you'd guess from the outside), and even then we can shoot ourselves in the foot if anyone starts depending on things.
I can tell you in my case (www.klinksoftware.com/ws) there isn’t anything that won’t benefit from every class (and it’s all classes) being compiled.
I have not looked at your case, so I can't speak to that. But I can tell you that most of the time, when I dig into a particular chunk of JS code, it ends up executing very differently from how you'd expect. "Compiling everything" is rarely the globally optimal solution, although in your case the downsides may be irrelevant to you. (Slower startup, higher memory usage.)
On Mar 14, 2016, at 12:52 PM, Steve Fink <sphink at gmail.com> wrote:
On 03/14/2016 06:35 AM, Brian Barnes wrote:
The more we discuss this, the more I think this problem isn't solvable without something radical that makes Javascript more C like. Which, I think, is probably some of the reason for asm.js.
The problem: People want to create realtime games, VR, and animations without stutter. You can get away with this by pre-allocating everything into globals (I do a lot of that and get solid, predictable frame rates: www.klinksoftware.com/ws). GC engines just aren't the place for that.
A real multi-part solution would be:
- Introduce types. Make them pass by value, and "set by value". This allows local variables to be on the stack, and allows for faster compilation steps, as you don't have to run functions to analyze the types.
foo(x) {
    int y;               // local, on stack
    foo2(y);             // the y inside of foo2 is a local in foo2, passed by value
    globalY=y;           // this is also copying the value, y is still local
    y=0.0;               // probably an error, should be forced to convert
    y=Math.trunc(0.0);   // not an error
    return(y);           // also a copying of the value
                         // y is popped from the stack
}
This isn't new, it's how C obviously does it.
Your example is of primitive types, so I will only address that. You are asserting that this would help performance. I assert that the existing dynamic type-discovery mechanisms get you almost all of the performance benefit already (and note that I'm talking about overall performance -- code that runs 1 million times matters far, far more than code that runs a handful of times during startup.) And they work for more situations, since they capture types of unnamed temporaries and cases where the type is not statically fixed but during actual execution either never changes or stops changing after a startup period. And they do not require the programmer to get things right.
Do you have evidence for your assertion? (I haven't provided evidence for mine, but you're the one proposing a change.)
I didn’t assert that it would help performance; I said it would reduce GC load, because primitives could be put on the stack, and popped off, and never hit the heap, and never be considered in the GC. Fewer objects in the GC could even eliminate GC entirely if you pre-allocate and just have local primitives, because you never create an object yourself.
If there is no GC, performance will improve, that’s just logical. Performance was never the concern, pauses due to GC were.
- Introduce a new mode that changes the behavior of new in scripts to require a free, say "use free". Heaps are compacted (if required) on new, so realtime code can avoid that by pre-allocating.
{
    let x=new myClass(y);
    free(x);
}
This would probably require a separate heap, away from the GC heap.
I'm just thinking that any time spent on a solution that tries to put some kind of control over GC is destined to meet resistance because of the variety of GC schemes and, worse, to never actually solve the problem, just move it around.
Please don't ignore resistance just because it's getting in the way of what you want. Your needs are real, it's just that there's a huge constellation of issues that are not immediately obvious in the context of individual problems. Language design is all about threading the needle, finding ways to address at least some of the issues without suffering from longer term side effects.
Oh, I’m not; I’m just accepting reality because of the concerns, which is why I’m spitballing other ideas. I completely understand the position of others, and I'm trying to think about other solutions.
Yes, this is something really radical that would be years in showing up in any browser. I think for now I'll just have to stick to a lot of pre-allocated globals.
As asm.js shows, you have this already. Your heap is an ArrayBuffer, and cannot point to GC things so it will be ignored by the GC. You "just" have to manage the space manually. (And you can emscripten-compile a malloc library to help out, if you like.)
Typed Objects should go a long way towards making this nicer to use from hand-written JS.
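For illustration, a minimal bump allocator over a linear heap in that spirit (names and sizes are arbitrary):

const heap = new Float64Array(new ArrayBuffer(1 << 20));   // 1MB: 128K doubles
let top = 0;

function alloc(nDoubles) {        // returns an index into heap, not an object
    const at = top;
    top += nDoubles;
    if (top > heap.length) throw new Error("out of linear memory");
    return at;
}

function reset() { top = 0; }     // "free" everything at a safe point

const v = alloc(3);               // a 3-double vector the GC never sees
heap[v] = 1; heap[v + 1] = 2; heap[v + 2] = 3;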
I do that now. Everything except locals is global within my classes, so I never allocate anything other than numbers or strings; eventually those will get me, too, but they are quick GC objects.
I keep posting this because I want people to take a look :) www.klinksoftware.com/ws You can see what I did and get the code from GitHub itself. I actually don’t have any GCs in there that stop anything, but I know the trouble I went through.
I’d rather not use asm.js, which isn't hand-written code. I, and I suspect a lot of others, really want to do more real-time code. It’s just that the GC gets in the way, sometimes. What I’m doing is more of an experiment to see how far I can push JS. Pretty far, actually.
[>] Brian
On Mar 14, 2016, at 12:28 PM, Steve Fink <sphink at gmail.com> wrote:
Well, "the most aggressive compilation" is undecidable in general, since you can't predict the future to know if some future execution won't invalidate this or that constraint. But sure, I think we're basically agreeing, except that I insist it'll only work if you phrase things as hints about things you know about your code (such as that startup is now done and the honeymoon period for latency is over), not as directives to do this or that. Engines have to be free to ignore anything you say, and to choose the right thing to do in response to a hint (which may be very different from what you'd guess from the outside), and even then we can shoot ourselves in the foot if anyone starts depending on things.
Let me apologize: I am not exactly the clearest at explaining myself sometimes. I never expect anything I suggest to always work, or to mean the same thing day to day, as engines evolve; so “hints” is a much better term, and one I’m perfectly fine with.
I do know that, obviously, I’m attempting to push a language never meant to be one way towards another language, C. There’s some naivety in this.
[>] Brian
For the last couple months I’ve been working on a “how doable is it” project — it’s a complete 3D shooter in Javascript — but where every piece of it — every pixel in every texture, every tangent space, every polygon, and even every sound — is dynamically generated.
I have an early alpha. You need a very modern browser to run it (FF, Chrome, Safari do it best.) It takes a while to start up (it’s building light maps, mostly) but it should run well. You can see it here, all the code is up on Github.
www.klinksoftware.com/ws
Take a look, because it demonstrates my requests. I know my requests are very programmer-y, but I thought I’d throw them out there.
Request 1: Add a System module, and one call I need System.gc();
Why: My game works in two parts; one part creates all the content, and then the second part plays it. In the second part, I do as much as I can to precreate/re-use any objects or arrays I need, to reduce GC. On the creation part, I don’t as much. So being able to force a GC at a point helps me, because it stops the creation part from forcing a pause in the game part.
Cons: Could be mis-used. GC would have to pause execution. Need modules to be implemented.
Request 2: Add hints to classes.
Why: Modern Javascript engines usually use different strategies for running code depending on certain events — for instance, function x has been run too many times. I’d like a hint to say “always use the best method.” Or at the module level, though that might already happen.
Cons: Again, mis-use, and really technical.
Request 3: Static consts in classes
Why/Cons: I’m sure this has already been discussed :)
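For the record, a sketch of what the request might look like next to a workaround that exists today (the static const line is hypothetical syntax):

class Shader {
    // static const MAX_LIGHTS = 4;           // the requested form (not valid today)
    static get MAX_LIGHTS() { return 4; }     // an ES2015 stand-in
}
console.log(Shader.MAX_LIGHTS);               // 4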
I’m pretty happy with what I’ve been able to accomplish, so I don’t have many complaints, other than the major browser makers need to implement modules.
[>] Brian