Early error vs. error on first call to function vs. runtime error
As a user, not implementer, I really want early errors. Perf costs of startup are negligible, especially long-term. By the time ES6 is in browsers, computers and phones should be faster by enough of a factor to mitigate any costs, whereas omitting early errors hurts developers indefinitely into the future.
Domenic Denicola wrote:
As a user, not implementer, I really want early errors. Perf costs of startup are negligible, especially long-term. By the time ES6 is in browsers, computers and phones should be faster by enough of a factor to mitigate any costs, whereas omitting early errors hurts developers indefinitely into the future.
Totally agree!
I'm generally against error on first call -- in the general case if you're able to determine a function should fail on first execution you can determine that it could fail during parsing and semantic analysis.
Oliver Hunt wrote:
I'm generally against error on first call -- in the general case if you're able to determine a function should fail on first execution you can determine that it could fail during parsing and semantic analysis.
Ok, to play fair I should ask how you feel about any analysis that is not "cheap enough" to do during a parse or lex/read pass required to lazily compile functions on first call. What about binding analysis, recognizing every statically resolvable use of a definition, possibly making free variable uses early errors when inside module {...}?
On Sep 27, 2012, at 10:54 PM, Brendan Eich <brendan at mozilla.org> wrote:
Oliver Hunt wrote:
I'm generally against error on first call -- in the general case if you're able to determine a function should fail on first execution you can determine that it could fail during parsing and semantic analysis.
Ok, to play fair I should ask how you feel about any analysis that is not "cheap enough" to do during a parse or lex/read pass required to lazily compile functions on first call. What about binding analysis, recognizing every statically resolvable use of a definition, possibly making free variable uses early errors when inside module {...}?
I am still not sure whether modules are expected to have a containing scope, but either way noticing free var references during parsing is not a problem (why would it be?)
The reason I didn't say "all cases" is because resolution of an unbound global property obviously can't be determined during parsing; it also doesn't make sense to fail before the access (ignoring entirely here the backwards compat issues that obviate this problem).
I'm not sure what classes of error would warrant exception on initial execution vs. just failing on sema -- could you give an example?
Oliver Hunt wrote:
On Sep 27, 2012, at 10:54 PM, Brendan Eich<brendan at mozilla.org> wrote:
Oliver Hunt wrote:
I'm generally against error on first call -- in the general case if you're able to determine a function should fail on first execution you can determine that it could fail during parsing and semantic analysis.
Ok, to play fair I should ask how you feel about any analysis that is not "cheap enough" to do during a parse or lex/read pass required to lazily compile functions on first call. What about binding analysis, recognizing every statically resolvable use of a definition, possibly making free variable uses early errors when inside module {...}?
I am still not sure whether modules are expected to have a containing scope,
Yes, the global object and (we agreed last week) its lexical contour for let/const/class/module bindings.
Note also that the design wants to allow modules to nest in top level blocks, so outer blocks' let/const/function bindings could be in a module's scope.
Then the solution (from the 1JS era early this year) for free variable errors would make any properties of the global object that are already bound when the module is parsed be bound in the module's scope. So you wouldn't get a free variable error for 'alert', or the many other DOM bindings and global scope pollutions.
but either way noticing free var references during parsing is not a problem (why would it be?)
Because it requires use/def analysis, let's call it "definite binding analysis". Just to parse JS, one need not do any such analysis.
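(For concreteness, a toy sketch of what such an analysis adds beyond parsing, over a made-up AST shape -- not any engine's real IR:)

function findFreeVars(node, bound) {
  // Walk the tree, tracking which names are bound by enclosing scopes,
  // and collect every identifier use that resolves to no declaration.
  bound = bound || new Set();
  switch (node.type) {
    case "Block": {
      var inner = new Set(bound);
      node.decls.forEach(function (name) { inner.add(name); });
      return node.uses.reduce(function (free, use) {
        return free.concat(findFreeVars(use, inner));
      }, []);
    }
    case "Use":
      return bound.has(node.name) ? [] : [node.name];
  }
  return [];
}

// { let alert; alert(); alret(); } -- the typo shows up as a free variable:
findFreeVars({
  type: "Block",
  decls: ["alert"],
  uses: [{ type: "Use", name: "alert" }, { type: "Use", name: "alret" }],
}); // -> ["alret"]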
The reason I didn't say "all cases" is because resolution of an unbound global property obviously can't be determined during parsing; it also doesn't make sense to fail before the access (ignoring entirely here the backwards compat issues that obviate this problem).
In a module {...} there wouldn't be any such backward compatibility problem.
It would make sense to fail early if the cause of the free variable not matching anything in the global object at the time the module is parsed is that you've typo'ed the name! This is quite common.
You're right it could break extant code that migrates into a module. One can make a global by assignment in non-strict code, and this could be hidden from any analysis.
There's a trade-off, but catching free variable typos seems important.
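(To illustrate with the 2012-era module strawman syntax -- hypothetical, since no shipping engine accepts it:)

module M {
  alert("hi");   // fine: 'alert' is a global property when M is parsed
  alret("oops"); // typo: resolves to no lexical binding or global
                 // property, so it could be an early SyntaxError here
                 // instead of a ReferenceError whenever this line runs
}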
I'm not sure what classes of error would warrant exception on initial execution vs. just failing on sema -- could you give an example?
ES5 defines a number of normative early errors in Clause 16. We've just been discussing duplicate formal parameters in a strict function's head, e.g.
function g(a, a) { "use strict"; }
On Sep 27, 2012, at 10:54 PM, Brendan Eich wrote:
Oliver Hunt wrote:
I'm generally against error on first call -- in the general case if you're able to determine a function should fail on first execution you can determine that it could fail during parsing and semantic analysis.
Ok, to play fair I should ask how you feel about any analysis that is not "cheap enough" to do during a parse or lex/read pass required to lazily compile functions on first call. What about binding analysis, recognizing every statically resolvable use of a definition, possibly making free variable uses early errors when inside module {...}?
In particular, this general concern was raised from the perspective of an implementation that is trying to absolutely minimize semantic analysis during the initial parse phase. Presumably it does not generate an AST at that time, which makes it difficult to do any static analysis that isn't extremely local. Also, the concern was not specifically about this set of argument rules but was more general, about the growing set of early errors that we currently have in the ES6 spec. Individually, most early errors are probably not particularly burdensome to such an implementation. But collectively, they may well be.
Implementation diversity is a good thing and we should be careful in our specifications to not unnecessarily limit such diversity. Prior to ES5 there were very few "early errors" (the terminology wasn't even formalized) and early reporting of them was optional. ES5 added quite a few new ones (particularly related to strict mode) and made early reporting mandatory. ES6 is on course to add many new early errors. Before we go too far with this, it seems reasonable to stop and consider the possible impact upon implementation diversity of requiring early reporting of a large number of error conditions.
I propose we call this class of errors "deferred early errors". They are statically detectable errors that could be reported at "load time". What we are discussing is deferring their reporting (and implicitly their detection) to a point where it is definitively known that the erroneous code actually is used by the running program. To address Domenic's concern, a "deferred early error" is still required to be detectable solely from inspecting a source file prior to execution. If we have them, it would seem to be a best practice for any developer to run an ahead-of-time linter over all programs before deployment. That doesn't seem like a big burden, and it is something that could easily be integrated into tool chains and development tools.
I see a potential benefit to implementors in the "deferred early error" concept and there are implementation alternatives that would be lost without it. I see a minor negative impact to developers. They would need to take an extra pre-deployment action in order to be sure that an ES source file does not violate any ES static semantic restrictions. To me this seems like a plausible cost/benefit trade-off.
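(A minimal sketch of that pre-deployment check, assuming a Node-style environment; the helper name is made up:)

var fs = require("fs");

function checkStatically(path) {
  var source = fs.readFileSync(path, "utf8");
  try {
    // The Function constructor parses its body eagerly but never runs it,
    // so any statically detectable SyntaxError surfaces right here.
    new Function(source);
    return null;
  } catch (e) {
    return e;
  }
}

var err = checkStatically("app.js");
if (err) console.error("static error caught before deployment:", err.message);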
Allen Wirfs-Brock wrote:
I see a potential benefit to implementors in the "deferred early error" concept and there are implementation alternatives that would be lost without it. I see a minor negative impact to developers. They would need to take an extra pre-deployment action in order to be sure that an ES source file does not violate any ES static semantic restrictions. To me this seems like a plausible cost/benefit trade-off.
I don't think this holds up. Modern JS wraps most top level code in
(function () { ... })();
or similar. That makes a bunch of ES6 "static" errors runtime errors.
Anyway, ES5 did what it did. That may be a place to stop adding early errors, but agreeing on this point does not automatically include the idea of making functions containing tardy static errors into invocation-fused explosive devices!
We should consider all the ES6 would-be/maybe-static errors, as decided at last week's meeting, and evaluate them one by one. Some may have no high analysis overhead.
On Sep 28, 2012, at 9:54 AM, Brendan Eich wrote:
Allen Wirfs-Brock wrote:
I see a potential benefit to implementors in the "deferred early error" concept and there are implementation alternatives that would be lost without it. I see a minor negative impact to developers. They would need to take an extra pre-deployment action in order to be sure that an ES source file does not violate any ES static semantic restrictions. To me this seems like a plausible cost/benefit trade-off.
I don't think this holds up. Modern JS wraps most top level code in
(function () { ... })();
or similar. That makes a bunch of ES6 "static" errors runtime errors.
Well, my contention is still that all ES error reporting is done at "runtime". Again, by "static error" I mean one that is detectable by inspection of the source code without any dependency upon runtime state. Wrapping a function within an otherwise empty anonymous function does not make such errors less "static".
Also, my understanding of what we are discussing would defer reporting to first invocation of the inner-most function containing the error. In the case of the above module pattern, only early errors at the top level of the anonymous function would be reported when the anonymous function was called.
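(Sketched under the hypothetical deferred semantics -- today's engines reject this whole script at parse time:)

(function () {
  "use strict";
  function rare(a, a) {} // the static error is nested one level down
  // Calling the IIFE only triggers checks for its own top level,
  // so no error is reported here...
})();
// ...the duplicate-formal error would surface only on a call to rare().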
Anyway, ES5 did what it did. That may be a place to stop adding early errors, but agreeing on this point does not automatically include the idea of making functions containing tardy static errors into invocation-fused explosive devices!
It isn't clear to me that there is much difference between an invocation-fused explosive at the function level and a similarly fused script file or module. The triggering of each of these normally occurs at some point after some application code has executed, so even if we are talking about ES5 "early errors", they are occurring at runtime and may be triggered by some sort of invocation-like action.
The average JS program is increasing in size faster than machines are getting faster, from where I sit. Adding startup overhead should be something TC39 is actively working against.
The "early errors/first run errors" can be caught by other tools as well (linters, in browser developer tools). If said if the checks can be implemented efficiently with near-zero overhead: great!
Allen Wirfs-Brock wrote:
It isn't clear to me that there is much difference between an invocation-fused explosive at the function level and a similarly fused script file or module. The triggering of each of these normally occurs at some point after some application code has executed, so even if we are talking about ES5 "early errors", they are occurring at runtime and may be triggered by some sort of invocation-like action.
No, there's a difference, Mark talked about it clearly at the meeting last week: the all-or-nothing early error property ensures that the heap is not littered with landmines. The poisoned function alternative is exactly that: a heap paved in mines.
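(Concretely, under the hypothetical poisoned-function semantics -- with today's all-or-nothing early errors this entire Program is rejected and nothing below ever reaches the heap:)

var handlers = {
  common: function () { return "fine"; },
  rare: function (a, a) { "use strict"; }, // the mine, buried in the heap
};
handlers.common(); // works; the app looks healthy
// ...much later, some unrelated code steps on it:
handlers.rare();   // SyntaxError only now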
John Lenz wrote:
The average JS program is increasing in size faster than machines are getting faster, from where I sit.
Ahem, gmail, cough. :-/
Adding startup overhead should be something TC39 is actively working against.
I take a different view, since as you say machines are getting faster. Not fast enough due to the power wall, so parallelism must be exploited safely.
More significant: networks are getting faster and lower latency, and this shows up TCP's '90s-era algorithms and default configury for cwnd, etc. I expect changes at the transport layer to compensate, so suspect we'll be able to load more on demand.
Therefore I'm wagering that the problem ten years from now will be quite different from what you face today. Not just due to faster machines and 'nets, but based on those, due to better ways of packaging and deploying JS (as source still, but possibly with fast-load properties we don't do right now; something like Doloto [research.microsoft.com/apps/pubs/default.aspx?id=70518], maybe).
Closing doors now based on fear of Parkinson's-Law overgrowth in the future does not seem like a good trade to me. Better would be pay-by-the-feature, so developers can choose to opt into early checking, or not. Thus the idea of module {...} getting free variable errors based on references that do not resolve to any lexical binding or global property available at compile time.
The "early errors/first run errors" can be caught by other tools as well (linters, in browser developer tools). If said if the checks can be implemented efficiently with near-zero overhead: great!
Could we have best of both? If module {...} gets early error for unbound free variable but this is considered onerous, perhaps we could separate the opt-in gesture a bit. I don't want a pragma, though. A devtool may not see all the actual global properties, so may flag false positives.
Food for thought, but win-win wins. Can we have it all?
On Fri, Sep 28, 2012 at 11:14 AM, Brendan Eich <brendan at mozilla.org> wrote:
John Lenz wrote:
The average JS program is increasing in size faster than machines are getting faster, from where I sit.
Ahem, gmail, cough. :-/
Spreadsheets, document editors, photo editors, games, Emscripten apps, GWT enterprise applications, etc, etc. People are trying to do big things in JavaScript and we should be trying to enable bigger things.
Adding startup overhead should be something TC39 is actively working against.
I take a different view, since as you say machines are getting faster. Not fast enough due to the power wall, so parallelism must be exploited safely.
More significant: networks are getting faster and lower latency, and this shows up TCP's '90s-era algorithms and default configury for cwnd, etc. I expect changes at the transport layer to compensate, so suspect we'll be able to load more on demand.
Mobile networks have significantly increased latency (radio startup time, etc.) and networks are unreliable, but yes, the best-case scenarios are getting better. But predicting the future is something no one is really good at. Making sure we don't make things worse today is more the point.
Therefore I'm wagering that the problem ten years from now will be quite different from what you face today. Not just due to faster machines and 'nets, but based on those, due to better ways of packaging and deploying JS (as source still, but possibly with fast-load properties we don't do right now; something like Doloto [research.microsoft.com/apps/pubs/default.aspx?id=70518], maybe).
The applications I work with already do significant lazy loading of functionality. Doloto itself is a failed idea (late loading arbitrary functions on unreliable networks); load points need to be controlled. But this is beside the point.
The best thing I see for the future is if the browser vendors didn't reparse JavaScript when loading from cache; this would help the best-case scenarios, but doesn't help the worst case, where it is harder to find wins.
Closing doors now based on fear of Parkinson's-Law overgrowth in the future does not seem like a good trade to me. Better would be pay-by-the-feature, so developers can choose to opt into early checking, or not. Thus the idea of module {...} getting free variable errors based on references that do not resolve to any lexical binding or global property available at compile time.
The "early errors/first run errors" can be caught by other tools as well
(linters, in-browser developer tools). That said, if the checks can be implemented efficiently with near-zero overhead: great!
Could we have best of both? If module {...} gets early error for unbound free variable but this is considered onerous, perhaps we could separate the opt-in gesture a bit. I don't want a pragma, though. A devtool may not see all the actual global properties, so may flag false positives.
Opt-in, earlier checks would be good for development.
John Lenz wrote:
The best thing I see for the future is if the browser vendors didn't reparse JavaScript when loading from cache; this would help the best-case scenarios, but doesn't help the worst case, where it is harder to find wins.
Yes, why hasn't this happened? We had a Mozilla intern working on it, but I'm not sure where that went. It seems no major browser caches post-parsed tree serializations or other intermediate forms. Anyone know differently?
On Sep 28, 2012, at 20:58, "Brendan Eich" <brendan at mozilla.org> wrote:
John Lenz wrote:
The best thing I see for the future is if the browser vendors didn't reparse JavaScript when loading from cache; this would help the best-case scenarios, but doesn't help the worst case, where it is harder to find wins.
Yes, why hasn't this happened? We had a Mozilla intern working on it, but I'm not sure where that went. It seems no major browser caches post-parsed tree serializations or other intermediate forms. Anyone know differently?
/be
A pretty niche example, but: Windows 8 Store apps written in HTML5/JS are “bytecode cached” at package time, i.e. downloading an app from the store gives you bytecode instead of (or, in addition to?) pure JS.
Details at msdn.microsoft.com/en-us/library/windows/apps/hh849088.aspx
- Error on first call to a function, where the function contains what would be an early error but for the supposed cost of early error analysis.
As I understand it, this goes back to "lazy parsing"
ariya.ofilabs.com/2012/07/lazy-parsing-in-javascript-engines.html
which, in turn, seems to want to support the found-in-practice pattern of pre-loading not-yet-parsed code (eg, Google's code in comments trick).
developers.google.com/speed/docs/best-practices/mobile, googlecode.blogspot.de/2009/09/gmail-for-mobile-html5-series-reducing.html
The reason why that pays off is that sites tend to include lots of code that they usually do not use, or for which they don't know beforehand whether it will be used. In some cases, the balance seems to be between utilization of HTTP requests versus parsing of JS code.
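(A minimal sketch of the comment trick, with made-up ids -- real deployments fetched the text via XHR or shipped it in a <script> the browser won't execute:)

// The page ships something like:
//   <script id="lazy-module" type="text/plain">
//   /* function heavyFeature(x) { return x * 2; } */
//   </script>
// The engine downloads it but never parses the commented-out body.

function loadLazyModule(id) {
  var text = document.getElementById(id).textContent;
  // Strip the comment delimiters, then parse and evaluate on first use;
  // indirect eval runs the code in global scope.
  var src = text.replace(/^\s*\/\*/, "").replace(/\*\/\s*$/, "");
  (0, eval)(src);
}

loadLazyModule("lazy-module"); // first use pays the parse cost
heavyFeature(21);              // 42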
So there are use cases (though some of them are addressed by newer element attributes), but there should be other, more direct, ways to achieve the goal:
- coders don't include code that isn't used (doesn't seem to work, but is still worth mentioning)
- the language or its environment include explicit means for including code for possible (but not certain) use (similar to <script defer>, one might consider a <script provide> with parse-on-use semantics?)
In particular, I would like to know whether the bandwidth vs parsing balance is the only motivation for lazy parsing. If yes, then it is a web-specific issue, and the solution should not affect server-side JS.
Because there is a hidden assumption behind the lazy parsing idea, namely that the parsing, when it is triggered, will not fail.
If the parse correctness wasn't tested before serving the code, there is no easy way of catching parse failure later. So there would need to be a development mode and a serve mode?
If you give me a switch/pragma to force early parsing/checking, then I will use it in my code (another job for "use strict"?). And if you give us a way to provide code for parse-on-use, someone on the web will find that useful. But without such a switch, I don't want my JS engine to do its few "early" checks later, perhaps, just to address real use cases badly and implicitly.
Claus
Claus Reinke wrote:
one might consider a <script provide> with parse-on-use semantics?)
This is a good idea. Instead of JS engines trying to do a cheap-yet-sound parse (or "read") of JS source, developers could say that they expect a <script> not to be used frequently enough to justify non-defer and non-"provide" semantics (eager loading).
We do not want "developer mode" vs. "deploy mode" if it can be avoided. We want a way for developers to control latency hits, sorting JS into "hot" and "cold" paths at a coarse grain.
I'm against this entirely. I don't see any reason to delay semantic checks, especially given one of the major purposes of strict mode was to convert unnecessarily late errors into early errors.
I still don't understand this desire to delay semantic analysis -- where are the examples of sema being a performance bottleneck? Just basic parsing already requires us to do a reasonable amount of analysis anyway, and while parsing shows up as being a problem, the bulk of that time seems to be lexing, and lexing is unavoidable even if all you want to do is syntactic analysis (unless we're proposing a delayed syntactic check mode?????)
one might consider a <script provide> with parse-on-use semantics
Sounds like yet another "use strict" to me. Experience tells us that this does not work very well. Either an optimization is worthwhile and enforced by default, or it isn't used in most cases. Asking the developer to mark his code with tons of "attributes" to enable all the optimizations is cumbersome... At some point, we'll have to specify something like this to have optimized code: <script async lazy enforce-types enforce-tail-recursion ...>[[ "use strict"; ... ]]</script>.
Oliver Hunt wrote:
I still don't understand this desire to delay semantic analysis -- where are the examples of sema being a performance bottleneck? Just basic parsing already requires us to do a reasonable amount of analysis anyway, and while parsing shows up as being a problem, the bulk of that time seems to be lexing, and lexing is unavoidable even if all you want to do is syntactic analysis (unless we're proposing a delayed syntactic check mode?????)
You make good points. The counter-arguments were from Microsoft at the last TC39 meeting (two weeks ago). The followup from them (and from all implementors) was to provide more detail from profiling on where the costs lie. If they're mostly irreducible lexing, then you're right.
Brendan Eich wrote:
This needs a separate thread. The idea from last week's TC39 meeting was to have not only
- Early error, thrown before any code in the Program (grammar goal symbol) containing the error, required by specific language in Clause 16.
- Runtime error, all the other kinds.
and now
- Error on first call to a function, where the function contains what would be an early error but for the supposed cost of early error analysis.
The last case is really just a runtime error: a function with what should be a static error becomes a booby trap: if your tests happen to miss calling it, you'll feel ok, but a user who tickles the uncovered path will get a runtime error.
TC39 heard from some implementors who wanted to avoid more early error requirements in ES6, or at least any that require analysis, e.g. reaching definitions.
That's fair as input to the committee, but implementation concerns are not the only ones we weigh. And we were far from agreed on adding the "Error on first call" category.
The example you imply here would be
function f(a, b = c, a = d) { }
and the duplicate formal a would be illegal because of the novel default-parameter syntax.
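(Played out under the proposed error-on-first-call semantics -- hypothetical; as actually specified this would be an early error:)

function f(a, b = c, a = d) {} // would load without complaint

function commonPath() { /* exercised by every test */ }

// The test suite covers commonPath() but never calls f(), so everything
// looks green -- until a user tickles the uncovered path:
f(1); // SyntaxError surfaces here, in production, long after load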
Making f into a proximity-fused bomb does not seem either good or necessary. The analysis required to notice duplicate formals is trivial, and as I keep pointing out, ES5 already requires it:
function g(a, a) { "use strict"; }
This must be an early error per ES5 clause 16.
Given the ES5-strict sunk cost, there's no added implementation tax beyond the logic conjoining duplicate detection with novel-syntax detection, which is trivial.
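(The check in question is a single pass over the parameter names -- a sketch:)

function hasDuplicateFormals(names) {
  var seen = new Set();
  for (var i = 0; i < names.length; i++) {
    if (seen.has(names[i])) return true; // -> early SyntaxError
    seen.add(names[i]);
  }
  return false;
}

hasDuplicateFormals(["a", "a"]);      // true: ES5 strict already rejects this
hasDuplicateFormals(["a", "b", "a"]); // true: same logic with default params in play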
It'd be good to hear from Luke on this.