Importing global into modules.
The problem this doesn't solve is the one Andreas Rossberg raised (at a past TC39 meeting, and just the other day here): browser-based implementations must parse lazily and super-fast, they cannot be presumed to be able to afford binding checks. Someone should go super-hack a binding checker that doesn't degrade page load perf, or other metrics.
Absent such evidence, there are two strikes against what you propose:
- performance effects of binding checking when parsing;
- whether async script loads make global bindings unordered and so not generally worth trying to check.
I'm probably not understanding all the issues here, and I'm not explaining my suggestion well either.
The way I see the two issues you raised is like this.
- I think you mean that a parser wants to fail quickly if it comes across an identifier that was not previously declared as an import but that could be declared as one deeper in the module file? The simple solution to that problem is to not allow binding references before their import statements.
99.9% of modules will declare their imports at the top of the file anyway, most developers won't come across this restriction very often. It doesn't take away functionality, it's a useless degree of freedom and I can't think of any language where it's good practice to import/require etc anywhere but at the head of a module/class.
I imagine these fast parsers work top down as the packets are streamed in from the network so this rule should work well for them?
- I didn't explain this part well. I meant for the "@global" import to be a special one whose bindings will not go through the same checks that other modules do. In a browser, importing global would return window.
This should make it trivial to upgrade current code to ES6 modules while still allowing much stricter checks in modules. You would use tools that take the ES5 code and scan it for free identifiers and import them from "@global". This then allows you to gradually move them all into ES6 modules.
import * as global, {Date, XMLHttpRequest} from "@global";
All bindings provided by "@global" bypass the normal bindings checks.
Think of it as a reverse "use strict": currently, to do the right thing, developers have to add an extra line at the top of all their scripts. With ES6 modules they should instead be required to add an extra line to do the wrong but temporarily necessary thing.
Adding this extra line is highly amenable to automation/tooling which developers will be using anyway when moving to ES6 modules.
With time this line will be removed as their code is moved into modules or new versions of global classes are provided in standard modules. This would allow developers to still use "Date" as their constructor and not have it accidentally use the older version if a new one is added to standard modules.
So by default an ES6 module starts off with a totally clean slate; it has no external state in its scope. Only if the developer chooses will the module be exposed to the madness of the global scope.
Would this help macros or typing to be added to those modules with no access to the global? That would be a nice carrot to get developers to try and delete that line at the top of their module.
I like this idea mainly because it will allow us to create modules that explicitly define every used reference. This will help tooling detect exactly the source of every value we use.
That was one of the intentions, but I get the feeling that this change is either too late or far more complex than it seems.
In future we might have a "use strict" or "use macros" for modules which would enforce this clean slate for modules. It's not a big deal but I would have preferred the "use globals" approach and having modules clean by default instead.
We have over 3000 classes + tests and I can't imagine more than 10% need access to the global (to access non-module third-party libraries). I feel large scale JS applications using modules will be very much like this. I'm currently in the process of an automatic conversion of these global scripts into CJS modules (briandipalma/global-compiler).
The conversion to clean ES6 modules from global scripts seems very amenable to tooling/automation.
Given the future is longer than the past, I hope the reason for not considering this is not a deadline issue.
On 5 October 2014 17:51, Brian Di Palma <offler at gmail.com> wrote:
- I think you mean that a parser wants to fail quickly if it comes across an identifier that was not previously declared as an import but that could be declared as one deeper in the module file? The simple solution to that problem is to not allow binding references before their import statements.
No, fail fast is not the issue, nor is the recursive nature of scoping in JavaScript.
The issue is that, in order to know what identifiers are bound -- even in the simplest, linear case -- you need to maintain environments, i.e., do analysis and bookkeeping of declarations and scopes, both surrounding and local ones. So far, "pre-parsing" (the early approximate parse of lazily compiled functions, that only looks for early errors) didn't need anything like that. Adding it probably isn't overly expensive, but it's not free either, and nobody has measured the impact on large applications.
99.9% of modules will declare their imports at the top of the file anyway, most developers won't come across this restriction very often. It doesn't take away functionality, it's a useless degree of freedom and I can't think of any language where it's good practice to import/require etc anywhere but at the head of a module/class.
It's not just imports, you can forward reference any declaration in JavaScript. That's a feature that cannot easily be removed, and binding analysis has to deal with it. It's not a problem, though, only a mild complication.
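As a rough illustration of the bookkeeping being discussed, here is a toy binding check. It is only a sketch under stated assumptions: the AST node shapes and the names collectDeclarations and findUnboundReferences are invented for this example, and it says nothing about how any engine's pre-parser is actually built. Each scope gets a first pass that records its declarations, so forward references resolve, and a second pass that looks up every identifier use through the chain of enclosing scopes.

```javascript
// Toy AST node shapes (hypothetical, for illustration only):
//   { type: "import",   names: ["a", "b"] }
//   { type: "declare",  name: "x" }                      // var/let/const
//   { type: "function", name: "f", params: [], body: [] }
//   { type: "ref",      name: "x" }                      // an identifier use

// Pass 1: record every name declared directly in this scope.
function collectDeclarations(nodes) {
  const names = new Set();
  for (const node of nodes) {
    if (node.type === "import") {
      node.names.forEach((name) => names.add(name));
    } else if (node.type === "declare" || node.type === "function") {
      names.add(node.name);
    }
  }
  return names;
}

// Pass 2: look up each reference in the current scope, then in the enclosing ones.
function findUnboundReferences(nodes, enclosingScopes = []) {
  const scopes = [collectDeclarations(nodes), ...enclosingScopes];
  const unbound = [];
  for (const node of nodes) {
    if (node.type === "ref") {
      if (!scopes.some((scope) => scope.has(node.name))) {
        unbound.push(node.name);
      }
    } else if (node.type === "function") {
      // Parameters form a scope between the function body and the outer scopes.
      const paramScope = new Set(node.params);
      unbound.push(...findUnboundReferences(node.body, [paramScope, ...scopes]));
    }
  }
  return unbound;
}

const moduleAst = [
  { type: "ref", name: "helper" },            // forward reference, declared below
  { type: "import", names: ["fetchData"] },
  {
    type: "function",
    name: "helper",
    params: ["arg"],
    body: [
      { type: "ref", name: "arg" },
      { type: "ref", name: "fetchData" },
      { type: "ref", name: "x" },             // free variable: declared nowhere
    ],
  },
];

const unboundNames = findUnboundReferences(moduleAst);
console.log(unboundNames); // [ 'x' ]
```

The sketch ignores temporal dead zones, block scopes, and with/eval; it is only meant to show that even the linear case needs a set of names per scope plus a lookup per identifier.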
- I didn't explain this part well. I meant for the "@global" import to be a special one whose bindings will not go through the same checks that other modules do. In a browser importing global would return window.
This should make it trivial to upgrade current code to ES6 modules while still allowing much stricter checks in modules. You would use tools that take the ES5 code and scan it for free identifiers and import them from "@global". This then allows you to gradually move them all into ES6 modules.
import * as global, {Date, XMLHttpRequest} from "@global";
All bindings provided by "@global" bypass the normal bindings checks.
IIUC, your suggestion is that modules should be closed, i.e., not be able to refer to any non-local or pre-defined identifiers. This idea has a long history in works on module systems, but I'm not aware of any practical language that ever managed to follow it through. It quickly becomes very tedious in practice, and I highly doubt it would jibe well with JavaScript, or probably any language (although Gilad Bracha disagrees on the latter).
Thanks for the response and explanations.
I didn't realize how limited in power these fast parsers actually were, they are basically lexers. So yes this would require more bookkeeping on their part and that would impact performance.
I'm doubtful that it would have a significant user perceivable effect though. I imagine modern browser engines perform a lot of work in parallel where they can. It would seem that the network would be far slower than the engine parsing JS.
Nonetheless, I understand the concerns.
One way around having an impact on current workloads is to only parse in this fashion for modules. This should mean no impact on existing code bases and the impact will only be felt with native modules. Those would rely on HTTP2, Server Push, Resource Hinting so they will tend toward smaller files. If you are shipping native modules then you couldn't bundle into a large module.
As these modules would be standalone, fast parsing should be embarrassingly parallel. Let's be optimistic and say that in 5 years developers will be able to roll out native modules. I don't think that in 5 years a shortage of cores will be a problem, even on mobile devices. Given how slowly older IEs are being replaced on enterprise desktops it may be 7 years...
This is all conjecture on my part of course...
Yes hoisting is another complication, more bookkeeping, it will probably delay when an error can be raised. But would you not have to deal with it anyway? Can you not export a hoisted function? The module system has to do binding checks on what it exports so keeping track of hoisted functions might have to happen anyway.
Modules would end up being slower to parse than scripts, but modules are for large scale development. I'm not sure in the grand scheme of things it will be that relevant though, especially when weighed against increased static reasoning of modules.
Fully closed modules are, as you said, probably too tedious - that can be dropped. It's more about making modules closed against user defined state as opposed to system defined state. I think that will flow very well with the module system.
import * as global, {myGlobalFunction} from "@global";
Should be enough to allow easy upgrading from ES5 scripts while still allowing the module system/tooling to provide more static guarantees.
On 8 October 2014 14:11, Brian Di Palma <offler at gmail.com> wrote:
I didn't realize how limited in power these fast parsers actually were, they are basically lexers.
No, that's not correct. They have to perform a full syntax check. That does not imply binding analysis, though, which is usually regarded part of the static semantics of a language, not its (context-free) syntax. (More by accident than by design, JavaScript so far didn't have much of a static semantics -- at least none that would rule out many programs, i.e., induce compile-time errors. Hence engines could get away with lazy compilation so well.)
I'm doubtful that it would have a significant user perceivable effect though. I imagine modern browser engines perform a lot of work in parallel where they can.
Unfortunately, parallelism doesn't help here, since this is all about the initial parse (of every source), which has to happen before anything else, and so directly affects start-up times.
One way around having an impact on current workloads is to only parse in this fashion for modules.
Yes, see the earlier posts by Dave and myself. Didn't happen, though.
As these modules would be standalone, fast parsing should be embarrassingly parallel.
You can indeed parallelise parsing and checking of separate modules, but each individual task would still take longer, so there would still have been a potential overall cost.
Yes hoisting is another complication, more bookkeeping, it will probably delay when an error can be raised. But would you not have to deal with it anyway? Can you not export a hoisted function?
Yes, as I said, recursive scoping (a.k.a. "hoisting") is neither a new nor a significant problem.
Fully closed modules are, as you said, probably too tedious - that can be dropped. It's more about making modules closed against user defined state as opposed to system defined state.
Yes, but unfortunately, you cannot distinguish between the two in JavaScript -- globals, monkey patching, and all that lovely stuff.
Yes, but unfortunately, you cannot distinguish between the two in JavaScript -- globals, monkey patching, and all that lovely stuff.
And polyfilling - exactly.
On Wed, Oct 8, 2014 at 1:52 PM, Andreas Rossberg <rossberg at google.com> wrote:
No, that's not correct. They have to perform a full syntax check. That does not imply binding analysis, though, which is usually regarded part of the static semantics of a language, not its (context-free) syntax. (More by accident than by design, JavaScript so far didn't have much of a static semantics -- at least none that would rule out many programs, i.e., induce compile-time errors. Hence engines could get away with lazy compilation so well.)
Right, I see.
Yes, see the earlier posts by Dave and myself. Didn't happen, though.
This seems an implementation detail of the engine, especially as you can't introduce checking to scripts.
You can indeed parallelise parsing and checking of separate modules, but each individual task would still take longer, so there would still have been a potential overall cost.
There will be a cost but it didn't sound like that was the reason why checking was taken off the table. I thought it was more around the need to still have access to the global scope in backward compatible code.
Yes, but unfortunately, you cannot distinguish between the two in JavaScript -- globals, monkey patching, and all that lovely stuff.
The way the global import would work would be more akin to a filter.
When a developer uses the global import they are saying "allow these identifiers from the global state to be available in my module".
In the current approach there is a filter, it's just implicit and open to all identifiers. This change is about making it explicit and configurable.
When a developer writes
import {myGlobalFunction, MyPolyfilledConstructor} from "@global";
she creates a binding to these identifiers from the system's global scope.
Unlike other bindings they will not be checked for existence. All the system will do is provide them if they exist, change them if they are changed and introduce them to the module scope if they are created at a later stage.
The developer has no guarantees about these specific bindings. Just like any other global binding, it's purely a declaration mechanism. You are free to access global state, just declare it up front.
Therefore even monkey patched/polyfilled code will be allowed through. Any binding whatsoever can be allowed through.
I would hope that in future new constructors will be added to standard modules though...
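To make the "filter" idea concrete, here is a minimal sketch of how listed names could be forwarded from a global object without any existence check. The helper name importFromGlobal is hypothetical and there is no "@global" module to call, so this is plain script code that only mimics the described behaviour: each listed name is read from the global object at access time, which means it shows up if created later and follows later changes. The default argument uses globalThis, the standard name for the global object in current engines; the usage below passes a stand-in object instead so that nothing real is touched.

```javascript
// Hypothetical helper: expose only the listed names from a global object.
// Nothing is checked up front; every access is forwarded at the time of use.
function importFromGlobal(names, globalObject = globalThis) {
  const bindings = {};
  for (const name of names) {
    Object.defineProperty(bindings, name, {
      enumerable: true,
      get() {
        return globalObject[name];
      },
    });
  }
  return bindings;
}

// A stand-in for window / the real global object.
const fakeGlobal = {};
const allowed = importFromGlobal(["myGlobalFunction"], fakeGlobal);

const before = allowed.myGlobalFunction;        // not created yet, no error

fakeGlobal.myGlobalFunction = (n) => n + 1;     // created later, e.g. by a polyfill
const after = allowed.myGlobalFunction(41);

fakeGlobal.somethingElse = "not listed";
const leaked = "somethingElse" in allowed;      // names that were not listed stay out

console.log(before, after, leaked);
```

The design choice here is that the list of names is the only thing fixed up front; the values behind them stay as dynamic as any other global.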
last time we discussed this, the conclusion was that Reflect.global is the way to go. more details here:
gist.github.com/ericf/a7b40bf8324cd1f5dc73#how-do-we-access-the-global-object-within-a-module
once realms landed, it will reflect realm.global.
A compiler can convert references like these
import {myGlobalFunction, MyPolyfilledConstructor} from "@global";
myGlobalFunction(42);
into
global.myGlobalFunction(42);
And the global scope can be window for a browser.
This code has the exact same semantics as if the global scope was allowed fully into the module.
The only difference is that the developer is being asked to explicitly list the bindings that should be allowed to filter through.
On Wed, Oct 8, 2014 at 2:51 PM, caridy <caridy at gmail.com> wrote:
last time we discussed this, the conclusion was that Reflect.global is the way to go. more details here:
gist.github.com/ericf/a7b40bf8324cd1f5dc73#how-do-we-access-the-global-object-within-a-module
once realms landed, it will reflect realm.global.
I guess
import {myGlobalFunction, MyPolyfilledConstructor} from Reflect.global;
then?
var myGlobalFunction = Reflect.global.myGlobalFunction;
note: you can't use import for global due to the nature of the binding process when importing members or namespaces.
On Wed, Oct 8, 2014 at 3:21 PM, caridy <caridy at gmail.com> wrote:
var myGlobalFunction = Reflect.global.myGlobalFunction;
note: you can't use import for global due to the nature of the binding process when importing members or namespaces.
I find
import global from "@global";
global.myGlobalFunction(42);
more readable.
If we can import module meta information why not import the global object too? The global should always exist so the binding checks will work, it's a module with a default export.
Accessing any global property is via that object.
No need to allow random identifiers floating around modules.
The global could potentially be available via the "this module" meta syntax:
import { global } from this module;
Guys, anything that you import is going to be subject to the binding process, they are immutable values. On the other hand, Reflect.global is just the new global for scripts and modules, plain and simple.
Today, many libraries are relying on new Function() to artificially access global, and that won't work with CSP; therefore we need to provide a reliable way to access global from anywhere, not only from modules.
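For readers who have not seen it, the trick being referred to looks roughly like the sketch below. The wrapper name getGlobalViaFunctionConstructor is made up for this example. A function built by the Function constructor runs as non-strict code, so calling it without a receiver gives it the global object as `this`. Under a Content Security Policy that forbids eval-like code generation the constructor call throws instead, which is the failure mode mentioned above.

```javascript
// Hypothetical wrapper around the common "new Function" trick.
function getGlobalViaFunctionConstructor() {
  try {
    // The generated function is sloppy-mode code, so `this` defaults to the global object.
    return new Function("return this")();
  } catch (error) {
    // A CSP that blocks eval-like code generation makes the constructor throw;
    // callers then have no global object to work with.
    return undefined;
  }
}

const globalObject = getGlobalViaFunctionConstructor();
console.log(typeof globalObject); // "object" where code generation is allowed
```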
Does that mean that anything that is imported is frozen? You can't add properties to any binding you import?
I've never heard of this restriction before. If that restriction doesn't exist then I'm not sure I see what issue this causes
import { global } from this module;
global in this case is an object and I was under the impression that you could add and remove properties to an object if you imported one.
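The distinction in question can be sketched without module syntax. This is an analogy, not the module machinery itself: a const declaration stands in for an imported binding (an assumption made so the snippet runs as a plain script), and the name importedGlobal is invented. The binding cannot be pointed at something else, but the object it refers to can still gain and lose properties unless it has been frozen.

```javascript
// Stand-in for an object obtained through an import; `const` mimics the
// read-only nature of the binding itself.
const importedGlobal = { existing: 1 };

importedGlobal.added = 2;          // allowed: the object is mutable
delete importedGlobal.existing;    // allowed for the same reason

let rebindError = null;
try {
  importedGlobal = {};             // rejected: the binding cannot be reassigned
} catch (error) {
  rebindError = error;
}

console.log(Object.keys(importedGlobal), rebindError instanceof TypeError);
```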
Providing an idiomatic way for modules to access the global seems good ergonomics.
Brian, my point is that using import to get access to global (as you suggested) is confusing due to the nature of the import, remember the contract: to import something, someone else has to export it first :)
Aside from that, we don't want to have an exclusive way of accessing globals for modules, they should be accessible (as reflective) from everywhere.
On Wed, Oct 8, 2014 at 4:38 PM, caridy <caridy at gmail.com> wrote:
Brian, my point is that using import to get access to global (as you suggested) is confusing due to the nature of the import, remember the contract: to import something, someone else has to export it first :)
The JS environment exports the global object. I don't think many developers will find code like that confusing and it's easy to learn and understand. It's just as complex as Reflect.global.
Aside from that, we don't want to have an exclusive way of accessing globals for modules, they should be accessible (as reflective) from everywhere.
It doesn't preclude Reflect.global, it's simply an idiomatic way to access the global in modules.
I foresee a lot of
const global = Reflect.global;
at the top of modules in years to come.
If the desire was there importing the global would also allow static errors on free variables.
what do you mean by:
If the desire was there importing the global would also allow static errors on free variables.
If we ever decide to allow importing a reference to global from a module, it has to be a named import, and therefore, no static analysis can be applied to its members (since it is a shared, mutable object), what's the benefit then? ergonomic? I don't think so.
Sorry, I meant free variables inside the module.
The reason why I originally started this thread was to ask if importing global references or the global would be a way of adding static checks of free variables in modules.
You are right that you can't check properties of the global.
Hence the suggestions to either import the global and require module code to prepend "global" before accessing global properties.
import global from "@global";
global.myGlobalFunction(); //may not exist just like normal global code.
In this case only "global" is checked for existence so this should not require special logic in the module system.
The other suggestion was to process an import global as a special module which does not have any binding checks performed on it.
import {myGlobalFunction, MyPolyfilledConstructor} from "@global";
myGlobalFunction(); //may not exist just like normal global code.
Either of the two approaches allows an easy upgrade path for ES5 code to ES6 modules and should allow the introduction of a ban on free variables in modules.
I was curious if a ban on free variables in modules was worth considering and any of the approaches outlined above were feasible.
Just use JSHint.
Would these guarantees not enable more features to be added to the language?
I mean if they are worthless guarantees then fair enough it's not worth considering.
The recent thread on throwing errors on mutating immutable bindings touched upon the fact that there is no static unresolvable reference rejection in modules. I was wondering if that was down to needing to allow access to properties set on the global object?
If that's the case why could you not just import the global into a module scope? Just like you import module meta data the module system could have a way of providing you the global object if you import it. That would mean that any reference that is accessed via the global import can't be checked but every other one can.
Something like
import * as global from "@global";
const myValue = global.myGlobalFunction(); // Special: won't be checked, can be mutated by everyone, big red flag in code reviews.
function test() {
  x = 10; // Static error: we know this is not an access to a global x, as it must be accessed via global.
}
Could this work?