Remarks about module import
On Mon, Aug 18, 2008 at 1:44 PM, <ihab.awad at gmail.com> wrote:
Consider a module file located at the URL --
containing the text --
var isOn = false;
provide
  toggle: function() {
    document.setBackgroundColor(isOn ? background : '#ffffff');
    isOn = !isOn;
  },
  set: function() {
    document.setBackgroundColor(background);
    isOn = true;
  };
This code has two free variables, "document" and "background", and returns the symbols "toggle" and "set" to the importer.
I'll note that JavaScript seems strangely in between being an expression language and a statement language. For most purposes, statements are evaluated only for effect, not value. However, a program evaled as a string yields a value as one would expect of an expression language. In squarefree on FF:
eval('for(var i = 0; i < 1; i++){if (true){i+1}}')
1
It looks like the relevant part of ES3 is 15.1.2.1 step 3: "Evaluate the program from step 2."
If we decide that a module body is to be treated as an evaled program, then Ihab's example above can be simply:
var isOn = false;
{
  toggle: function() {
    document.setBackgroundColor(isOn ? background : '#ffffff');
    isOn = !isOn;
  },
  set: function() {
    document.setBackgroundColor(background);
    isOn = true;
  }
}
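A minimal sketch of the completion-value behaviour, runnable in a current engine. Two assumptions that are not in the proposal above: the object literal is wrapped in parentheses, because a bare `{` at statement position parses as a block rather than an object literal, and the module body is simplified so that it has no free variables.

```javascript
// Indirect eval evaluates the string as a Program in the global scope.
const globalEval = eval;

// The completion value of the last evaluated statement is returned.
const loopValue = globalEval('for (var i = 0; i < 1; i++) { if (true) { i + 1 } }');

// Simplified module body (assumptions: no free variables, parenthesized literal).
// Note that 'var isOn' ends up on the global object when evaluated this way.
const moduleText =
  'var isOn = false;\n' +
  '({\n' +
  '  toggle: function () { isOn = !isOn; return isOn; },\n' +
  '  set: function () { isOn = true; return isOn; }\n' +
  '})';

const exported = globalEval(moduleText);
console.log(loopValue, Object.keys(exported));
```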
On Aug 18, 2008, at 1:44 PM, ihab.awad at gmail.com wrote:
Hi Ihab, I'm going to respond pointedly to only part of your post.
Please take this as a constructive riposte, intended to get at a high-
order design bit that should not be assumed set or clear: whether
modules should be like ES1-3's weak notion of program units or should
be something new: purely lexical scope containers.
By that argument, I would like to propose that a module and a "compilation unit" (or ES3 <Program>) be merged into one concept. This has the useful side effect that the loading of a top-level program (whatever that is, and wherever it comes from) can be re-explained simply in terms of the loading of a module.
This reasoning seems backwards. If modules are added in a first-class
way to the JS, they present an opportunity to avoid the global object
utterly, and enforce true lexical scope. Since the Program
nonterminal from ES1-3 is evaluated using a shared global object in
browser embeddings, it cannot be restated using this lexical-scope-
only idea of a module.
Why make so future-hostile a decision to enshrine today's Program
semantics with respect to the global object? The ever-extensible
global object topic has been bruited about as a security problem, and
I agree it's a problem for enforcing various properties important for
modularity and security.
Dave Herman points out that JS is evolving away from its REPL roots,
just as Scheme did, so the ever-extensible top level is increasingly
hopeless. One (compatible) way around this is to extend the language
with new forms that introduce purely lexical scope.
Thus the global object, programs as source evaluated using the global
object as the single object on the scope chain, and even all the
hated dynamic scope forms (with, eval that can introduce bindings in
its caller's scope) can be left in the outside-the-module penalty
box. Compatibility requires this, in any foreseeable ES version.
Inside a module, introduced by some kind of explicit syntax, only
lexical scope is allowed. No global properties are available. Typos
in unqualified identifier expressions can be caught at compile time.
Cats and dogs live together in peace ;-).
Comments?
Mark S. Miller wrote:
On Mon, Aug 18, 2008 at 1:44 PM, <ihab.awad at gmail.com> wrote:
Consider a module file located at the URL --
containing the text --
var isOn = false;
provide
  toggle: function() {
    document.setBackgroundColor(isOn ? background : '#ffffff');
    isOn = !isOn;
  },
  set: function() {
    document.setBackgroundColor(background);
    isOn = true;
  };
This code has two free variables, "document" and "background", and returns the symbols "toggle" and "set" to the importer.
I really like the general approach and the simplicity of Ihab's proposal. Also I strongly agree that a module should not implicitly capture the lexical scope in which it is imported.
I'm not sure why 'provide' needs new syntax, though. What's wrong with
module(function() { var isOn = false; return { toggle: ..., set: ... }; });
or
var isOn = false; provide({ toggle: ..., set: ... });
?
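For concreteness, a sketch of how the two function-call forms might behave. `defineModule` and `makeProvide` are hypothetical names chosen for this illustration (the message above calls them `module` and `provide`); nothing here is proposed API.

```javascript
// Form 1: the module body is a function whose return value is the export record.
function defineModule(body) {
  return Object.freeze(body());
}

const lamp = defineModule(function () {
  var isOn = false;
  return {
    toggle: function () { isOn = !isOn; return isOn; },
    set: function () { isOn = true; return isOn; }
  };
});

// Form 2: the loader hands the body a 'provide' function that records exports.
function makeProvide() {
  let exportsRecord = null;
  return {
    provide: function (record) { exportsRecord = Object.freeze(record); },
    result: function () { return exportsRecord; }
  };
}

const slot = makeProvide();
(function (provide) {
  var isOn = false;
  provide({ toggle: function () { isOn = !isOn; return isOn; } });
})(slot.provide);
```

In both forms the private `isOn` stays inside a function scope, and only the record handed back to the loader is visible to the importer.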
I'll note that JavaScript seems strangely in between being an expression language and a statement language. For most purposes, statements are evaluated only for effect, not value. However, a program evaled as a string yields a value as one would expect of an expression language. [...] If we decide that a module body is to be treated as an evaled program, then Ihab's example above can be simply: [...]
The fact that eval does things differently to the fragment of the language without eval, doesn't mean that it is a good idea to rely on those differences when designing new language features.
On Mon, Aug 18, 2008 at 4:46 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Aug 18, 2008, at 1:44 PM, ihab.awad at gmail.com wrote: .... whether modules should be like ES1-3's weak notion of program units or should be something new: purely lexical scope containers.
I was making a weak reference ;) to the ES3 <Program> but picking and
choosing. Plus my head is deep in the Caja world. Apologies for the lack of clarity.
My proposal is the latter per your question: A module is an isolated lexical scope container, and contains no implicit references to any shared global object.
This reasoning seems backwards. If modules are added in a first-class way to the JS, they present an opportunity to avoid the global object utterly, and enforce true lexical scope. Since the Program nonterminal from ES1-3 is evaluated using a shared global object in browser embeddings, it cannot be restated using this lexical-scope-only idea of a module.
Right, so yes, I agree entirely.
Inside a module, introduced by some kind of explicit syntax, only lexical scope is allowed. No global properties are available. Typos in unqualified identifier expressions can be caught at compile time. Cats and dogs live together in peace ;-).
Yup, live they do. :) That said, we in Caja land have worried about whether some default global properties should be made available -- objects that are essentially powerless like Function and Number, and which would be a pain in the neck to have to explicitly pass down to subordinate modules every time with every "importModule" statement. What this set of defaults would or should be is up in the air, mainly because Date would logically be one of them yet it is notably not powerless.
Thanks!!
Ihab
On Aug 18, 2008, at 4:55 PM, David-Sarah Hopwood wrote:
I really like the general approach and the simplicity of Ihab's
proposal. Also I strongly agree that a module should not implicitly capture the lexical scope in which it is imported.
I don't think anyone proposed any such thing. Do you?
I'm not sure why 'provide' needs new syntax, though.
Syntax is (a) often good UI; (b) special form expression where
there's no "library" way to say what the special form says.
Why should everything be lambda-coded?
What if I change your bindings for module and provide? (Maybe I
can't; please explain why not.)
I'm not being snarky (or not merely ;-). The pre-Harmony extreme of
"no new syntax, ever" is dead. Asking whether new syntax pays for
itself is ok, but the question becomes vacuous when the only
demonstration against new syntax begs questions about usability and
integrity.
On Aug 18, 2008, at 5:02 PM, ihab.awad at gmail.com wrote:
That said, we in Caja land have worried about whether some default global properties should be made available -- objects that are essentially powerless like Function and Number, and which would be a pain in the neck to have to explicitly pass down to subordinate modules every time with every "importModule" statement.
Why couldn't they be imported from a standard module? Must everything
be passed down? Maybe in Caja, but I don't see that as a requirement
on successor ES standards.
On Mon, Aug 18, 2008 at 5:17 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Aug 18, 2008, at 5:02 PM, ihab.awad at gmail.com wrote:
That said, we in Caja land have worried about whether some default global properties should be made available --
Why couldn't they be imported from a standard module? Must everything be passed down? Maybe in Caja, but I don't see that as a requirement on successor ES standards.
Even in Caja, it's possible for one module to import another. What needs to be passed down is authority, not the ability to execute code.
But ok, yes, if people like the idea of using an "importModule" construct to get the "Function" and "Number" objects, say, then sure, that's a great solution.
What it means essentially is that the "importModule" construct, plus some "fetchModule" service that knows where to find the primordial objects, together constitute the material provided to a module by default. If this can be made to work cleanly while allowing user-supplied "fetchModule" implementations that get module code from (say) a database or whatever, then that's peachy.
Ihab
Fwiw --
On Mon, Aug 18, 2008 at 5:13 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Aug 18, 2008, at 4:55 PM, David-Sarah Hopwood wrote:
I'm not sure why 'provide' needs new syntax, though.
Syntax is (a) often good UI; (b) special form expression where there's no "library" way to say what the special form says.
In my proposal, what I'm calling "importModule" needs to be a special form if it is to be able to insert bindings into the lexical scope in which it is called. Everything else can be done with just plain old function calls. Whether or not it should is a whole 'nother ball of wax of a different color.
What if I change your bindings for module and provide? (Maybe I can't; please explain why not.)
Any given instantiation of a module is always vulnerable in diverse and sundry ways to its importer. Whether my importer changed my bindings for the module loading mechanism is the least of my worries if I'm importing all the rest of my channels to the outside world from it.
Ihab
On Aug 18, 2008, at 5:25 PM, ihab.awad at gmail.com wrote:
On Mon, Aug 18, 2008 at 5:17 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Aug 18, 2008, at 5:02 PM, ihab.awad at gmail.com wrote:
That said, we in Caja land have worried about whether some default global properties should be made available --
Why couldn't they be imported from a standard module? Must everything be passed down? Maybe in Caja, but I don't see that as a requirement on successor ES standards.
Even in Caja, it's possible for one module to import another. What needs to be passed down is authority, not the ability to execute code.
I was asking, I'm happy with an answer. But what's the requirement,
exactly? Can you give an example?
But ok, yes, if people like the idea of using an "importModule" construct to get the "Function" and "Number" objects, say, then sure, that's a great solution.
I didn't ask what people like. That's for later ;-).
I thought you suggested that a few built-ins (Function, Number, not
Date) were safe to populate in a global scope, "above" a module's
apparent top lexical scope. A safe (immutable, I hope) top-level. Do
I have that right?
What it means essentially is that the "importModule" construct, plus some "fetchModule" service that knows where to find the primordial objects, together constitute the material provided to a module by default. If this can be made to work cleanly while allowing user-supplied "fetchModule" implementations that get module code from (say) a database or whatever, then that's peachy.
Could be. Dave's sketch was not ready for prime time in his own
words, but one of the first objections to it (which may be perfectly
fair, I don't know) was about modules giving themselves names, used
elsewhere in requires statements.
Your post asserted that responsibility for naming a module belongs to
the importer (requirer? ugh). That could be the whole truth, or half
of it. Provider and requirer might both want to use distinct names.
If requires consulted a different namespace from the property map of
any object (especially of the global object), would that be insecure?
I'm familiar with objcap research, but I still get a strange feeling
around jargon from it (sort of like I'm being inducted into a new
religion). Authority, responsibility, etc. are well-defined in
the literature, maybe (one hopes), but their application here, their
use as short-hand arguments, may be less clear and convincing to the
outsiders than you would hope.
(And however convincing I find the definitions and arguments in the
literature, when I hear the summary judgments, I often feel that I've
been told :-/. I'll cope...)
Could you expand on why it's inevitably insecure to have a module
system with self-named modules accessible in some namespace built up
by special forms such as Dave's module syntax?
On Mon, Aug 18, 2008 at 5:59 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Aug 18, 2008, at 5:25 PM, ihab.awad at gmail.com wrote:
Even in Caja, it's possible for one module to import another. What needs to be passed down is authority, not the ability to execute code.
I was asking, I'm happy with an answer. But what's the requirement, exactly? Can you give an example?
I'll back into this topic, in a way. From MarkM's thesis, page 77, caption to figure 8.2 --
Authority is the Ability to Cause Effects. If 1) Bob has permission to talk to Alice, 2) Alice has permission to write /etc/passwd, and 3) Alice chooses to write there any text Bob asks her to, then Bob has authority to write /etc/passwd.
Loading a module in itself may cause effects (e.g., if the code is retrieved via HTTP). Otherwise, by executing this code, the entity importing the module may only consume memory and CPU time; it gains no new authority that it did not have before. (And, strictly speaking, the authority to load code should properly be granted by the importer of a module.)
This requirement is driven by the desire to eliminate ambient authority: a piece of code should not have abilities to cause effects that have not been explicitly granted to it. For example, under Unix, any process has the ambient authority to read a large subset of the files on the system, regardless of whether these are needed for the task at hand. To quote a classic example, when I write:
cp x.txt y.txt
the "cp" process can attempt to read /etc/passwd and, if successful, open a socket and send the contents to the "evil.com" server. Yet it needed only to read "x.txt" and write "y.txt" and, were it so confined, the damage it could cause would be limited to corrupting the contents of "y.txt". This damage is consistent with a simple causal model: "cp" can only damage the things I tell it to work on, and nothing else. And finally, such damage, being narrowly constrained, is now such that there is practically no reasonable economic or other incentive for authoring a malicious "cp".
From MarkM's thesis, page 16:
The Principle of Least Authority (POLA) recommends that you grant each program only the authority it needs to do its job [MS03].
and this, as I hopefully motivated, is why we need to be sure that a module gets authority only explicitly, from its invoker.
A capability is just an object reference that conveys authority. Typically, it is an object that, directly or transitively, can cause effects such as modifying valuable data held by some module; making changes to stable storage; displaying information on an output device; reading an input device; using the network; etc.
A deep vision of capability systems would be that, on a computer system, the ability to cause effects at a primitive level -- essentially, access to hardware devices -- is owned by some powerful module instances which attenuate that into fine-grained objects, each of which represents some reasonable chunk (e.g., creating a connection to a specific host, or writing to a specific rectangle of the screen, or communicating in read-only fashion with a specific USB mass storage device). Careful interaction between modules in the system, keeping POLA in mind, ensures that everyone gets just the authority they need.
The wiring of capabilities can be done by module configuration, but an important source of wiring information is the end-user. A deep capability system exposes to the end-user the ability to divvy up their authority between the modules to which they have access. For example, given a "stock tracking gadget" module, I may create two instances: one to hold my private information about what stocks I own, and another that just displays some information I consider interesting. The first I keep to myself, but I publish the second on my public profile page. Importantly, the fact that these are two instances of the same code does not grant them the ability to communicate; since I have not wired up my private instance to anything in my public profile, I know that my data is safe.
In fact, the classic rewriting of the "cp" example is to simply pass it file descriptors, rather than file names:
cp < x.txt > y.txt
Thus redefined, "cp" needs no ambient access to any files; everything it needs is in its arguments or its Unix file descriptors, and that is all provided by its parent process. POLA is satisfied. (The fact that pretty much all Unices give "cp" the ambient authority to do pretty much anything else, even under these circumstances, is, if one is to take a polemical ocap perspective, simply a bug.)
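The same contrast, sketched in JavaScript with in-memory stand-ins. `fileSystem`, `copyByName`, and `copyByHandle` are hypothetical names for illustration only.

```javascript
// A toy "file system" standing in for ambient authority.
const fileSystem = {
  'x.txt': 'hello',
  'y.txt': '',
  '/etc/passwd': 'secret'
};

// Ambient style: given names, the function can reach anything in fileSystem,
// including '/etc/passwd', whether or not the task needs it.
function copyByName(fs, from, to) {
  fs[to] = fs[from];
}

// Capability style: the caller passes only a reader and a writer.
// The function has no way to name any other file.
function copyByHandle(read, write) {
  write(read());
}

copyByName(fileSystem, 'x.txt', 'y.txt');

let target = '';
copyByHandle(
  function () { return fileSystem['x.txt']; },
  function (text) { target = text; }
);
```

With `copyByHandle`, the worst a misbehaving implementation can do is write the wrong text through the one writer it was given.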
I thought you suggested that a few built-ins (Function, Number, not Date) were safe to populate in a global scope, "above" a module's apparent top lexical scope. A safe (immutable, I hope) top-level. Do I have that right?
Yes, that was my suggestion.
Your post asserted that responsibility for naming a module belongs to the importer (requirer? ugh). That could be the whole truth, or half of it. Provider and requirer might both want to use distinct names. If requires consulted a different namespace from the property map of any object (especially of the global object), would that be insecure?
It depends on who gets to populate the map; more below....
objcap research ... (sort of like I'm being inducted into a new religion).
Your point? Oh wait, here comes the hat. Give generously. };->
Could you expand on why it's inevitably insecure to have a module system with self-named modules accessible in some namespace built up by special forms such as Dave's module syntax?
Right, absolutely. The problem is that modules have an incentive to lie about their names. Let's say I rely on a module called "PRand" to generate pseudo-random numbers, and this is important for the integrity of my algorithm. Let's also say I load a number of other modules. I am now vulnerable to any of these other modules registering itself under the name "PRand" and returning the number 1 every time, thus breaking my code. While I am not vulnerable to any module I did not load, I am now equally vulnerable to all the modules I load. Thus (a) it is very difficult for me to reason clearly about the composition of attacks my subordinate modules can mount; and (b) I cannot meaningfully choose to trust one of them more than the others since they can masquerade in this way.
Clearly, I still rely on the component I am calling "fetchModule" to -- well -- fetch the correct module and not lie. And I must rely on my parent to give me a correct "fetchModule". But, by avoiding self-named modules, it is possible for me to build a "fetchModule" that returns the right answer for the things that I care about, and is not vulnerable to the misbehavior of stuff that it cannot control.
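To make the masquerading concern concrete, here is a small sketch. The shared `registry`, `registerSelf`, and the importer-built `fetchModule` are all hypothetical stand-ins, not part of any proposal.

```javascript
// Self-naming: every loaded module may write any name into a shared registry.
const registry = new Map();
function registerSelf(name, exportsObj) {
  registry.set(name, exportsObj);
}

registerSelf('PRand', { next: function () { return Math.random(); } });
// A different, misbehaving module claims the same name later.
registerSelf('PRand', { next: function () { return 1; } });

const fromRegistry = registry.get('PRand'); // now the impostor

// Importer-controlled naming: only the importer populates the map.
function makeFetchModule(table) {
  const entries = new Map(Object.entries(table));
  return function fetchModule(name) {
    if (!entries.has(name)) throw new Error('unknown module: ' + name);
    return entries.get(name);
  };
}

const trustedPRand = { next: function () { return Math.random(); } };
const fetchModule = makeFetchModule({ PRand: trustedPRand });
const fromFetcher = fetchModule('PRand'); // other modules cannot replace this entry
```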
Ihab
ihab.awad at gmail.com wrote:
An importer could use this as follows --
var doc = ...; var bg = ...;
import of fetchModule('foo.com/someModule.js'), with document: doc, background: bg using t: toggle, s: set;
From the descriptions it looks like this could instead use a syntax based on destructuring assignment, if es-harmony will have destructuring:
var {toggle: t, set: s} = import
( fetchModule ('http://foo.com/someModule.js'),
{document: doc, background: bg}
);
t();
One advantage would be that people would remember this syntax easily, since it's useful elsewhere.
As a side effect, one could then choose to use the returned object instead:
var m = import
( fetchModule ('http://foo.com/someModule.js'),
{document: doc, background: bg}
);
m.toggle();
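A rough approximation of this in today's JavaScript, under several assumptions: `fetchModule` is a hypothetical in-memory lookup, `importModule` stands in for the proposed `import` form (a reserved word, so it cannot be used as a function name), and the module body uses `return` in place of `provide`. Note that `new Function` still lets the body see real globals, so this shows only the explicit binding of free variables, not isolation.

```javascript
// Hypothetical source store keyed by URL.
const sources = {
  'http://foo.com/someModule.js':
    'var isOn = false;\n' +
    'return {\n' +
    '  toggle: function () {\n' +
    "    document.setBackgroundColor(isOn ? background : '#ffffff');\n" +
    '    isOn = !isOn;\n' +
    '  },\n' +
    '  set: function () {\n' +
    '    document.setBackgroundColor(background);\n' +
    '    isOn = true;\n' +
    '  }\n' +
    '};'
};

function fetchModule(url) {
  return sources[url];
}

// Bind the module's free variables as parameters of a fresh function.
function importModule(source, bindings) {
  const names = Object.keys(bindings);
  const values = names.map(function (n) { return bindings[n]; });
  const instantiate = new Function(...names, source);
  return instantiate(...values);
}

// Stand-ins for the importer's own objects.
const colors = [];
const doc = { setBackgroundColor: function (c) { colors.push(c); } };
const bg = '#336699';

const { toggle: t, set: s } = importModule(
  fetchModule('http://foo.com/someModule.js'),
  { document: doc, background: bg }
);
t();
t();
s();
```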
On Tue, Aug 19, 2008 at 1:00 AM, Ingvar von Schoultz <ingvar-v-s at comhem.se> wrote:
From the descriptions it looks like this could instead use a syntax based on destructuring assignment, if es-harmony will have destructuring:
I expect es-harmony to have destructuring bind.
var {toggle: t, set: s} = import ( fetchModule ('http://foo.com/someModule.js'), {document: doc, background: bg} ); t();
One advantage would be that people would remember this syntax easily, since it's useful elsewhere.
As a side effect, one could then choose to use the returned object instead:
var m = import ( fetchModule ('http://foo.com/someModule.js'), {document: doc, background: bg} ); m.toggle();
Good suggestions! I like it.
On Tue, Aug 19, 2008 at 6:47 AM, Mark S. Miller <erights at google.com> wrote:
Good suggestions! I like it.
+1
Ihab
Brendan Eich wrote:
On Aug 18, 2008, at 4:55 PM, David-Sarah Hopwood wrote:
I really like the general approach and the simplicity of Ihab's proposal. Also I strongly agree that a module should not implicitly capture the lexical scope in which it is imported.
I don't think anyone proposed any such thing. Do you?
Ihab's post said:
I strongly suggest that the importer supply an explicit map giving
symbol bindings for the loaded module, i.e., I don't think the
imported module should automatically inherit the lexical scope at
the import point. The module's code is typically not under the
direct control of the importer, so allowing it to schlep all its
importer's lexical variables is giving it too much dangerous power.
So I assumed that "automatically inherit[ing] the lexical scope" was something that someone had proposed. (It is very difficult to keep track of what exactly has been proposed, given the disorganised state of the wiki.) If not, fine; then there's no need to discuss it further.
I'm not sure why 'provide' needs new syntax, though.
Syntax is (a) often good UI; (b) special form expression where there's no "library" way to say what the special form says.
I said, and meant, that I'm not sure why 'provide' needs new syntax. I am not making any assertion about new syntax being useless in general.
In this case, there is a library way to express what the special form says -- I gave two concrete examples of how that could be done. (a) is only relevant if it can be shown that new syntax provides a significantly better UI in this case.
Why should everything be lambda-coded?
What I suggested is not lambda-coding. It's just using a function call instead of one specific keyword.
What if I change your bindings for module and provide? (Maybe I can't; please explain why not.)
The problem that some bindings are relied on to the extent that they must not be changed, is one that must be solved independently of any other feature such as a module system. From a security/integrity point of view there's no reason to treat 'module' or 'provide' as being different from, say, 'Array' in this respect.
I'm not being snarky (or not merely ;-). The pre-Harmony extreme of "no new syntax, ever" is dead.
That's a straw-man; I think the position of most of the ES3.1 proponents was simply "minimal new syntax in ES3.1" ('const' is new syntax, for example). Anyway, I'm not interested in rehashing old disagreements.
Asking whether new syntax pays for itself is ok, but the question becomes vacuous when the only demonstration against new syntax begs questions about usability and integrity.
Integrity is addressed above: adding a few more names for which rebinding needs to be prevented makes no difference.
What usability questions? I see no essential difference in usability between "provide({...});" and "provide ...;"
Also I strongly agree that a module should not implicitly capture the lexical scope in which it is imported.
I don't think anyone proposed any such thing. Do you?
Ihab's post said: ...
Apologies if I caused confusion here -- I was merely trying to state strongly a conceptual desideratum regardless of whether or not it was proposed.
David-Sarah Hopwood wrote:
Brendan Eich wrote:
Asking whether new syntax pays for itself is ok, but the question becomes vacuous when the only demonstration against new syntax begs questions about usability and integrity.
Integrity is addressed above: adding a few more names for which rebinding needs to be prevented makes no difference.
What usability questions? I see no essential difference in usability between "provide({...});" and "provide ...;"
Actually, let me correct that. The former is more expressive, and hence potentially more usable in some situations, because the description of what is being provided can be a first-class value. On the other hand, this increased expressiveness has the effect that the set of provided names is in general not statically known. It's not immediately clear whether or not this is something that is useful to know statically (note that if modules are loaded dynamically, then a hypothetical static analyser would not have enough information to check that 'provides' clauses and import clauses match up in any case).
So, there are some interesting semantic differences that might prompt us to choose one approach over the other. If the distinction had been only in syntax, then I would argue that the presumption should be not to add new syntax unless it has much more compelling advantages than simply avoiding two pairs of brackets.
I'm a bit late to this module party...
On Tue, Aug 19, 2008 at 1:00 AM, Ingvar von Schoultz <ingvar-v-s at comhem.se> wrote:
From the descriptions it looks like this could instead use a syntax based on destructuring assignment, if es-harmony will have destructuring:
var {toggle: t, set: s} = import ( fetchModule ('http://foo.com/someModule.js'),
Are "module" and "file" synonymous? See
dev.helma.org/wiki/Modules+and+Scopes+in+Helma+NG
What do the contents of the foo.com/someModule.js file look like?
fetchModule would need to work with the local file system for server-side work.
What exactly does fetchModule return? Some kind of first-class Module class instance or is it returning the object which is being destructured?
{document: doc, background: bg}
What does this argument to import do?
); t();
[snip]
Peter
On Mon, Aug 18, 2008 at 1:44 PM, <ihab.awad at gmail.com> wrote:
Hi folks, The module system proposals, especially the one here -- proposals:modules
Oh, a module party! Sorry I'm late and thanks to Peter Michaux for alerting me that I was missing out. Ihab, if you recall, I met you and "The Mikes" last December to talk about module systems. I really like the direction of this thread and thought I'd put in a couple cents. A lot of the things I want from a module system have already been mentioned, so some of this is just a reiteration of some of the great ideas that have been posed; some beg distinctions.
We do not need to preserve notions of eval, particularly that the last statement evaluated be the module object. I think that's a great idea and vehicle for explicit exports, but consider this one: perhaps modules can conceptually be constructors for capability objects that are frozen and returned by some "require" function or, by extension, some syntax that ultimately calls said "require" function.
Just to be explicit, I think that, if module A imports module B, module B must have a special scope chain and context object. Solely by virtue of having been imported, we could distinguish it from a legacy script, even if module A isn't a new-style module. The context object would be the module object itself, that you would add attributes to in order to "provide" exports. This would increase the parallelism between objects and modules. The scope chain from global to local would be:
- builtins object
- module scope
- the self module
- an anonymous function block scope
I agree that the builtins object should not be the global window object that we all know and love. It should be a frozen capability object containing JavaScript primitives that can be expected and conveniently accessed in any programming environment: frozen versions of String, Number, Object, (is a Map still in?) &c. Perhaps the browser can host a "window" module that we explicitly import. I think it is also reasonable for this scope to contain some additional primitives including "log" or "print" (since we no longer have to avoid colliding with window.print).
The module scope, essentially analogous to IMPORTS__ in Caja, should contain the "require" function (if this function needs to be unique for each module, since in a lambda-based implementation it would need to implicitly be aware of the URL on which the module resides for module-relative "require" calls). It could also contain any imported names from a "from module import *" style import. This would permit module code to retrieve these values and would also prevent malicious modules from overwriting the client module's inner workings. However, this might not be a good solution for two remaining concerns: for security, it would not prevent a malicious module from overwriting names imported from those modules imported before it; for verifiability, it would make it more difficult to construct compile-time checks for name errors. In this respect, I recognize a tension and am resigned to the final value judgement.
The module scope could also contain a "module" variable that refers to the current module, plus "moduleScope", and "builtins" as deemed fit. Also, the "moduleUrl", like "FILE" would be handy for introspection.
The module itself could be in the scope chain. This would permit programmers to reference provided functions without explicating "this" or "module".
And, naturally, there would need to be an empty anonymous scope chain for "private" closure variables for the module.
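One way to picture the lookup order is with a prototype chain, which resolves names from the innermost object outward much as a scope chain does. This is only a model: `builtins`, `moduleScope`, `selfModule`, and `localScope` are hypothetical objects, and real scope resolution is not property lookup.

```javascript
// Outermost: frozen builtins shared by every module.
const builtins = Object.freeze({
  print: function (msg) { return String(msg); }
});

// Per-module scope: holds 'require' and imported names.
const moduleScope = Object.create(builtins);
moduleScope.require = function (url) { return { url: url }; };

// The module object itself: exports are added as properties.
const selfModule = Object.create(moduleScope);
selfModule.greet = function () { return 'hi'; };

// Innermost: private closure variables, invisible from the outer levels.
const localScope = Object.create(selfModule);
localScope.secret = 42;

// Names resolve from the innermost level outward.
console.log(localScope.greet(), localScope.print('x'), typeof localScope.require);
```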
That's the kind of environment I believe JavaScript should wrap around modules when they're loaded. I think it's also important that modules be singletons by virtue of memoizing the ultimate "require" function. I say "ultimate" because this would be the function that requires a module from its fully qualified URL. Which leads me to my thoughts about the "require" function's calling conventions.
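The memoization mentioned above might be sketched as follows. The `factories` table and `requireByUrl` are hypothetical names, and loading is synchronous here purely to keep the example short.

```javascript
// Hypothetical table of module factories keyed by fully qualified URL.
const factories = {
  'my.com/counter.js': function () {
    let count = 0;
    return Object.freeze({
      next: function () { count += 1; return count; }
    });
  }
};

const cache = new Map();

// The "ultimate" require: runs each factory at most once per URL.
function requireByUrl(url) {
  if (!cache.has(url)) {
    if (!Object.prototype.hasOwnProperty.call(factories, url)) {
      throw new Error('no such module: ' + url);
    }
    cache.set(url, factories[url]());
  }
  return cache.get(url);
}

const first = requireByUrl('my.com/counter.js');
const second = requireByUrl('my.com/counter.js');
```

Because both calls return the same object, state inside the module is shared by every importer of that URL.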
I believe that the "require" function should be a continuation, either implicit or explicit, that yields and blocks until the module has been loaded, or accepts a continuation as an optional argument. I also think that modules should be identified and loaded with URLs. There should be a notion of a "module root", a base URL for script paths. I do not think that we should support anything like a lookup chain of PATHs, since this would incur major performance problems as the user agent looks and fails to find modules in each successive PATH. There should be one, and it should be global, determined by the user agent, perhaps deferring to a script path defined somewhere in the HTML for browser agents. There should also be module relative paths. This would liberate modules from the names of the directories and domains that contain them, increasing reuse.
The require function might also benefit from accepting a version number, although I think it would suffice to explicate that in the URL much like "/usr/lib/libc.so" usually symlinks to "/usr/lib/libc.so.5" on Unices.
So, I recommend that modules be identified by URLs, although not necessarily Strings since that might compromise static analysis again. I also think that, borrowing a meme from python3k, if a URL begins with a dot, it be module relative. Consider (where "import" stands in for some yet to be determined keyword):
// in my.com/site.html where the moduleRoot is the same
// as the page URL by implication:
import "window";
    // moduleScope.window = require("window");
import "jquery.com/jquery-2.6.js";
    // moduleScope.jQuery = require(...);
import "./widget.js" as widget;
    // moduleScope.widget = require("my.com/widget.js");
from "./widget.js" import Widget;
    // moduleScope.widget = require('my.com/widget.js').Widget;
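The dot-prefix rule in the example above can be sketched as a small resolver. This is only an illustration: `resolveModuleId` is a hypothetical name, it works on the scheme-less identifiers used in this message, and it leaves anything that does not start with "./" or "../" untouched for the host to resolve against the module root.

```javascript
// Hypothetical sketch: resolve a module-relative identifier against
// the URL of the module (or page) doing the importing.
function resolveModuleId(id, currentModuleUrl) {
    if (!id.startsWith("./") && !id.startsWith("../")) {
        return id; // not module-relative: left for the module root
    }
    // Drop the file name to get the importing module's directory.
    const segments = currentModuleUrl.split("/").slice(0, -1);
    for (const part of id.split("/")) {
        if (part === "." || part === "") continue;
        if (part === "..") segments.pop();
        else segments.push(part);
    }
    return segments.join("/");
}

const widgetUrl = resolveModuleId("./widget.js", "my.com/site.html");
const siblingUrl = resolveModuleId("../lib/a.js", "my.com/app/main.js");
const rootedUrl = resolveModuleId("window", "my.com/site.html");
```

With these inputs "./widget.js" resolves to "my.com/widget.js", matching the comment in the example, while "window" is passed through unchanged.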
On the topic of PATH, it occurs to me that a page could potentially subscribe to a module root either hosted by the browser in chrome:// or potentially on a CDN like Google's AJAX modules. That might answer my performance concern about walking the PATH and hitting a 404, which extends page load times by one round-trip time for each missed module.
This leaves the issue of "bundling". Web page authors will still need to concatenate scripts and CSS to improve page-load performance. To that end, I recommend that modules have a "provide" or "register" function, wherein they can, in a single module, provide a bundle of module objects that they construct themselves, or declare in the same way that they would in another file.
provide("./widget.js", widgetModule.freeze());
provide "./widget.js" { }
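A minimal sketch of how the first, function-call form of "provide" might interact with the memoized "require" follows. The names `makeBundleAwareLoader`, `requireModule` and the `fetchModule` callback are assumptions for illustration, the URLs are written fully qualified for simplicity, and `Object.freeze` stands in for the `freeze()` call shown above.

```javascript
// Hypothetical sketch: a registry that bundled modules can populate
// with provide(), so that require never has to fetch them.
function makeBundleAwareLoader(fetchModule) {
    const registry = new Map(); // fully qualified URL -> module object
    function provide(url, moduleObject) {
        if (registry.has(url)) {
            throw new Error("module already provided: " + url);
        }
        registry.set(url, Object.freeze(moduleObject));
    }
    function requireModule(url) {
        if (!registry.has(url)) {
            provide(url, fetchModule(url)); // fall back to the host
        }
        return registry.get(url);
    }
    return { provide: provide, requireModule: requireModule };
}

// Stand-in fetch that records which URLs actually hit the "network".
const fetched = [];
const loader = makeBundleAwareLoader(function (url) {
    fetched.push(url);
    return { url: url };
});

const widgetModule = { Widget: function Widget() {} };
loader.provide("my.com/widget.js", widgetModule);
const viaRequire = loader.requireModule("my.com/widget.js");
const other = loader.requireModule("my.com/other.js");
```

Only "my.com/other.js" is fetched in this run. The sketch does nothing about which URLs a module is allowed to provide.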
I'll leave it to Ihab or Mike to comment on the security implications of bundling; I suspect they are dire. Perhaps only modules in subordinate URLs can be provided by one module. That's another tension we should consider.
There was mention on the original wiki page of requiring module dependencies to form a directed acyclic graph (ok, a tree). I don't believe this is any more necessary than in Python, where it's desirable but not enforced. Since module objects are singletons and registered before a module is evaluated, modules have the option of providing their partially completed module objects to cyclic dependencies.
Not having a solid module system is my biggest pain-point in modern JavaScript. Without it, JavaScripters are relegated to using best practices and design patterns to make their scripts more but not quite portable, and more but not quite secure. I've managed to make most of these features possible in user-space JavaScript using a collection of "naughty" practices like gratuitous use of "eval" and "with" and I consider the sacrifice worthwhile, but something similar needs to just be natively available for security and ubiquity. I have great hope for the fruit of this discussion.
Kris Kowal
On Sat, Aug 23, 2008 at 5:41 PM, Kris Kowal <kris.kowal at cixar.com> wrote:
On Mon, Aug 18, 2008 at 1:44 PM, <ihab.awad at gmail.com> wrote:
Hi folks, The module system proposals, especially the one here -- proposals:modules
Oh, a module party! Sorry I'm late and thanks to Peter Michaux for alerting me that I was missing out. Ihab, if you recall, I met you and "The Mikes" last December to talk about module systems. I really like the direction of this thread and thought I'd put in a couple cents. A lot of the things I want from a module system have already been mentioned, so some of this is just a reiteration of some of the great ideas that have been posed; some beg distinctions.
Can you provide concrete examples (something a few lines longer than a hello world module) which shows both the module and importer code?
[snip]
The module scope could also contain a "module" variable that refers to the current module, plus "moduleScope", and "builtins" as deemed fit. Also, the "moduleUrl", like "FILE" would be handy for introspection.
DIR has been even more handy for me so one file can load another one at a relative position in the file system.
[snip]
// in my.com/site.html where the moduleRoot is the same as the page URL by implication: import "window"; // moduleScope.window = require("window"); import "jquery.com/jquery-2.6.js"; // moduleScope.jQuery = require(...); import "./widget.js" as widget; // moduleScope.widget = require("my.com/widget.js"); from "./widget.js" import Widget; // moduleScope.widget = require('my.com/widget.js').Widget;
(< 72 chars/line usually avoids wrapping.)
[snip]
Peter
Peter,
Can you provide concrete examples (something a few lines longer than a hello world module) which shows both the module and importer code?
<sink.js>
/**
    this module provides a sink function which allows the
    user to cause a DOM element to forward its events to
    one and only one, detachable Widget object that implements
    ./event.js#Signaler.
*/
/* these are modules by the same author, in the same directory */
from "./urllib.js" import urlJoin;
from "./base.js" import Set;
/* this is a cross-browser compatibility layer */
from "./browser.js" import normalizeEventName, browserEventName;
/* presumably "browser" is a module provided by the browser in some
 * cross-browser compatible way. */
from "chrome://js/browser.js" import observe;
/* using let or var makes a variable private to the module */
let widgetNs = urlJoin(FILE, '#widget');
// or
let widgetNs = urlJoin(moduleUrl, '#widget');
let sinksAttribute = urlJoin(file, '#sinks');
/* the name FILE, moduleUrl, file, or DIR
 * isn't as important as the behavior. It would not
 * be onerous to provide both module file and dir variables,
 * but dir can be inferred from file and is best
 * dealt with via urlJoin which handles both cases unless
 * the provider of DIR is unscrupulous about the
 * final forward-slash. */
/* assigning to "this" makes it an export */
this.sink = function (element, widget, eventName) {
    if (element[widgetNs] && element[widgetNs] != widget) {
        element[widgetNs].final();
    }
    element[widgetNs] = widget;
    let sinks = element.getAttribute(sinksAttribute);
    element[sinksNs] = sinks;
    let normalizedEventName = normalizeEventName(eventName);
    let browserEventName = browserEventName(eventName);
    if (!sinks.has(normalizedEventName)) {
        observe(element, browserEventName, function () {
            let widget = this.target[widgetNs];
            widget.signal(normalizedEventName, this);
        });
        sinks.insert(normalizedEventName);
    }
    /* break reference cycles */
    widget = undefined;
    element = undefined;
};
<index.js>
from "./sink.js" import sink;
// or
let sink = require("./sink.js").sink;
from "jquery.com/dist/jQuery.10.1.js" import jQuery as $
from "./my-widget.js" import MyWidget
sink($('#widget'), MyWidget());
Kris Kowal
On Sat, Aug 23, 2008 at 7:16 PM, Kris Kowal <kris.kowal at cixar.com> wrote:
[snip]
/* these are modules by the same author, in the same directory */ from "./urllib.js" import urlJoin;
I don't know if some sugary syntax like above will be used (or the destructuring syntax will win) but if it is then it might be good to be able to write
from "./urllib.js" import urlJoin as uj;
Or even more SQL-like
import urlJoin as uj from "./urllib.js";
The destructuring syntax presented in this thread would allow for this type of renaming during import. I like the destructuring syntax more and it doesn't involve any new keywords.
/* using let or var makes a variable private to the module */
Are "file" and "module" synonymous?
I don't think a file should be equivalent to a module because it should be possible to concatenate and minify many modules into a single file for distribution to build fast websites. I don't think it is a good idea to entrench the need for many HTTP requests directly into the JavaScript language.
[snip]
/* assigning to "this" makes it an export */ this.sink = function (element, widget, eventName) {
What do you call the "this" object? In ES3 it would be "the global object".
I don't really like that assigning to "this" is the syntax for making importable symbols. I think something that involves a keyword like "export" or "provides" is more declarative.
[snip]
<index.js>
Next line allows for multiple imports
from "./sink.js" import sink; // or
Next line allows for only one import
let sink = require("./sink.js").sink;
or with the destructuring
let {sink: sink} = require("./sink.js");
or with the shorthand destructuring
let {sink} = require("./sink.js");
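For comparison, the destructuring forms above run as ordinary destructuring in later editions of the language. In the sketch below `requireStub` and `sinkModule` are stand-ins invented for illustration, since no actual "require" is defined in this thread.

```javascript
// Hypothetical stand-in for a module object and a require function.
const sinkModule = {
    sink: function (element, widget) { return [element, widget]; }
};
function requireStub(url) {
    return sinkModule; // a real loader would look the URL up
}

// Destructuring with renaming: binds the export "sink" to "forward".
let {sink: forward} = requireStub("./sink.js");

// Shorthand destructuring: binds the export "sink" to "sink".
let {sink} = requireStub("./sink.js");
```

Both bindings refer to the same exported function, so renaming on import needs no dedicated "as" keyword.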
[snip]
Peter
Peter Michaux wrote:
I'm a bit late to this module party...
On Tue, Aug 19, 2008 at 1:00 AM, Ingvar von Schoultz <ingvar-v-s at comhem.se> wrote:
From the descriptions it looks like this could instead use a syntax based on destructuring assignment, if es-harmony will have destructuring:
var {toggle: t, set: s} = import ( fetchModule ('http://foo.com/someModule.js'),
Are "module" and "file" synonymous? See
dev.helma.org/wiki/Modules+and+Scopes+in+Helma+NG
What do the contents of the foo.com/someModule.js file look like?
fetchModule would need to work with the local file system for server-side work.
What exactly does fetchModule return? Some kind of first-class Module class instance or is it returning the object which is being destructured?
Some of your questions are answered in the post by Ihab that I was replying to: esdiscuss/2008-August/006915
In short, the name fetchModule() is just a placeholder to sidestep the whole issue of files, downloads etc. That's part of the host environment, and won't be defined in the language standard.
The details of the exact division of tasks between host environment and language [and thus between fetchModule() and import()] were not discussed at all, as I recall.
{document: doc, background: bg}
What does this argument to import do?
It's stuff that the module needs (requires), packed into an object.
But my only role in the discussion was to propose a syntax. Ihab discussed possible semantics and needs of modules, and showed these semantics with a possible import syntax. I proposed that reuse of existing syntax seemed to be enough, based on the needs and semantics that had been discussed.
<ImportedStuff> = import (<Module>, <NeededStuff>);
The argument that you ask about is the <NeededStuff> argument.
The module system proposals, especially the one here --
proposals:modules
are pretty cool; I took the liberty of not following Dave Herman's advice to not pay it much attention. ;) Here are some comments coming from where I sit on the Caja team, and thus with a strong emphasis on security.
== Module naming ==
The declared name of a module, e.g. "A" in --
module A { ... }
is either meaningful or meaningless. ;) If it is meaningful, it means that the module itself gets to specify where it lives in the namespace(s) owned by the importing entity. That is an improper (read: insecure) inversion of responsibility; what a module, upon being imported, is called by its importer is properly none of its business. And if the module name is meaningless, then we don't need it.
By that argument, I would like to propose that a module and a "compilation unit" (or ES3 <Program>) be merged into one concept. This has the useful side effect that the loading of a top-level program (whatever that is, and wherever it comes from) can be re-explained simply in terms of the loading of a module.
== Exchange of objects ==
I strongly suggest that the importer supply an explicit map giving symbol bindings for the loaded module; i.e., I don't think the imported module should automatically inherit the lexical scope at the import point. The module's code is typically not under the direct control of the importer, so allowing it to schlep all its importer's lexical variables is giving it too much dangerous power.
The module itself can gain access to the symbols supplied by the caller via some sort of a "require" syntax; that said, our position in the Caja project has been to simply recognize that certain variables in a [Caja] module are free, and expect the importer to provide bindings (and bind them to "undefined" if the importer fails to provide bindings). I think this is simpler than the "require" syntax and meshes nicely with the existing way that symbols (like "document" and "window") are provided to top-level code in ES embeddings.
Finally, the importing code should be in full control of what names are inserted into its own scope; this again is important for security. In fact, this point is important enough that I would leave out any sort of "import *" functionality, relegating to IDEs the task of helping manage explicit import lists.
== An example ==
Consider a module file located at the URL --
foo.com/someModule.js
containing the text --
var isOn = false;
provide
  toggle: function() {
    document.setBackgroundColor(isOn ? background : '#ffffff');
    isOn = !isOn;
  },
  set: function() {
    document.setBackgroundColor(background);
    isOn = true;
  };
This code has two free variables, "document" and "background", and returns the symbols "toggle" and "set" to the importer. The module also has state, implying that it can be instantiated multiple times.
An importer could use this as follows --
var doc = ...;
var bg = ...;

import of fetchModule('foo.com/someModule.js'),
  with document: doc, background: bg
  using t: toggle, s: set;
// "t" and "s" are now in scope
This would desugar to something like --
var doc = ...;
var bg = ...;

var temp___ = importModule(
  fetchModule('foo.com/someModule.js'),
  { document: doc, background: bg });
var t = temp___.toggle;
var s = temp___.set;
Note that I have used "fetchModule" as a shortcut to say that I am punting on the semantics of how a module is located and the concurrency semantics whereby its text is retrieved (per Brendan's points made earlier).
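For readers who want to experiment with the desugared form, here is a runnable approximation. It rests on several assumptions that are not part of the proposal: the module text ends in `return {...}` in place of the proposed `provide`, the list of free variable names is passed explicitly, and `importModule` is built on the standard `Function` constructor. Code created that way can still reach real globals, so this shows the binding mechanics only and is not a security boundary.

```javascript
// Hypothetical sketch: instantiate module text with explicit bindings
// for its free variables. Missing bindings arrive as undefined.
function importModule(moduleSource, freeNames, bindings) {
    const args = freeNames.map(function (name) { return bindings[name]; });
    const instantiate = new Function(...freeNames, moduleSource);
    return instantiate(...args);
}

// The example module, with "return" standing in for "provide".
const someModuleSource = `
    var isOn = false;
    return {
        toggle: function () {
            document.setBackgroundColor(isOn ? background : '#ffffff');
            isOn = !isOn;
        },
        set: function () {
            document.setBackgroundColor(background);
            isOn = true;
        }
    };
`;

// Stand-in "document" that records the colors it is asked to show.
const colors = [];
const doc = { setBackgroundColor: function (c) { colors.push(c); } };
const bg = "#336699";

const temp = importModule(
    someModuleSource,
    ["document", "background"],
    { document: doc, background: bg });
const t = temp.toggle;
const s = temp.set;

t(); // records '#ffffff' and turns isOn to true
t(); // records bg and turns isOn to false
s(); // records bg and turns isOn to true
```

Each call to `importModule` produces a fresh `isOn`, which corresponds to the remark above that the module can be instantiated multiple times.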
To exchange symbols without renaming them, we could abbreviate to --
var document = ...;
var background = ...;

import of fetchModule('foo.com/someModule.js'),
  with document, background
  using toggle, set;
// "toggle" and "set" are now in scope
== Reference to Caja modules ==
We have a preliminary writeup of how we would like to think about module loading in Caja here --
google-caja.googlecode.com/svn/trunk/doc/html/cajaModuleSystem/index.html
Hope this helps.