Inline ES Modules

# Sultan (a month ago)

Are there any open proposals/discussions related to creating ES modules inline? For example:

import getPersonType from School

module School {
  export function getPersonType (person) {
    switch (person) {
      case 'Teacher': return 'A teacher'
      case 'Director': return 'A director'
    }
  }
}

# Mike Samuel (a month ago)

How would an inline module be imported? Module descriptors are roughly relative URLs so can refer to a JavaScript source file, but it sounds like you'd need something more fine-grained to refer to an inline module. Using fragments to refer to a passage within a document instead of a location might have unintended effects.

Also, assuming that problem is solved, does the below mean anything?

if (Math.random() < 0.5) {
  module School {
    export function getPersonType() {}
  }
}

If not, if inline modules are defined eagerly, what advantages, besides making life easier for transpiler writers, would inline modules have over exporting frozen namespaces?
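For comparison, here is a minimal sketch of the "frozen namespace" alternative mentioned above. It assumes a plain frozen object is an acceptable stand-in for the proposed `module School { ... }` block; the destructuring line is only an illustration of how an importer might pull a member out, not part of any proposal.

```javascript
// Hypothetical stand-in for `module School { ... }`:
// a frozen object used as a namespace.
const School = Object.freeze({
  getPersonType(person) {
    switch (person) {
      case 'Teacher': return 'A teacher';
      case 'Director': return 'A director';
    }
  }
});

// Destructuring plays the role of the import.
const { getPersonType } = School;

console.log(getPersonType('Teacher')); // logs "A teacher"
```

Unlike a real module import, the destructured binding is a one-time copy rather than a live binding, and the object is only available after its defining statement has run.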

# Darien Valentine (a month ago)

import getPersonType from 'data:text/javascript,\
  export default function getPersonType(person) {\
    switch (person) {\
      case \'Teacher\': return \'A teacher\';\
      case \'Director\': return \'A director\';\
    }\
  }';

okay, not a serious suggestion, but it does technically work :)

# Peter van der Zee (a month ago)

I actually quite like the idea.

  • Extend the import syntax to allow an identifier instead of a string. Such an identifier must match the name of a module declaration in the same file (they are hoisted, and it is a syntax error if the declaration is not present or is something else).
  • Module declaration names are abstract, since they are only "exposed" in the scope through an import.
  • Module declarations are only allowed at the top level (like import/export declarations).
  • Maybe in the future modules could refer to their name identifier to access metadata.
  • Module bodies are, for all intents and purposes, treated as if they were independent JS module files.
  • Module identifiers are hoisted.

Bonus points for making the end token easier to scan for (realistically speaking I'm pretty sure a regular block is preferred). This could help browsers parse large bundle files by fast scanning past module blocks.

import foo from bar;
module bar {#
  log("I'm a module!");
#}

The downside to inline modules is that I'm not sure whether this has more real use beyond webpack/metro/rollup/etc bundlers that put all modules in one bundle file. However, that might still help js envs in some way.

This kind of thing wouldn't need to be a huge tax on the spec by reusing existing semantics.

# Isiah Meadows (a month ago)

Few thoughts:

  1. Almost all of my actual use cases for inline modules have been solvable via just creating a new file.
  2. If you need a file for a bunch of constants, that's not really a problem, and in my experience is more maintainable.
  3. If you need to group those constants, you have a few options: either prefix the constants, put them in objects, or put them in another file by themselves.
  4. If you really need modules, conditionally creating namespaces for them, you should really be considering using objects instead. If they're complex enough, you can put them in another file and export a factory returning what you need.
  5. This idea will be of zero assistance to bundlers supporting ES6, who already know to rename variables to retain proper encapsulation without creating new closures. (The only way you can observe it is with the presence of eval, and a bundler could warn about that if necessary.) They also know that module namespace objects are basically frozen objects whose entries happen to generally have descriptors with [[Writable]]: true (a mild lie). It doesn't affect transpiler writers beyond that of having to process the new syntax.
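Points 3 and 4 can be illustrated with a rough sketch. All names here (`PersonType`, `createSchool`, the `formal` option) are hypothetical and only show the shape of the object and factory approaches; in practice each piece could live in its own file and be exported.

```javascript
// Point 3: group related constants in a frozen object.
const PersonType = Object.freeze({
  TEACHER: 'Teacher',
  DIRECTOR: 'Director'
});

// Point 4: a factory returning a namespace-like object. It can be
// created conditionally at runtime, which a static module cannot.
function createSchool({ formal = false } = {}) {
  const labels = {
    [PersonType.TEACHER]: formal ? 'A member of the teaching staff' : 'A teacher',
    [PersonType.DIRECTOR]: formal ? 'The school director' : 'A director'
  };
  return Object.freeze({
    getPersonType: person => labels[person]
  });
}

const school = createSchool();
console.log(school.getPersonType(PersonType.TEACHER)); // logs "A teacher"
```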


# Andrea Giammarchi (a month ago)

we need to go deeper ...

(async () => {

const {getPersonType} = await import(`data:application/javascript,
export function getPersonType(person) {
  switch (person) {
    case 'Teacher': return 'A teacher';
    case 'Director': return 'A director';
  }
}
`)

console.log(
  getPersonType('Teacher')
);

})();

💩

# Jamie (a month ago)

I think having an inline module format would help make the composition of build tools much easier.

Right now we have an ecosystem where everyone builds libraries into bundles using tools like Parcel, Rollup, Browserify, and Webpack. These create output which get recursively pulled into similar bundles.

The problem is that the operation of merging files into bundles destroys static information we have with individual files. So optimizing compilers have difficulty doing code elimination (often called "tree shaking" in the community) across bundles.

If we had a definition format for these bundlers to compile to which didn't destroy all the static information we have in separate files, we could build better optimizing compilers around them.

I would hope that such a format would also allow engines to optimize the parsing of these inline modules (deferring most of the work until modules are actually loaded).

There are other efforts in a similar space, such as the webpackage format: WICG/webpackage/blob/master/explainer.md

# Mike Samuel (a month ago)

On Tue, Jun 19, 2018 at 3:48 PM Jamie <me at thejameskyle.com> wrote:

> I think having an inline module format would help make the composition of build tools much easier.
>
> Right now we have an ecosystem where everyone builds libraries into bundles using tools like Parcel, Rollup, Browserify, and Webpack. These create output which get recursively pulled into similar bundles.
>
> The problem is that the operation of merging files into bundles destroys static information we have with individual files. So optimizing compilers have difficulty doing code elimination (often called "tree shaking" in the community) across bundles.
>
> If we had a definition format for these bundlers to compile to which didn't destroy all the static information we have in separate files, we could build better optimizing compilers around them.
>
> I would hope that such a format would also allow engines to optimize the parsing of these inline modules (deferring most of the work until modules are actually loaded).
>
> There are other efforts in a similar space, such as the webpackage format: WICG/webpackage/blob/master/explainer.md

What benefits might an inline module proposal have over/in-conjunction-with the webpackage format proposal?

# Andrea Giammarchi (a month ago)

Joking aside, I think there's already everything we need to bundle modules, either synchronously or asynchronously within a top-level async execution.

This is an example:

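const esm = (realm =>
  js => {
    // use the registered inline source if there is one, else treat js as a path/URL
    js = esm.cache[js] || js;
    return realm.get(js) ||
      realm.set(js, import(
        // paths ("./", "../", "/") and absolute URLs are imported as-is
        /^(?:\.|\/|[a-z]+:\/\/)/.test(js) ?
          js :
          'data:application/javascript,' + encodeURIComponent(js)
      )).get(js);
  }
)(new Map);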
esm.cache = Object.create(null);

// define your modules inline (or via paths/urls)
esm.cache['rand'] = `
export const uid = (size = 8) => {
  const arr = new Uint8Array(size);
  crypto.getRandomValues(arr);
  return [...arr].map(i => i.toString(16)).join('');
};
`;

// execute with ease on a top level async
(async () => {

  // instead of
  // import {uid} from 'rand';
  const {uid} = await esm('rand');

  document.body.textContent = uid();

})();

It works already live: codepen.io/WebReflection/pen/xzYOWp?editors=0010

It enables the following:

  • bundle applications as single file (not only web)
  • create a similar CommonJS bundle preserving all ESM semantics
  • pre tree-shake and minify those modules too, so standalone modules can be bundled after tree shaking

I think these are all nice to have features, but I also think we have all the primitives we need to make it happen via pre-processing.

# Andrea Giammarchi (a month ago)

> I think these are all nice to have features, but I also think we have all the primitives we need to make it happen via pre-processing.

I meant even synchronously, with a pre-processor that replaces the imported module specifier with

'data:text/javascript,' + JSON.stringify(moduleContentAfterTreeShaking);

# Andrea Giammarchi (a month ago)

sorry, I meant:

JSON.stringify('data:text/javascript,' + moduleContentAfterTreeShaking);
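A rough sketch of what such a pre-processing step could look like follows. The helper names `toDataSpecifier` and `inlineImport` are hypothetical, the regular expression only handles simple `from 'name'` clauses, and the `encodeURIComponent` call is an added assumption (it is not in the expression above) to keep characters such as `#` or `%` from being read as URL syntax.

```javascript
// Hypothetical helper: returns the specifier literal a pre-processor would emit.
function toDataSpecifier(moduleSource) {
  // encodeURIComponent is an assumption added for safety; see the note above.
  return JSON.stringify('data:text/javascript,' + encodeURIComponent(moduleSource));
}

// Hypothetical helper: rewrites `from 'name'` to point at the inlined module.
// `name` is assumed to contain no regular-expression metacharacters.
function inlineImport(code, name, moduleSource) {
  const quoted = new RegExp(`(from\\s*)(['"])${name}\\2`, 'g');
  return code.replace(quoted, (_, prefix) => prefix + toDataSpecifier(moduleSource));
}

const output = inlineImport(
  "import {uid} from 'rand';",
  'rand',
  'export const uid = () => 42;'
);
console.log(output);
```

The rewritten statement keeps the static `import` form, so the result is still analyzable by later tooling.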

# Darien Valentine (a month ago)

Andrea: That is a really interesting approach. I would point out that using data URIs for js means the data: scheme has to be a permitted source for script-src. This allowance has roughly the same security implications as permitting unsafe-eval. I know most people aren’t using CSPs yet, but personally I’d be wary of a solution that makes it harder to adopt a strong CSP.

A super dorky/tangential aside that probably doesn’t matter at all but ... I notice you used application/javascript for the media type. There’s a contradiction between the IANA media type registry and the HTML 5 spec with regard to the "correct" media type to use for JS. RFC 4329 says application/javascript is the only value that should be used, while HTML says text/javascript is the only value that should be used. I believe (not sure though) that HTML chose it because it’s the most backwards-compatible value. Given that the media types registry seems to be basically dead to web standards (if we follow the registry we can’t serve or post a bunch of media types acknowledged or defined by web standards at all, including image/webp, application/csp-report, etc) and the code has to run in a browser, I’d tend to think HTML is the better spec to follow ... though I guess when two specs contradict each other it’s hard to make an objective case for one being more authoritative. (I’d be curious if there’s a specific reason you know of to prefer the RFC definition though. When standards don’t align it breaks my tiny heart.)

# Mike Samuel (a month ago)

CSP with data URIs is possible via nonce sources. For data: module descriptors, browsers could safely skip the CSP check, since it doesn't allow XSS unless one can already specify an import statement, which typically means one can specify arbitrary JS. That argument doesn't extend to the import operator though, so you'd have to tolerate asymmetry there.

# Darien Valentine (a month ago)

Mike: Ah, cool, I didn’t realize that — I had thought that nonces were just for whitelisting inline script elements. How does one specify a nonce in association with a data URI? I’m having trouble turning up a description / example of how it would work. Or possibly I’m misunderstanding this quite a bit, hm ... I’m also confused by the relationship between import/import() and XSS that you’ve described.

# Mike Samuel (a month ago)

Sorry for the confusion.

Nonces are just for elements with url attributes.

I mentioned nonces to point out that strict CSP policies can allow some data: urls without having to explicitly whitelist or hash the entire content.

Separately I wanted to say that there is no incompatibility between the goals of CSP and import statements that use a data: module specifier, since we already trust the compilation unit and there's no actual network message leaked.

But there is a risk with the import operator, since its input is not part of an already trusted input.

# Darien Valentine (a month ago)

Aha! Thanks. I think I get what you mean now.

Let’s say I have this CSP:

content-security-policy: script-src 'nonce-foo'

And I have this in my document:

<script nonce=foo type=module>
  import 'data:text/javascript,console.log(`bar`)';
</script>

Then the browser could theoretically ignore the absence of 'data:' in the CSP safely because the import statement here is part of a nonce-allowed script. And the unsafety is adding data: to the CSP (which would then be available for third party scripts I might also allow), not using data: in my own trusted modules; and there is a minor bit of unsafety associated with dynamic import, but it’s not in the same league as the unsafety potentially implied by a blanket permission for all data: URI sources.

I was surprised that this nuance was considered. I figured it just blindly asked "is this source permitted by the CSP" without taking into account whether the trust from a parent resource could be implicitly extended. But I see you’re totally right:

<!DOCTYPE html>
<meta http-equiv=content-security-policy content="script-src 'nonce-foo'">
<script nonce=foo type=module>
  import 'data:text/javascript,document.open();document.writeln(`<p>static import of data URI module worked</p>`)';
  document.writeln(`<p>nonce module worked</p>`);
  import('data:text/javascript,document.writeln(`<p>dynamic import of data URI module worked</p>`)');
</script>

demo: necessary-hallway.glitch.me

All three seem to work! Very cool.

Sorry for the diversion from the main topic. This was really interesting and I appreciate the explanation.

# Andrea Giammarchi (a month ago)

Darien, others have replied about CSP, but script-src data: 'unsafe-inline' would work as well. There is no evaluation, or at least nothing different from loading static content there (OK, the dynamic import could be a different story, yet if pre-processed I don't see it as useful, but surely one day someone will prove me wrong ^_^;; ).

About IANA vs HTML, I think both should be supported, especially because IANA states text/javascript is deprecated and most developers serve JS via application/javascript, or JSON as application/json (and there is no text/json).

WebKit / Safari handles both cases without issues. Chrome chokes on application/javascript, but IMO it's nonsense to allow one but not the other, so I've filed a bug: bugs.chromium.org/p/chromium/issues/detail?id=854370

# Mike Samuel (a month ago)

On Tue, Jun 19, 2018 at 9:50 PM Darien Valentine <valentinium at gmail.com> wrote:

> Aha! Thanks. I think I get what you mean now.
>
> Let’s say I have this CSP:
>
> content-security-policy: script-src 'nonce-foo'
>
> And I have this in my document:
>
> <script nonce=foo type=module>
>   import 'data:text/javascript,console.log(`bar`)';
> </script>
>
> Then the browser could theoretically ignore the absence of 'data:' in the CSP safely because the import statement here is part of a nonce-allowed script. And the unsafety is adding data: to the CSP (which would then be available for third party scripts I might also allow), not using data: in my own trusted modules; and there is a minor bit of unsafety associated with dynamic import, but it’s not in the same league as the unsafety potentially implied by a blanket permission for all data: URI sources.
>
> I was surprised that this nuance was considered. I figured it just blindly asked "is this source permitted by the CSP" without taking into account whether the trust from a parent resource could be implicitly extended. But I see you’re totally right:
>
> <!DOCTYPE html>
> <meta http-equiv=content-security-policy content="script-src 'nonce-foo'">
> <script nonce=foo type=module>
>   import 'data:text/javascript,document.open();document.writeln(`<p>static import of data URI module worked</p>`)';
>   document.writeln(`<p>nonce module worked</p>`);
>   import('data:text/javascript,document.writeln(`<p>dynamic import of data URI module worked</p>`)');
> </script>
>
> demo: necessary-hallway.glitch.me
>
> All three seem to work! Very cool.

w3c/webappsec-csp#243 : "Any protection against dynamic module import?" captures the discussion on this.

# Sultan (a month ago)

They would act akin to hoisted function declarations in that regard. For example:

import {getPersonType} from School

if (Math.random() < 0.5) {
  module School {
    export function getPersonType() {}
  }
}

Largely yes, the utility is in providing bundlers and authors with an encapsulated, order-independent, "concat-able" standard format to output to, considering the hurdles presented by the "waterfall of requests" problem that can afflict current native ES modules.

Additionally there are aspects that bundlers have a hard time replicating when using ES modules as an authoring format. Consider the following example, where ES modules might maintain a "live" binding.

// a.js
import {b} from './b.js'

setTimeout(() => console.log(b), 400)

// b.js
export var b = 1

setTimeout(() => b++, 200)

A bundler on the other hand might be forced to produce static bindings.

var $b1 = 1

setTimeout(() => $b1++, 200)

var $b2 = $b1

setTimeout(() => console.log($b2), 400)

# Mike Samuel (a month ago)

On Wed, Jun 20, 2018 at 10:44 AM Sultan <thysultan at gmail.com> wrote:

> Additionally there are aspects that bundlers have a hard time replicating when using ES modules as an authoring format. Consider the following example, where ES modules might maintain a "live" binding.
>
> // a.js
> import {b} from './b.js'
>
> setTimeout(() => console.log(b), 400)
>
> // b.js
> export var b = 1
>
> setTimeout(() => b++, 200)
>
> A bundler on the other hand might be forced to produce static bindings.
>
> var $b1 = 1
>
> setTimeout(() => $b1++, 200)
>
> var $b2 = $b1
>
> setTimeout(() => console.log($b2), 400)

Or recognize bindings that might be reassigned and use the mangled export binding directly instead of introducing a local for the import bindings:

var $b1 = 1
setTimeout(() => $b1++, 200)

setTimeout(() => console.log($b1), 400)

Or allocate a cell for reassignable bindings:

var $b1 = [1]
setTimeout(() => $b1[0]++, 200)

var $b2 = $b1
setTimeout(() => console.log($b2[0]), 400)

No?

Maybe I'm treading on "sufficiently smart transpiler" territory but it seems to me that live bindings can be handled simply with a bit of overhead that can be often eliminated in the common case with only local analysis.

And to the degree that this is a problem, it's a problem as long as there's a gap between inline module support becoming available and bundlers end-of-lifing support for previous versions of ECMAScript as an output language option.

Unless I'm missing something, inline modules are unnecessary for live bindings and insufficient given the need to support older versions as output languages for at least some time.
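To make the difference concrete without timers, here is a small synchronous sketch contrasting a snapshot copy with a shared cell. The variable names are illustrative only and do not correspond to any bundler's actual output.

```javascript
// Snapshot: the importer copies the value once, so later updates are lost.
let exportedB = 1;
const snapshotB = exportedB;

// Cell: exporter and importer share one mutable container.
const cellB = [1];
const importedCell = cellB;

// The "exporting module" updates its binding.
exportedB++;
cellB[0]++;

console.log(snapshotB);       // logs 1 (stale copy)
console.log(importedCell[0]); // logs 2 (reflects the update)
```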


Did you address my question about importing inline modules? If so, I must've missed it.


> They would act akin to hoisted function declarations in that regard,

I'm also unclear what function hoisting has to do with module declarations inside loops or conditionals if that's allowed.