proposed relationships of Secure EcmaScript, ES3.1, and ES4.

# Mark S. Miller (18 years ago)

At ses:ses Doug Crockford explains a rationale for a secure variant of EcmaScript, hereafter "ses". I am part of a team working on two such variants, Cajita and Caja (Caja is mentioned on Crock's page. Cajita is a small ADsafe-like subset of Caja). On the first day of the January EcmaScript meeting, Crock gave a presentation javascript.crockford.com/ses.ppt on the goals for an official ses. By Crock's criteria (which I like), Cajita and ADsafe would be candidates, but Caja would not due to its lack of minimalism. Caja and Cajita are currently defined by groups.google.com/group/google-caja-discuss/web/caja-spec.pdf

As we discussed it, the general sense was that the creation of an ses seems like a valuable idea. But on the first day, the process of deciding on an ses did not seem to be a natural extension of the work of the EcmaScript committee. Perhaps this would be a topic for a different committee on another day.

Due to a suggestion of Kris Zyp, the ses discussion revived on the second day. Kris' suggestion (Kris - please correct any inaccuracies) is that ES4 include an sesEval operation (name tbd) that would evaluate its first argument as an ses program in a lexical scope provided by its second argument. The rules of ses would be chosen so that a containing page running ES4 could safely sesEval a script from an untrusted third party, safe in the knowledge that the only effect this sesScript could have on the containing page is according to the objects that the containing page explicitly provides to the script. (The only entry points into those ES4 objects accessible from ses would be according to some whitelisting mechanism, such as Caja and Cajita currently implement.)
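
To make the shape of this suggestion concrete, here is a minimal sketch of what a call to such an operation might look like. The name `sesEvalSketch` and its implementation are assumptions for illustration only. It uses the standard `Function` constructor to expose the properties of the scope object as parameters, and it is not a secure sandbox, since the evaluated code can still reach global objects.

```javascript
// Hypothetical sketch: illustrates the (source, scope) calling convention only.
// This is NOT secure; code run this way can still reach the global object.
function sesEvalSketch(source, scope) {
  const names = Object.keys(scope);
  const values = names.map((name) => scope[name]);
  // Each property of `scope` becomes a parameter visible to the evaluated expression.
  const evaluator = new Function(...names, '"use strict"; return (' + source + ');');
  return evaluator.apply(undefined, values);
}

// The containing page decides exactly which objects the script may see.
const sesResult = sesEvalSketch('add(x, 2)', { add: (a, b) => a + b, x: 40 });
console.log(sesResult);
```

A real ses would additionally have to reject any program that names something outside the supplied scope, which is the part this sketch leaves out.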

At es3.1:secure_eval I see the intention to include a secure eval() in ES3.1 while (elsewhere) deprecating the ES3 global eval() function. (Note that, regardless of what language the secure eval() evals, I disagree with the proposal on that page to make eval() be a method of strings. But we can argue about the packaging of sesEval() separately.)

Crock and I met last Monday. Crock began the meeting by writing the provocative statement

JSON < ADsafe < Cajita < Caja < ES3 < ES4

on the whiteboard. To a first approximation each language is a proper subset of the languages to its right. However, currently, each of these subset relationships is broken in various ways. While I don't think these subset relationships can ever be fully repaired, I believe they can be made accurate enough for practical purposes. On Monday, Crock and I made good progress reconciling differences between ADsafe and Cajita.

  1. To the degree that we can accommodate it within our other design goals, I propose that ES3.1 evolve to replace ES3 in this inequality, so as to help repair this inequality. I am certainly willing to evolve Caja and Cajita in coordination with any such effort.

  2. I propose that Cajita be considered a candidate for ses.

  3. I propose that ES3.1 and therefore ES4 include an sesEval() (name tbd) that evaluates its first argument as an ses program in a lexical scope provided by its second argument.

  4. To facilitate the safe interaction between ses objects and ES3.1/ES4 objects:

4a) I notice that ES4 already has a notion of a "strict mode" flag. I propose that both ES3.1 and ES4 have behavior conditioned on a "strict mode" flag.

4b) At bugs.ecmascript.org/ticket/276#comment:11 I propose that, in strict mode, if a function is called as a function (as opposed to calling it as a method, constructor, or reflectively), then its "this" is bound to undefined. In the absence of strict mode, "this" should be bound to the global object for ES3 compatibility.

4c) At the EcmaScript meeting, I proposed that a function called by call, apply, or bind should have "this" bound to the first argument of that call, bind, or apply -- no matter what value that is. In ES3, if the first argument of call or apply is null or undefined, the function's "this" is bound to the global object. (If I recall correctly, there was general agreement with this proposal by itself. If I'm remembering this inaccurately, please correct me.) Although we could also make this difference of behavior conditional on strict mode, I propose that it be unconditional.
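
The behavior proposed in 4b and 4c can be observed in engines that implement ECMAScript 5 strict mode, where a plain function call leaves `this` undefined and `call`/`apply` pass their first argument through unchanged. The sketch below contrasts that with the ES3-style substitution of the global object. The function names are illustrative, and `globalThis` is assumed to be available.

```javascript
// Strict function: `this` is whatever the caller supplied, even undefined or null.
function strictThis() { "use strict"; return this; }

// Non-strict function built with the Function constructor, so it stays
// non-strict even if this snippet itself is loaded as strict or module code.
const legacyThis = new Function("return this;");

const plainCall = strictThis();                           // called as a function
const appliedUndefined = strictThis.apply(undefined, []); // reflective call
const calledNull = strictThis.call(null);                 // null is passed through
const legacyCall = legacyThis.call(undefined);            // ES3-style substitution

console.log(plainCall, appliedUndefined, calledNull, legacyCall === globalThis);
```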

Under these proposals,

f.apply(undefined, [a,b]) ==== f(a,b) ==== f.call(undefined, a,b)

so we have the ability to reflectively call functions either as functions (as above) or as methods. This reflective ability would be present in ses as well. However, considering Cajita as a candidate ses reveals a functionality hole: Because Cajita contains "new" but does not allow manipulation of the prototype chain, Cajita has no ability to reflectively call constructors. So...

  5. I propose the addition of Function.prototype.newInstance(argList) such that

    f.newInstance([a,b]) ==== new f(a,b)

5a) I separately propose that new f(a,b) be considered sugar for f.newInstance([a,b]), so that f can override newInstance to distinguish being invoked as a constructor from being invoked by other means.
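
A minimal sketch of the proposed reflective construction, written as a standalone helper rather than as a method on `Function.prototype`. The helper name `newInstance` follows the proposal, but its implementation here is an assumption: it relies on `Reflect.construct`, which was standardized later in ECMAScript 2015, and `Point` is a made-up constructor.

```javascript
// Hypothetical helper: construct an instance from a constructor and an argument list.
function newInstance(ctor, argList) {
  return Reflect.construct(ctor, argList);
}

function Point(x, y) {
  this.x = x;
  this.y = y;
}

// Intended to behave like `new Point(3, 4)`.
const reflectivePoint = newInstance(Point, [3, 4]);
console.log(reflectivePoint instanceof Point, reflectivePoint.x, reflectivePoint.y);
```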

# Mike Samuel (18 years ago)

Resending after adding myself to es4-discuss.

On 20/02/2008, Mark Miller <erights at gmail.com> wrote:

[+es4-discuss]

On Wed, Feb 20, 2008 at 4:59 PM, Mike Samuel wrote:

On 20/02/2008, Mark Miller wrote: Since a language is commonly defined as the set of strings produced by a particular grammar, is this equivalent to

JSON ⊂ ADsafe ⊂ Cajita ⊂ Caja ⊂ ES3 ⊂ ES4

People who know Unicode are dangerous ;). How did you type that?

Syntactic subsetting is implied but is not the main intent.

or does your inequality imply some semantic relationship such as: The same string evaluated in an expression context produces an "equivalent" result assuming no exceptions thrown. The same string evaluated in a statement context has "equivalent" side-effects assuming no exceptions thrown. ?

We need to consider each individually. Ideally, once relationships are repaired

  1. Any legal JSON text can be evaluated as a program in any of the languages to its right.
  • This had not been the case for JSON / ADsafe because of the severity of the ADsafe blacklist, which Crock has now repaired.
  • This is not the case for JSON / (Cajita or Caja) because of Caja's current prohibition on names ending in double underbar. However, Caja's restriction is an implementation artifact of the need to translate Caja to ES3. Given appropriate changes in ES3.1, we may be able to remove that restriction from Caja/Cajita-on-ES3.1/ES4.
  • Could you explain the difference in Unicode newline rules that prevents JSON from being syntactically a subset of ES*? Thanks.

There's three problems according to my reading of www.ietf.org/rfc/rfc4627.txt but only the first is directly related to syntax:

(1) There are JSON programs that are not valid ES programs. The JSON program [ "\u2028" ] where the unicode escape is replaced with its literal equivalent is valid according to JSON since the set of characters that can appear in a string unescaped is

unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

but ES does not allow codepoint 0x2028 or 0x2029 to appear unescaped in a string since they are newline characters.

(2) There are JSON programs that have the same text as ES programs but different meaning. ES262 says that all format control codepoints, such as 0x200C, should be stripped out of the program in a pre-lex phase. This is not consistently implemented: eval("'\u200c'.length") == 0 on SpiderMonkey, and 1 on most other interpreters. JSON does not strip these characters out, so they are treated as significant.

(3) There are JSON programs that can be parsed to ES but that cannot be serialized back to JSON without losing track of where info was lost. JSON does not put any limits on numbers, but ES does. ES will treat 1e1000 as Infinity. Since JSON does not have a value Infinity, it is unclear how to implement toJSON(fromJSON("[1e1000]")).
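
Points (1) and (3) can be checked with the `JSON` object that was later standardized in ECMAScript 5, used here in place of the hypothetical `fromJSON`/`toJSON` pair named above. This is only a sketch, and the variable names are illustrative.

```javascript
// (1) A raw U+2028 inside a JSON string is legal JSON.
const lineSeparator = String.fromCharCode(0x2028);
const parsedSeparator = JSON.parse('["' + lineSeparator + '"]');

// (3) A number too large for an IEEE-754 double parses to Infinity...
const hugeNumber = JSON.parse("[1e1000]")[0];
// ...and JSON has no literal for Infinity, so serializing it loses the value.
const reserialized = JSON.stringify([hugeNumber]);

console.log(parsedSeparator[0].length, hugeNumber, reserialized);
```

`JSON.stringify` resolves the ambiguity by emitting `null` for non-finite numbers, so the round trip silently changes the data.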

cheers, mike

  • There is currently a conflict between JSON and ES3 regarding Unicode Cf characters. I believe this is repaired by ES4. I do not know, but I hope that this is repaired in ES3.1 as well. If so, then Caja and Cajita will inherit this repair.

  • The legal JSON string '{"proto": 3}' cannot be correctly
# Brendan Eich (18 years ago)

There's a lot of implicit context here, some of which may be new to
es4-discuss readers. Also, not everything here is bound to become an
Ecma standard, as noted in mail I sent earlier today (3.1 could be a
TR and should be in the view of some on the TC39 committee). Comments
inline below, but I wanted to clarify, since es4-discuss has mainly
been about es4, not Caja*/SES/etc.

On Feb 20, 2008, at 4:27 PM, Mark S. Miller wrote:

At ses:ses Doug Crockford explains a rationale for a secure variant of EcmaScript, hereafter "ses". I am part of a team working on two such variants, Cajita and Caja (Caja is mentioned on Crock's page. Cajita is a small ADsafe-like subset of Caja). On the first day of the January EcmaScript meeting, Crock gave a presentation javascript.crockford.com/ses.ppt on the goals for an official ses. By Crock's criteria (which I like), Cajita and ADsafe would be candidates, but Caja would not due to its lack of minimalism. Caja and Cajita are currently defined by groups.google.com/group/google-caja-discuss/web/caja-spec.pdf

One piece of missing context: the TC39 committee generally seemed to
agree that developing and judging the winner among candidate "secure
dialects" was both a multi-year mission and beyond the capabilities
of TC39. Academics and others not willing or able to join Ecma would
be needed. We wouldn't be able to judge soundness based only on
formal methods -- we would need to see fairly broad real-world usage
and evaluate successes and failures. And so on.

I'm not saying SES is pie in the sky or out of bounds for es4-discuss
-- far from it. I have hopes for your work, which I said at the last
face to face was in front-runner position, if that can be said at
this early stage -- I think it can, based on bird-nearly-in-hand
simple-minded reasoning -- but I'd like to have other candidates to
evaluate.

I am saying that es4-discuss readers need to have expectations set
about development, deployment, judging, and then standardization,
if we are to follow Doug's beauty contest approach (which I like).

As we discussed it, the general sense was that the creation of an ses seems like a valuable idea. But on the first day, the process of deciding on an ses did not seem to be a natural extension of the work of the EcmaScript committee. Perhaps this would be a topic for a different committee on another day.

I said something stronger: it's not the proper job for a committee in
the closed-room, pay-to-play sense, to decide. Parliament hath not
the competence, to borrow from Robert Bolt's play. I suggested a
prize system with mechanized tests and proofs or other ways of
objective judging, plus white/gray hat open-source hack attacks over
a period of time. Just the thought of reps from Adobe, Apple, Google,
Microsoft, Mozilla, Opera, Yahoo! and other fine organizations
meeting to decide the winner leaves me cold.

Due to a suggestion of Kris Zyp, the ses discussion revived on the second day. Kris' suggestion (Kris - please correct any inaccuracies) is that ES4 include an sesEval operation (name tbd) that would evaluate its first argument as an ses program in a lexical scope provided by its second argument. The rules of ses would be chosen so that a containing page running ES4 could safely sesEval a script from an untrusted third party, safe in the knowledge that the only effect this sesScript could have on the containing page is according to the objects that the containing page explicitly provides to the script. (The only entry points into those ES4 objects accessible from ses would be according to some whitelisting mechanism, such as Caja and Cajita currently implement.)

We have experience in Mozilla with GreaseMonkey's injection of
privileged methods into "secure eval" sandboxes. Frankly, it has been
a trail of tears. The hazards are many and hard to see if the outer
language has things like .call/.apply per ES3, extensions such as
getters and setters, etc. A whitelist is obviously better than a
blacklist, but I'm concerned that there are so many hazards, so hard
to see, that you are creating what I called at the meeting a honey-pot for unwary programmers.

In that light, Geoff Garen of Apple suggested the method should be
called "unsafeEval" -- and I agreed!

  1. To the degree that we can accommodate it within our other design goals, I propose that ES3.1 evolve to replace ES3 in this inequality, so as to help repair this inequality. I am certainly willing to evolve Caja and Cajita in coordination with any such effort.

That's great to hear.

  2. I propose that Cajita be considered a candidate for ses.

By whom? We need a better way to judge, not merely good judges; we
need other candidates. There should be no near-term Ecma stamp of
approval that we rush to apply to some candidate, if I'm right about
the need for other people and extensive testing, including deployment.

  3. I propose that ES3.1 and therefore ES4 include an sesEval() (name tbd) that evaluates its first argument as an ses program in a lexical scope provided by its second argument.

We've been over this before, and not first in the es3.1: wiki -- the
original page is

proposals:resurrected_eval, discussion:resurrected_eval

I agree with the comments under "Resolved issues".

Now we could say something about the outer language and the kinds of
objects that could be injected. But now the secure dialect in the
sandbox is spreading its reference monitor or capability system into
the outer language, and that outer language can't be ES3, therefore
it can't be ES4-in-full (which is a superset of ES3, modulo de-facto
standards fixes).

  4. To facilitate the safe interaction between ses objects and ES3.1/ES4 objects:

4a) I notice that ES4 already has a notion of a "strict mode" flag. I propose that both ES3.1 and ES4 have behavior conditioned on a "strict mode" flag.

We've talked about this on this list and in the wiki a bit. The
current design does not impose runtime changes due to strict mode,
except to eval -- so there's an exception, and it may be that we can
tolerate a few more exceptional runtime semantic changes. But we're
on a slippery slope to two languages, where programs tested under one
mode fail under the other and interoperation suffers because
different parties do not use the same modes when testing.

I don't want to make this sound like the end of the world -- indeed,
it could be the right thing. But again, it seems like something to
consider, and it may be missing context (for some reading this, if
not for you and me ;-).

Our thought with ES4 has been: If strict mode (modulo eval) is merely
a checker that performs type, name, and other lint-like analyses and
stops certain programs that would run in standard mode from reaching
runtime, but otherwise does not impose runtime semantic changes, then
we probably have fewer interop bugs in the field over the next decade
or two.

4b) At bugs.ecmascript.org/ticket/276#comment:11 I propose that, in strict mode, if a function is called as a function (as opposed to calling it as a method, constructor, or reflectively), then its "this" is bound to undefined. In the absence of strict mode, "this" should be bound to the global object for ES3 compatibility.

Sliding down-slope another little bit.

4c) At the EcmaScript meeting, I proposed that a function called by call, apply, or bind should have "this" bound to the first argument of that call, bind, or apply -- no matter what value that is. In ES3, if the first argument of call or apply is null or undefined, the function's "this" is bound to the global object. (If I recall correctly, there was general agreement with this proposal by itself. If I'm remembering this inaccurately, please correct me.)

The minutes and trac should tell the truth, but I remember general
agreement on the global substitution for null and undefined being
considered a design flaw in ES3, which introduced call and apply.

Although we could also make this difference of behavior conditional on strict mode, I propose that it be unconditional.

Under these proposals,

f.apply(undefined, [a,b]) ==== f(a,b) ==== f.call(undefined, a,b)

True in ES3, so would be good to keep true in any strict mode,
whatever runtime semantic changes it selects from standard mode.

so we have the ability to reflectively call functions either as functions (as above) or as methods. This reflective ability would be present in ses as well. However, considering Cajita as a candidate ses reveals a functionality hole: Because Cajita contains "new" but does not allow manipulation of the prototype chain, Cajita has no ability to reflectively call constructors.

This too has been a topic on list and in wiki (and in Narcissus and
probably other JS-hosted JS implementations). See

www.google.com/search?hl=en&hs=e0L&q=site%3Amail.mozilla.org+newApply

So...

  5. I propose the addition of Function.prototype.newInstance(argList) such that

    f.newInstance([a,b]) ==== new f(a,b)

The proposal I like is ... as a unary prefix "splat" operator that
mimics ...'s usage in rest parameters:

function f(...rest) { return new g(...rest); }

The expression after the splat operator would have to evaluate to an
arguments or array object, call it A; this actual parameter would
have to be last, and it would then supply A.length actuals to the
called function starting with A[0] up to A[A.length-1]. (... is * in
Python for positional parameters).
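
A sketch of how this reads in practice, assuming an engine that supports the `...` spread syntax later standardized in ECMAScript 2015 (which does not require the spread argument to come last). `Pair` and `makePair` are illustrative names.

```javascript
function Pair(first, second) {
  this.first = first;
  this.second = second;
}

// Rest parameter collects the actuals; spread hands them on to operator new.
function makePair(...rest) {
  return new Pair(...rest);
}

const splatPair = makePair("a", "b");
// The same operator composes with an ordinary call, standing in for apply.
const largest = Math.max(...[3, 9, 4]);
console.log(splatPair instanceof Pair, splatPair.first, splatPair.second, largest);
```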

This has been discussed on-list and if it makes it into ES4, it would
be suboptimal to have f.newInstance too. But I have a bigger concern,
below.

5a) I separately propose that new f(a,b) be considered sugar for f.newInstance([a,b]), so that f can override newInstance to distinguish being invoked as a constructor from being invoked by other means.

This is a nice idea in the abstract, and any middle-aged language
that doesn't already follow the Zen of Python's "There should be
one-- and preferably only one --obvious way to do it" line could
perhaps afford to be expansive, and support both ... as splat and
newInstance.

However, desugaring new into a call to a metaprogramming function
without further security mechanism scares me. Right now a host-provided function object can insist that it must be invoked only via
operator new, and it can make its standard name binding(s) have the
DontDelete and ReadOnly attributes. This is important for integrity
in Firefox and other Gecko-based browsers for certain objects. Such
an object may have been around for years, during which it allowed ad-hoc ("expando") properties to be set on it, without the value of any
such property affecting its constructor-only policy, or what code
runs when it is invoked via operator new. For backward compatibility
it may be important to continue to support such ad-hoc property setting.
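
As a rough illustration of the two integrity properties described here, the sketch below uses script-level stand-ins: an `instanceof` check to approximate a constructor-only policy, and `Object.defineProperty` (standardized later, in ECMAScript 5) to approximate the DontDelete and ReadOnly attributes. The names `Widget` and `host` are hypothetical, and real host objects enforce these policies internally rather than in script.

```javascript
function Widget(label) {
  // Approximates a constructor-only policy: refuse plain and method calls.
  if (!(this instanceof Widget)) {
    throw new TypeError("Widget must be invoked via operator new");
  }
  this.label = label;
}

// Approximates a name binding with the DontDelete and ReadOnly attributes.
const host = {};
Object.defineProperty(host, "Widget", {
  value: Widget,
  writable: false,     // ReadOnly
  configurable: false, // DontDelete
  enumerable: true,
});

// Ad-hoc ("expando") properties can still be set without changing the policy.
host.Widget.note = "expando";

let plainCallRejected = false;
try {
  host.Widget("oops");
} catch (err) {
  plainCallRejected = err instanceof TypeError;
}
const bindingDeleted = Reflect.deleteProperty(host, "Widget");
console.log(plainCallRejected, bindingDeleted, new host.Widget("ok").label);
```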

Now add newInstance to Function.prototype. Do the high-integrity
constructors delegate to this method when invoked via new? You could
say that such host-provided objects need a host-provided prioritized
constructor protocol, and newInstance would not be consulted by it.
Ok, that might be sufficient. It seems necessary. Not saying
something like this seems like a bad idea. But now you are talking
about more than just a way to compose operator new with apply.

I like the splat proposal because it's orthogonal to constructor and
invocation protocols. It cleanly composes apply and new, as well as
generalizing apply into a special form (unary prefix operator that
can be used only in a trailing argument expression) which can be used
with any callable or constructible object.

# Mark Miller (18 years ago)

Thanks for the long and thoughtful answer. I think we have many points of agreement.

I'll be responding to your message point by point soon. Tonight I'll just mention a few that jumped out at me.

On Wed, Feb 20, 2008 at 7:35 PM, Brendan Eich <brendan at mozilla.org> wrote:

Now we could say something about the outer language and the kinds of objects that could be injected. But now the secure dialect in the sandbox is spreading its reference monitor or capability system into the outer language, and that outer language can't be ES3, therefore it can't be ES4-in-full (which is a superset of ES3, modulo de-facto standards fixes).

I do not understand this comment, and it seems crucial that I do. Can you please expand? Thanks.

4c) At the EcmaScript meeting, I proposed that a function called by call, apply, or bind should have "this" bound to the first argument of that call, bind, or apply -- no matter what value that is. In ES3, if the first argument of call or apply is null or undefined, the function's "this" is bound to the global object. (If I recall correctly, there was general agreement with this proposal by itself. If I'm remembering this inaccurately, please correct me.)

The minutes and trac should tell the truth, but I remember general agreement on the global substitution for null and undefined being considered a design flaw in ES3, which introduced call and apply.

That's great. The current behavior created, for Caja, a terrible privilege escalation vulnerability that we have figured out how to plug by unpleasant means we have yet to document or implement. Repairing this behavior will make a future Caja a more tractable and safer piece of engineering. Thanks!

  5. I propose the addition of Function.prototype.newInstance(argList) such that

    f.newInstance([a,b]) ==== new f(a,b)

The proposal I like is ... as a unary prefix "splat" operator that mimics ...'s usage in rest parameters:

function f(...rest) { return new g(...rest); }

When I first saw splat at bugs.ecmascript.org/ticket/276#comment:9 I immediately liked it. I was enthusiastic about using it in exactly the way you suggest. It's a nice idea -- my compliments. However, it fails the "no new syntax in ES3.1" rule, which I care about even more. Without splat or newInstance, ES3.1 is otherwise reflectively complete re invocation. There's no reason that reflective construction needs new syntax, so it would be a shame to keep it out of ES3.1 only for that reason.

5a) I separately propose that new f(a,b) be considered sugar for f.newInstance([a,b]), so that f can override newInstance to distinguish being invoked as a constructor from being invoked by other means.

This is a nice idea in the abstract, and any middle-aged language that doesn't already follow the Zen of Python's "There should be one-- and preferably only one --obvious way to do it" line could perhaps afford to be expansive, and support both ... as splat and newInstance.

However, desugaring new into a call to a metaprogramming function without further security mechanism scares me. Right now a host-provided function object [...]

Ok, you succeeded in scaring me too. I withdraw the suggestion until I come up with something better thought out. Thanks for catching this issue.

# Brendan Eich (18 years ago)

On Feb 20, 2008, at 6:10 PM, Mike Samuel wrote:

JSON ⊂ ADsafe ⊂ Cajita ⊂ Caja ⊂ ES3 ⊂ ES4

People who know Unicode are dangerous ;).

Yes, we need more of you ;-).

There's three problems according to my reading of http://www.ietf.org/rfc/rfc4627.txt but only the first is directly related to syntax:

(1) There are JSON programs that are not valid ES programs. The JSON program [ "\u2028" ] where the unicode escape is replaced with its literal equivalent is valid according to JSON since the set of characters that can appear in a string unescaped is

unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

but ES does not allow codepoint 0x2028 or 0x2029 to appear unescaped in a string since they are newline characters.

I wonder if JSON should not change on this point. Is there a use-case
for unescaped line/paragraph separators in strings?

(2) There are JSON programs that have the same text as ES programs but different meaning. ES262 says that all format control codepoints, such as 0x200C, should be stripped out of the program in a pre-lex phase. This is not consistently implemented: eval("'\u200c'.length") == 0 on SpiderMonkey, and 1 on most other interpreters.

Not lately, meaning post-Firefox-2/JS1.7. Fresh js shell, same
results for Firefox 3 any beta:

js> eval("'\u200c'.length") == 0
false
js> eval("'\u200c'.length")
1

See bugzilla.mozilla.org/show_bug.cgi?id=274152, where
SpiderMonkey yields to IE JScript's flouting of ECMA-262. IE set a
real-world web standard, and for the better according to people in
certain locales.

According to bugzilla.mozilla.org/show_bug.cgi?id=368516#c34,
IE does not report illegal character errors correctly, instead
treating misplaced BOMs as identifiers whose references result in
runtime ReferenceErrors (I don't know what it does with other format-control characters that occur outside of strings and regexps).

See also the follow-on bug to tolerate mislocated BOMs, https://bugzilla.mozilla.org/show_bug.cgi?id=368516. Ain't the copy/paste
Internet grand?

JSON does not strip these characters out, so they are treated as
significant.

ES4 is specifying as a bug fix to match other browsers that format-control characters shall not be stripped; it must also, to be a real-world web standard, specify tolerance for mislocated BOMs. Postel's
Law bites back!

So JSON and ES4 will agree on this one.
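
A quick check of that agreement, assuming an engine that preserves format-control characters in source text as described here. The zero width non-joiner is built with `String.fromCharCode` so that the snippet itself contains no invisible characters. `JSON.parse` is the ECMAScript 5 API, and the variable names are illustrative.

```javascript
// U+200C ZERO WIDTH NON-JOINER, a Unicode format-control (Cf) character.
const zwnj = String.fromCharCode(0x200c);

// Evaluate a string literal whose source text contains the raw character.
const lengthViaEval = eval("'" + zwnj + "'.length");

// Parse the equivalent JSON string containing the same raw character.
const lengthViaJson = JSON.parse('"' + zwnj + '"').length;

console.log(lengthViaEval, lengthViaJson);
```

An engine that still stripped the character in a pre-lex phase would report 0 for the first value while the JSON parse reported 1.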

(3) There are JSON programs that can be parsed to ES but that cannot be serialized back to JSON without losing track of where info was lost. JSON does not put any limits on numbers, but ES does. ES will treat 1e1000 as Infinity. Since JSON does not have a value Infinity, it is unclear how to implement toJSON(fromJSON("[1e1000]")).

JSON's grammar is nice and simple, it facilitates exhaustive testing
(Rob Sayre used Koushik Sen's jCUTE to generate all-paths tests for a
Java implementation).

BigInts or BigNums could help in the future, but the installed base
will not have them for a while and their literal syntax, without a
pragma, will have a suffix.

This kind of edge case is unlikely to be a problem in practice,
although such "overflow" conditions recur throughout the security
exploit literature. Could JSON stand to grow support for the IEEE-754
non-finite values?

# Brendan Eich (18 years ago)

On Feb 20, 2008, at 10:48 PM, Mark Miller wrote:

On Wed, Feb 20, 2008 at 7:35 PM, Brendan Eich <brendan at mozilla.org>
wrote:

Now we could say something about the outer language and the kinds of objects that could be injected. But now the secure dialect in the sandbox is spreading its reference monitor or capability system into the outer language, and that outer language can't be ES3, therefore it can't be ES4-in-full (which is a superset of ES3, modulo de-facto standards fixes).

I do not understand this comment, and it seems crucial that I do. Can you please expand? Thanks.

I'll be concrete and talk about GreaseMonkey. A GM user script is
evaluated in a sandbox, but privileged outer code first injects
certain methods into the sandbox. Those functions delegate to their
prototype for certain properties, notably Function.prototype.apply/call and the constructor property.
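
A small sketch of why that delegation matters, assuming nothing about GreaseMonkey's actual API: `injectedHelper` is a hypothetical stand-in for a privileged method handed to the sandbox. Any function reaches `Function` through its inherited `constructor` property, and `Function` can compile code that runs outside the sandbox's lexical scope.

```javascript
// Hypothetical stand-in for a privileged function injected into a sandbox.
function injectedHelper(x) {
  return x * 2;
}

// The injected function inherits apply/call and constructor from Function.prototype.
const inheritsApply = injectedHelper.apply === Function.prototype.apply;
const reachesFunction = injectedHelper.constructor === Function;

// From there, sandboxed code can compile and run code against the global object.
const escapedGlobal = injectedHelper.constructor("return this;")();

console.log(inheritsApply, reachesFunction, escapedGlobal === globalThis);
```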

If a GM user script must now be written in a secure dialect, is it
sufficient to ban all writes to computed property names, and to
literal names not on the whitelist?