SES Progress (was: Fwd: AST in JSON format)

# Mark S. Miller (16 years ago)

[Resending, in order to change the subject. Please reply to this one.]

On Tue, Dec 8, 2009 at 8:51 PM, Mark S. Miller <erights at google.com> wrote:

On Tue, Dec 8, 2009 at 7:59 PM, Oliver Hunt <oliver at apple.com> wrote:

Providing an AST doesn't get you anything substantial here as the hard part of all this is validation, not parsing.

Given ES5 as a starting point,

  1. validation for many interesting purposes, especially security, is no longer hard,
  2. the subset restrictions need no longer be severe, and
  3. the issue isn't what's hard but what's slow and large. Lexing and parsing JS accurately is slow. Accurate JS lexers and parsers are large. Even if JS is now fast enough to write a parser competitive with the one built into the browsers, this parser would itself need to be downloaded per frame. Even if all downloads of the parser code hit on the browser's cache, the parser would still need to be parsed per frame that needed it (unless browsers cache a frame-independent parsed representation of JS scripts).

I am currently working on just such a validator and safe execution environment -- assuming ES5 and a built-in parser-to-AST API. Going out on a limb, I expect it to have a small download, a simple translation, no appreciable code expansion, and no appreciable runtime overhead. Once I've posted it, we can reexamine my claims above against it.

Work in progress is at <code.google.com/p/es-lab/source/browse/trunk/src/ses>.

This SES implementation is not actually quite complete yet. Even once it seems complete, we can't test it until there is an available ES5 implementation to run it on. However, it is complete enough, and we are confident enough about what it will be once bugs are fixed, that we can try assessing the limb I climbed out on above. Still, until it is tested, all this should be taken with some salt.

  • All of SES can be implemented in any ES5 implementation satisfying a few additional constraints, which we're trying to accumulate at <code.google.com/p/es-lab/wiki/SecureableES5>.

  • The implementation sketch shown there depends on two elements not currently provided by ES5 or the browser:

    • A Parser->AST API, for which Tom wrote an OMeta/JS parser at <code.google.com/p/es-lab/source/browse/trunk/src/parser/es5parser.ojs> that does run in current JavaScript, producing the ASTs described at <code.google.com/p/es-lab/wiki/JsonMLASTFormat> and available at <es-lab.googlecode.com/svn/trunk/site/esparser/index.html>.
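For concreteness, a JsonML AST node is an array of the form [tagName, attributeObject, ...childNodes]. The particular tag and attribute names in the sketch below are illustrative assumptions, not necessarily the exact vocabulary of the wiki page; the point is the uniform shape, which lets tools walk the tree generically and ship it as plain JSON:

```javascript
// A JsonML AST node is an array: [tagName, attributeObject, ...childNodes].
// The tag and attribute names here are illustrative assumptions.
var ast =
  ["BinaryExpr", {"op": "+"},
    ["IdExpr", {"name": "x"}],
    ["LiteralExpr", {"type": "number", "value": 1}]];

// Because the shape is uniform, a generic walk needs no per-node-type
// knowledge: children always start at index 2.
function countNodes(node) {
  var count = 1;
  for (var i = 2; i < node.length; i += 1) {
    count += countNodes(node[i]);
  }
  return count;
}

// JsonML is plain JSON, so ASTs survive serialization unchanged.
var roundTripped = JSON.parse(JSON.stringify(ast));
```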

    • An object-identity-based key/value table, such as the EphemeronTables from the weak-pointer strawman. Assuming such tables made the SES runtime initialization a touch easier to write. But I can (and probably should) refactor the SES runtime initialization so that it does not use such a table, just to see how adequate ES5 itself already is at supporting SES.
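The interface such a table needs is small. As a rough sketch (my own illustration, not the strawman's API), here is an identity-keyed table in plain ES5. Unlike a real EphemeronTable it holds its keys strongly, so it would leak memory; it only demonstrates the identity-based lookup that the runtime initialization assumes:

```javascript
// An identity-keyed table in plain ES5: a sketch of the interface, not the
// EphemeronTable strawman itself. Keys are held strongly, so this leaks.
function makeTable() {
  var keys = [];
  var vals = [];
  return {
    set: function (key, val) {
      var i = keys.indexOf(key);  // indexOf compares objects by identity
      if (i < 0) {
        keys.push(key);
        vals.push(val);
      } else {
        vals[i] = val;
      }
    },
    get: function (key) {
      var i = keys.indexOf(key);
      return i < 0 ? undefined : vals[i];
    }
  };
}
```

(Today this role is filled by the standard WeakMap, which grew out of that strawman.)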

  • Given a parser, the rest of the SES startup is indeed small and fast. For an SES in which the parser needs to be provided in JS, the parser will dominate the size of the SES runtime library. The SES verifier is really trivial by comparison with any accurate JS parser.

  • We are enumerating the subset restrictions imposed by SES at <code.google.com/p/es-lab/wiki/SecureEcmaScript>. For JS code that is already written to widely accepted best practice, such as no monkey patching of primordials, I would guess these restrictions to be quite reasonable. This is the area where we need the most feedback -- is there any less restrictive or more pleasant object-capability subset of ES5 than the one described here?
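As a concrete illustration of the "no monkey patching of primordials" practice: SES freezes the primordials at initialization, after which a patch attempt simply does not stick. The toy example below freezes just Array.prototype to show the effect in ES5:

```javascript
// SES freezes the primordials at initialization. Here we freeze just
// Array.prototype and show that a later monkey patch does not stick:
// non-strict code has the assignment silently ignored, while strict-mode
// code gets a TypeError.
Object.freeze(Array.prototype);

try {
  // A would-be monkey patch, as code outside the subset might attempt:
  Array.prototype.last = function () { return this[this.length - 1]; };
} catch (e) {
  // Strict-mode callers land here with a TypeError.
}

var wasPatched = ("last" in Array.prototype); // false: the patch did not stick
```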

  • Because this SES implementation is verification-based, not translation-based, there is no code expansion.

  • SES does no blacklisting and no runtime whitelisting. It does all its whitelisting only at initialization and verification time.
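To sketch what initialization-time whitelisting means (the whitelist contents below are invented for illustration, not SES's actual whitelist): properties not reachable through the whitelist are deleted once, at startup, on a mock global object here so the real one is untouched. Afterward no per-access runtime checks remain:

```javascript
// Initialization-time whitelisting, sketched with an invented whitelist.
// A `true` permit keeps a leaf property; an object permit keeps the
// property and recurses into it with the sub-whitelist.
var whitelist = {
  JSON: { parse: true, stringify: true }
};

function clean(obj, permits) {
  Object.getOwnPropertyNames(obj).forEach(function (name) {
    var permit = permits[name];
    if (!permit) {
      delete obj[name];           // not whitelisted: removed once, at startup
    } else if (typeof permit === "object") {
      clean(obj[name], permit);   // recurse with the sub-whitelist
    }
  });
}

// A mock global, so we do not destructively clean the real one:
var mockGlobal = {
  JSON: { parse: function () {}, stringify: function () {}, extra: 1 },
  leakyGlobal: 42
};
clean(mockGlobal, whitelist);
// Only whitelisted paths remain; no runtime whitelisting is needed later.
```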

  • Aside from "eval", "Function", "RegExp.prototype.exec", "RegExp.prototype.test", and initialization of the SES runtime (which, given a built-in parser, should be fast), SES has no runtime overhead. This also applies to "eval" and "Function" themselves. All their overhead is in starting up. Given a fast parser, this startup overhead should be small. After startup, code run by either eval or Function has no remaining runtime overhead.
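The shape of that overhead can be sketched as follows (a hypothetical wrapper of my own devising, not the actual SES code): the replacement eval parses and verifies once per call, then hands the verified source to the underlying eval, so the evaluated code itself runs at full speed afterward:

```javascript
// Hypothetical shape of a verifying eval. All names here are stand-ins:
// `parse` and `verify` represent the parser->AST API and the SES verifier.
function makeSafeEval(realEval, parse, verify) {
  return function safeEval(src) {
    verify(parse(src));    // all of the overhead happens here, up front
    return realEval(src);  // the verified code runs with no added overhead
  };
}

// Wiring it up with stubs, just to show the call order:
var calls = [];
var safeEval = makeSafeEval(
  function (src) { calls.push("eval"); return 7; },        // stub eval
  function (src) { calls.push("parse"); return ["Program", {}]; },
  function (ast) { calls.push("verify"); }
);
var result = safeEval("1 + 1");
```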