What have we done about the mutable global object? (was Re: Es4-discuss Digest, Vol 8, Issue 44)

# Brendan Eich (18 years ago)

On Oct 30, 2007, at 6:14 PM, Brendan Eich wrote:

> ES4 provides const, fixed typename bindings, lexical scope (let),  
> program units, and optional static type checking -- all of which  
> *do* make ES4 code easier to analyze and instrument to enforce  
> security properties.

I left out the intrinsic namespace from the above litany, but Graydon  
nicely pointed it out. I wonder if it's understood from the overview,  
and from the proposal page:

http://wiki.ecmascript.org/doku.php?id=proposals:intrinsic_namespace

This message tries to summarize how intrinsic and related new ES4  
facilities such as fixtures work together to improve integrity  
against XSS threats. It's late, so I hope I got everything right; my TG1
colleagues will correct me if not. You can test in the RI (the ES4
reference implementation) to see intrinsic in action -- in particular,
there's an intrinsic::print method, which is the P in our REPL.

The intrinsic namespace evolved in part from the desire to support  
opt-in early binding of standard methods, which is a feature of the  
ECMA-262 Edition 3 Compact Profile
(http://www.ecma-international.org/publications/standards/Ecma-327.htm).
Here's how it works:

* For each standard prototype method, there is a method with the same  
unqualified (local) name, but in the intrinsic namespace. So  
Array.prototype.slice is paired with a method named intrinsic::slice  
in class Array.

* The prototype method typically calls the intrinsic method on its  
dynamic |this|, e.g.

         /* E262-3 15.6.4.3: Boolean.prototype.valueOf. */
         prototype function valueOf(this: AnyBoolean)
             this.intrinsic::valueOf();

from builtins/Boolean.es in the RI (note the expression closure  
syntax), but for Array, String, and string generics, there's a static  
generic method to delegate to from both the prototype and intrinsic  
method:

         prototype function slice(start, end)
             Array.slice(this,
                         start === undefined ? 0 : Number(start),
                         end === undefined ? Infinity : Number(end));

         intrinsic function slice(start: AnyNumber=0, end: AnyNumber=Infinity): Array
             Array.slice(this, start, end);

(Note the extra special logic in the prototype slice method required  
by ES3.)

The upshot is that this pairing of prototype and intrinsic methods is  
not expensive in terms of code footprint.
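
To make the pairing concrete, here is a rough sketch in present-day JavaScript rather than ES4. Everything in it is an assumption for illustration: the Seq class is hypothetical, a Symbol stands in for the intrinsic namespace, and a manual typeof check stands in for the AnyNumber annotations. It is not RI code.

```javascript
// Hypothetical stand-in for the name intrinsic::slice (JavaScript has no namespaces).
const intrinsicSlice = Symbol("intrinsic::slice");

class Seq {
  constructor(items) {
    this.items = Array.from(items);
  }

  // Shared static generic: the one place where the real work happens.
  static slice(self, start, end) {
    return new Seq(self.items.slice(start, end));
  }

  // Loosely typed "prototype" method: accepts anything, then applies
  // ES3-style defaults and coercion before delegating.
  slice(start, end) {
    return Seq.slice(this,
                     start === undefined ? 0 : Number(start),
                     end === undefined ? Infinity : Number(end));
  }

  // Precisely typed "intrinsic" counterpart: no coercion, just a check
  // (emulating the type annotations) and the same delegation.
  [intrinsicSlice](start = 0, end = Infinity) {
    if (typeof start !== "number" || typeof end !== "number") {
      throw new TypeError("intrinsic slice expects numeric arguments");
    }
    return Seq.slice(this, start, end);
  }
}

const seq = new Seq([1, 2, 3, 4]);
console.log(seq.slice("1").items);            // [ 2, 3, 4 ]
console.log(seq[intrinsicSlice](1, 3).items); // [ 2, 3 ]
```

Both entry points are one-liners over the static helper, which is the footprint point made above.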

* The intrinsic methods are fixtures: DontDelete bindings searched  
ahead of the prototype chain.
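
As a loose analogy only, the effect of a fixture can be imitated in today's JavaScript with a non-writable, non-configurable own property, which is found before the prototype chain is consulted. ES4 fixtures belong to the class rather than to each object, so this per-object sketch (with hypothetical names) shows the observable behavior, not the mechanism.

```javascript
// A prototype that anyone can patch, as with ES3 standard prototypes.
const basePrototype = {
  describe() { return "prototype version"; },
};

const instance = Object.create(basePrototype);

// Emulated fixture: an own property that cannot be deleted or reassigned.
Object.defineProperty(instance, "describe", {
  value: function () { return "fixed version"; },
  writable: false,
  configurable: false, // roughly the old DontDelete attribute
  enumerable: false,
});

// Reflect.* report failure as false in both sloppy and strict code.
const deleteSucceeded = Reflect.deleteProperty(instance, "describe");
const assignSucceeded = Reflect.set(instance, "describe", () => "hijacked");

// Patching the prototype does not matter: the own property wins the lookup.
basePrototype.describe = () => "patched prototype";

console.log(deleteSucceeded, assignSucceeded); // false false
console.log(instance.describe());              // fixed version
```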

* These methods are type-annotated with more precise argument and  
return types, e.g. while Array.prototype.slice has signature

   function (start: *, end: *): *

(not spelled out, of course: (start, end) is enough for the formal  
parameter list -- this isn't Java :-P), the Array intrinsic::slice  
fixed method has type

   function (start: AnyNumber, end: AnyNumber): Array

* Users opting into early binding use this pragma:

   use namespace intrinsic;

at the top of a program unit or block. This opens the intrinsic  
namespace so that the name slice in an expression a.slice(i, j),  
where a has type Array, will resolve as a.intrinsic::slice(i, j).
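
The integrity benefit of early binding can be imitated in plain JavaScript by capturing a reference to the standard method before any untrusted code runs. This is only an analogy: in ES4 the pragma changes name resolution, whereas the sketch below (with the assumed helper name earlySlice) just holds on to the original function.

```javascript
// Capture the standard method up front; earlySlice(arr, i, j) then behaves
// like the original Array.prototype.slice.call(arr, i, j).
const earlySlice = Function.prototype.call.bind(Array.prototype.slice);

const originalSlice = Array.prototype.slice;
const data = [1, 2, 3, 4];
let lateResult;
let earlyResult;

try {
  // Simulate hostile or careless code replacing the prototype method.
  Array.prototype.slice = function () { return ["hijacked"]; };

  lateResult = data.slice(1, 3);        // late-bound: sees the replacement
  earlyResult = earlySlice(data, 1, 3); // early-bound: unaffected
} finally {
  // Always restore the standard method.
  Array.prototype.slice = originalSlice;
}

console.log(lateResult);  // [ 'hijacked' ]
console.log(earlyResult); // [ 2, 3 ]
```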

* Independently of the intrinsic namespace, and useful as well, one
can say the following (implementations may support it or simply treat
it as standard mode):

   use strict;

to run a type checker that also does basic lint-like name-binding and  
other sanity checking.

There's still some debate over how much analysis strict mode should  
do, but this mode is completely optional at the ES4 implementation's  
discretion -- no need for it on cell phones. The only run-time effect  
of strict mode is a change to eval, to prevent it from using var,  
const, or function against its caller's variable object, while  
allowing let. This prevents eval from clobbering caller bindings  
(function can replace an existing binding in ES3), and it avoids  
injection of novel eval-created bindings into the dynamic scope.

Apart from the lint-like sanity checks, the model for strict mode is
partial evaluation: a program that makes it past strict mode has the
same runtime semantics as it would in standard mode, except for the
eval restriction above. Thus strict mode simply prevents certain
programs that might run (possibly even run correctly) from reaching
run-time.
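
The eval restriction is easiest to see by example. The sketch below does not run ES4: it uses the strict mode that engines implement today (from ECMAScript 5), which is a different design but likewise keeps var declarations made by eval out of the caller's scope. The function names are hypothetical, and new Function is used only so that the first function is non-strict no matter how the surrounding file is loaded.

```javascript
// Non-strict caller: eval's var lands in the caller's variable object.
const sloppyCaller = new Function(
  'eval("var injected = 1"); return typeof injected;'
);

// Strict caller: eval gets its own scope, so nothing is injected.
const strictCaller = new Function(
  '"use strict"; eval("var injected = 1"); return typeof injected;'
);

console.log(sloppyCaller()); // number
console.log(strictCaller()); // undefined
```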

* It isn't just prototype methods that have intrinsic-qualified fixed  
method counterparts: class static methods such as Date.parse, the  
Math object's methods, and familiar top-level functions such as  
parseInt and hashcode, also have intrinsic counterparts. So too do  
the built-in operator generic functions (a.k.a. multimethods), e.g.  
intrinsic::=== which goes with intrinsic::hashcode like peas and  
carrots: see the Map proposal at
http://wiki.ecmascript.org/doku.php?id=proposals:dictionary.
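
Why equality and hashing travel together: a hash-based dictionary only works if keys that compare equal also hash alike. Here is a minimal JavaScript sketch of a dictionary parameterized by both functions. HashDict and stringHash are hypothetical names, and nothing here is taken from the Map proposal's actual API.

```javascript
// Simple 32-bit string hash (illustrative only).
function stringHash(text) {
  let h = 0;
  for (let i = 0; i < text.length; i++) {
    h = (Math.imul(h, 31) + text.charCodeAt(i)) >>> 0;
  }
  return h;
}

class HashDict {
  // equals and hashcode must agree: equals(a, b) implies
  // hashcode(a) === hashcode(b). The defaults pair === with a
  // hash of the key's string form.
  constructor(equals = (a, b) => a === b,
              hashcode = (key) => stringHash(String(key))) {
    this.equals = equals;
    this.hashcode = hashcode;
    this.buckets = new Map(); // hash value -> array of { key, value }
  }

  put(key, value) {
    const h = this.hashcode(key);
    let bucket = this.buckets.get(h);
    if (!bucket) {
      bucket = [];
      this.buckets.set(h, bucket);
    }
    const entry = bucket.find((e) => this.equals(e.key, key));
    if (entry) entry.value = value;
    else bucket.push({ key, value });
  }

  get(key) {
    const bucket = this.buckets.get(this.hashcode(key)) || [];
    const entry = bucket.find((e) => this.equals(e.key, key));
    return entry ? entry.value : undefined;
  }
}

// Case-insensitive keys: both functions normalize the same way.
const headers = new HashDict(
  (a, b) => a.toLowerCase() === b.toLowerCase(),
  (key) => stringHash(key.toLowerCase())
);
headers.put("Content-Type", "text/html");
console.log(headers.get("content-type")); // text/html
```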

The benefits include faster method lookups in lightweight  
implementations, better type error detection at compile time thanks  
to the more precise intrinsic function types, and of course the main  
point Graydon made: integrity. You can be sure nothing has replaced a  
standard library binding in the mutable global object or a standard  
prototype.

Beyond intrinsic, as I mentioned previously, there are tools for  
library authors to tame the global object and other mutable object  
hazards in the language:
* const and block scope (let) provide immutable bindings and
block-wise isolation/shadowing.
* The global properties Object, Array, etc. are constant in ES4.
* Type annotations on function definitions constrain the binding of  
the function name to have a compatible type.
* const function can be used to prevent the value as well as the type  
of the binding from being changed later.
* And of course, the static type checker available if an  
implementation supports 'use strict' can find type errors, which can  
point to security holes.
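
The first item in this list can be tried in any current JavaScript engine, since const and let later made it into the language in similar form. A small sketch with hypothetical variable names (it says nothing about the ES4-specific items such as const function or constant globals):

```javascript
// An immutable binding: reassignment fails at run time.
const limit = 10;
let reassignError = null;
try {
  limit = 20; // throws TypeError; the binding keeps its value
} catch (err) {
  reassignError = err;
}

// Block-wise isolation: the inner let shadows the outer one
// without touching it.
let mode = "outer";
let seenInside;
{
  let mode = "inner";
  seenInside = mode;
}

console.log(reassignError instanceof TypeError, limit); // true 10
console.log(seenInside, mode);                          // inner outer
```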

Verification is a linchpin for further analyses that I expect to be  
researched (hybrid information flow, e.g.) on top of ES4, and where  
the research yields solid results, we hope to incorporate that  
knowledge into future editions of the standard. Reasoning about ES4  
code is thus easier, or even possible in the first place, compared to  
ES3 code which requires conservative static analysis or else hybrid  
static and dynamic (runtime instrumentation) techniques.

All of these facilities were added for several good reasons,  
verification but also integrity among them. This was done while  
preserving backward compatibility of ye olde mutable global object  
and standard prototypes. It may be that these facilities are simply  
not well understood. I hope it's clear now that they address, as best  
those of us working together in TG1 know how, the hazards of the  
mutable global object -- while not breaking the web. Any questions?

/be