Classes as Sugar -- old threads revisited

# Mark S. Miller (15 years ago)

At the Harmony portion of the recent EcmaScript meeting, we took up the classes as sugar discussion. For background, the relevant recent threads on es-discuss are:

"Look ma, no this" thread starting at esdiscuss/2008-November/008181

"How much sugar do classes need?" thread starting at esdiscuss/2008-November/008181 especially Peter Michaux's esdiscuss/2008-November/008185

"Block exprs as better object literals" thread starting at esdiscuss/2008-December/008521 continuing in January starting at esdiscuss/2009-January/008525

The discussion was a bit frustrating because the committee was largely unaware of the last two of these threads. Although I do recommend reviewing them, I will here restate and refine my last suggestion in a self-contained pair of messages. Again, I would like to thank Peter Michaux for the general approach.

Since the proposal I'm about to make will be in terms of a desugaring to other elements of ES-Harmony, in this first message I will recap and speculate on the needed elements of ES-Harmony. Since "lambda" is still controversial, I will avoid it. But I will introduce instead the minimal replacement I still need -- a "let" expression which respects Tennent Correspondence. I would hope that this "let" actually desugars to "lambda", but I do not assume this here. If we do adopt "lambda", then all the desugarings I present to functions should instead be to lambdas.

We need to refactor the ES5 grammar a bit. ES5 has two statement-level declaration productions, VariableStatement and FunctionDeclaration. In ES5, VariableStatement is included in Statement, whereas FunctionDeclaration is included in SourceElement, which reads

SourceElement: // ES5
  Statement
  | FunctionDeclaration

The reason is that ES5 prohibits FunctionDeclarations in nested blocks, permitting them only at the top level of Program and FunctionBody.

Since ES-Harmony will allow lexically nested FunctionDeclarations as well as "let" and "const" declarations, all with proper block-level lexical scope, let's rename VariableStatement to VariableDeclaration and refactor the grammar as:

Declaration:
  VariableDeclaration
  | FunctionDeclaration
  | ConstFunctionDeclaration
  | LetDeclaration
  | ConstDeclaration
  | ClassDeclaration

with ClassDeclaration explained in the next message. The important point for now is that a ClassDeclaration desugars, in effect, to a function declaration. Like a harmonious FunctionDeclaration, the name declared by a ClassDeclaration has proper block-level lexical scope, and the initialization of this name to the function is hoisted to the beginning of the block so no uninitialized state is observable.

LetDeclaration: "let" Identifier (":" Expression)_opt "=" Expression

ConstDeclaration: "const" Identifier (":" Expression)_opt "=" Expression

The optional (":" Expression) is for dynamic type checking, where the expression is evaluated to a guard value in the current lexical scope, and that guard value is somehow used to represent type-like constraints on the values that may be bound to the variable it guards.

Statement: // current Statement contents without VariableStatement

SourceElement: Statement | Declaration

Block: "{" SourceElements_opt "}"

MemberExpression: // current MemberExpression contents
  | LetExpression
  | ObjectExpression

with ObjectExpression explained in the next message.

LetExpression: "let" Bindings_opt "{" SourceElements_opt Expression "}"

For purposes of this note, we can assume Bindings_opt is absent. Likewise, this note has no need for a LetStatement.

The semantics of the LetExpression (due, IIRC, to a suggestion of Dave Herman) is to evaluate the parts between the curlies as a nested block, where the value of the LetExpression is the value of the terminal Expression in its body.
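The proposed "let" expression never became standard syntax, so the following is only a rough approximation in present-day JavaScript, using an immediately invoked arrow function. The names `area`, `w`, and `h` are illustrative and not from the thread. The arrow function keeps the enclosing `this`, but a `return`, `break`, or `continue` written inside it would not refer to the enclosing function or loop, so this stand-in does not fully respect Tennent Correspondence.

```javascript
// Hypothetical source in the proposed (never standardized) syntax:
//   const area = let { const w = 3; const h = 4; w * h };

// Approximation: the declarations stay local to the function body, and the
// terminal expression becomes the return value.
const area = (() => {
  const w = 3;
  const h = 4;
  return w * h;
})();

console.log(area);     // 12
console.log(typeof w); // "undefined": w did not leak out of the block
```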

The ConstFunctionDeclaration above elaborates on someone's suggestion, I forget who, for a syntax like:

ConstFunctionDeclaration: "const" Identifier "(" FormalParameterList_opt ")" "{" FunctionBody "}"

This is just like the FunctionDeclaration syntax except for the use of "const" in the position where "function" normally appears. Like a FunctionDeclaration and a ClassDeclaration, a ConstFunctionDeclaration declares a block-scoped hoisted function. For example,

const foo(p1, p2) { body; }

desugars to

const foo = Object.freeze(function foo(p1, p2) { body; }); Object.freeze(foo.prototype);

where this pair is hoisted to the top of its enclosing block. It would have been more elegant if we could have desugared directly to a FunctionDeclaration, in order to reuse the latter's hoisting machinery. However, we can't, since the variable introduced by FunctionDeclaration is mutable ("let"-like) and there's no way to get the freezing of the function or its prototype to be automagically hoisted as well. Hopefully the introduction of "lambda" will provide a more elegant way to address this need. If neither lambda nor ConstFunctionDeclaration is accepted into ES-Harmony, then consider the latter as only an explanatory device for other desugarings to be presented shortly.
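The desugared pair can be tried directly in present-day JavaScript. This is a sketch only: the body of `foo` is a placeholder, and the hoisting described above is not reproduced, because a standard `const` binding cannot be read before its declaration has been evaluated.

```javascript
// Desugared form of the proposed `const foo(p1, p2) { body; }`.
// The body (adding the two parameters) is a placeholder for illustration.
const foo = Object.freeze(function foo(p1, p2) {
  return p1 + p2;
});
Object.freeze(foo.prototype);

console.log(Object.isFrozen(foo));           // true
console.log(Object.isFrozen(foo.prototype)); // true
console.log(foo(1, 2));                      // 3
```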

# Mark S. Miller (15 years ago)

Part 2 of 2

Since a class declaration is like a function declaration, we use a similar syntax:

ClassDeclaration: "class" Identifier "(" FormalParameterList_opt ")" "{" ObjectBody "}"

In this note, we explain the desugaring of ClassDeclaration via an intermediate desugaring to ObjectExpression.

ObjectExpression: "object" ("implements" Expression)_opt "{" ObjectBody "}"

where

class Foo(p1, p2) { ... }

desugars to

const Foo(p1, p2) { return object implements Foo { ... }; }

which desugars to

const Foo = Object.freeze(function Foo(p1, p2) { return object implements Foo { ... }; });

As with ConstFunctionDeclaration, if ObjectDeclaration is not accepted into ES-Harmony, then consider it only an explanatory device.

ObjectBody:
  Statement
  | Declaration
  | "public" Declaration
  | "public" Identifier (":" Expression)_opt "=" Expression
  | "public" Identifier "(" FormalParameterList_opt ")" "{" FunctionBody "}"

where

public x :T = y;

desugars to

public const x :T = y;

and

public foo(p1, p2) { body; }

desugars to

public const foo(p1, p2) { body; }

Revisiting Peter's example,

class Point(privX, privY) {
  let privInstVar = 2;
  const privInstConst = -2;
  public toString() {
    return ('<' + getX() + ',' + getY() + '>');
  };
  public getX() { return privX; };
  public getY() { return privY; };
  public let pubInstVar = 4;
  public pubInstConst = -4;
}

desugars to

const Point(privX, privY) {
  return object implements Point {
    let privInstVar = 2;
    const privInstConst = -2;
    public toString() {
      return ('<' + getX() + ',' + getY() + '>');
    };
    public getX() { return privX; };
    public getY() { return privY; };
    public let pubInstVar = 4;
    public pubInstConst = -4;
  };
}

which desugars to a hoisted

const Point = Object.freeze(function(privX, privY) {
  return object implements Point {
    let privInstVar = 2;
    const privInstConst = -2;
    public const toString() {
      return ('<' + getX() + ',' + getY() + '>');
    };
    public const getX() { return privX; };
    public const getY() { return privY; };
    public let pubInstVar = 4;
    public const pubInstConst = -4;
  };
});
Object.freeze(Point.prototype);

The remaining elements needing explanation, ObjectExpression, ObjectBody, and

"public" Declaration

desugar together into a LetExpression whose final expression is created by gathering together representatives of the "public" declarations. The intent of the "implements" clause is that the value of the object expression be tagged somehow with an unforgeable nominal type, such that this value is able to pass the corresponding guard. For now, I will take a shortcut and assume that when a guard-value is a function, that the dynamic type-like test is approx

isFrozenProp(guard, 'prototype') && (specimen instanceof guard)

If the function's 'prototype' property is frozen, then instanceof is at least a monotonic test. However, it is effectively forgeable -- it guarantees no useful property -- since anyone may create an object that passes this test but has arbitrarily weird behavior. (Thanks to Waldemar for emphasizing this point at our last meeting.) In order to have a high integrity desugaring of ClassDeclaration or ObjectExpression, we need better lower level support for some kind of trademarking mechanism. We will need to revisit this issue, but not in this note.
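The forgeability of this approximate test can be seen in a short sketch. `isFrozenProp` is named in the message but not defined there, so the version below is an assumed reading (an own data property that is neither writable nor configurable). `passesGuard`, `genuine`, and `forged` are hypothetical names, and `Point` is cut down to a single method.

```javascript
// Assumed reading of isFrozenProp: the property is an own data property that
// can no longer be reassigned or reconfigured.
function isFrozenProp(obj, name) {
  const desc = Object.getOwnPropertyDescriptor(obj, name);
  return desc !== undefined && 'value' in desc &&
         !desc.writable && !desc.configurable;
}

// Hypothetical helper combining the two halves of the approximate test.
function passesGuard(guard, specimen) {
  return isFrozenProp(guard, 'prototype') && specimen instanceof guard;
}

const Point = Object.freeze(function Point(x, y) {
  return Object.freeze(Object.create(Point.prototype, {
    getX: { value: Object.freeze(function () { return x; }) }
  }));
});
Object.freeze(Point.prototype);

const genuine = Point(3, 5);

// A forger needs no access to Point's internals, only to Point.prototype.
const forged = Object.create(Point.prototype, {
  getX: { value: function () { return 'not a number'; } }
});

console.log(passesGuard(Point, genuine)); // true
console.log(passesGuard(Point, forged));  // true: the test is forgeable
```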

With this caveat, our example further desugars to

const Point = Object.freeze(function(privX, privY) { return let {
  // hoisted functions first
  const toString = Object.freeze(function() {
    return ('<' + getX() + ',' + getY() + '>');
  });
  Object.freeze(toString.prototype);
  const getX = Object.freeze(function() { return privX; });
  Object.freeze(getX.prototype);
  const getY = Object.freeze(function() { return privY; });
  Object.freeze(getY.prototype);

  let privInstVar = 2;
  const privInstConst = -2;
  let pubInstVar = 4;
  const pubInstConst = -4;

  Object.freeze(Object.create(Point.prototype, {
    toString: {value: toString},
    getX: {value: getX},
    getY: {value: getY},
    pubInstVar: {get: Object.freeze(function() {return pubInstVar;}),
                 enumerable: true},
    pubInstConst: {value: pubInstConst,
                   enumerable: true}
  }))
};

}); Object.freeze(Point.prototype);
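For readers who want to run the final desugaring, here is a sketch in present-day JavaScript. It assumes an immediately invoked arrow function as a stand-in for the proposed "let" expression and drops the "implements" tagging entirely; everything else follows the example above. The variable `p` is illustrative.

```javascript
const Point = Object.freeze(function (privX, privY) {
  // The arrow function stands in for the proposed `let { ... }` expression.
  return (() => {
    // "Hoisted" functions first.
    const toString = Object.freeze(function () {
      return '<' + getX() + ',' + getY() + '>';
    });
    Object.freeze(toString.prototype);
    const getX = Object.freeze(function () { return privX; });
    Object.freeze(getX.prototype);
    const getY = Object.freeze(function () { return privY; });
    Object.freeze(getY.prototype);

    let privInstVar = 2;
    const privInstConst = -2;
    let pubInstVar = 4;
    const pubInstConst = -4;

    return Object.freeze(Object.create(Point.prototype, {
      toString: { value: toString },
      getX: { value: getX },
      getY: { value: getY },
      pubInstVar: {
        get: Object.freeze(function () { return pubInstVar; }),
        enumerable: true
      },
      pubInstConst: { value: pubInstConst, enumerable: true }
    }));
  })();
});
Object.freeze(Point.prototype);

const p = Point(3, 5);
console.log(p.toString());       // <3,5>
console.log(Object.keys(p));     // [ 'pubInstVar', 'pubInstConst' ]
console.log(Object.isFrozen(p)); // true
```

Only the two public variables show up in `Object.keys(p)`, because the method properties are created without `enumerable: true`.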

Actually, I cheated above. Notice the lack of an "enumerable: true" in the properties representing toString, getX, and getY. Rather than consider

"public" Identifier "(" FormalParameterList_opt ")" "{" FunctionBody "}"

equivalent to

"public" "const" Identifier ...

consider it instead to be almost identical but representing a method definition. As a method definition, it makes sense (to me at least) to suppress its enumerability.

By considering this syntactic form to represent a distinct method definition production, we can almost cleanly address another of Waldemar's concerns. Within the FunctionBody of a method production, we can rename all free "this"s to refer to the object being made. For example

object { public getMe() { return this; }}

could desugar to

let { const t1 = object { public getMe() { return t1; }}; t1 }

where t1 is a variable name not otherwise used in the ObjectExpression. This would desugar to

let {
  const t1 = let {
    const getMe = Object.freeze(function getMe() { return t1; });
    Object.freeze(getMe.prototype);
    Object.freeze(Object.create(Object.prototype, {
      getMe: {value: getMe}
    }))
  };
  t1
}
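The effect of renaming "this" to a captured variable can be observed in a runnable approximation, again assuming immediately invoked arrow functions in place of the proposed "let" expressions. The names `obj` and `detached` are illustrative.

```javascript
// Stand-in for: object { public getMe() { return this; } }
const obj = (() => {
  const t1 = (() => {
    // `this` in the method body has been renamed to the captured t1.
    const getMe = Object.freeze(function getMe() { return t1; });
    Object.freeze(getMe.prototype);
    return Object.freeze(Object.create(Object.prototype, {
      getMe: { value: getMe }
    }));
  })();
  return t1;
})();

const detached = obj.getMe;
console.log(obj.getMe() === obj);        // true
console.log(detached() === obj);         // true: the call site's `this` is irrelevant
console.log(obj.getMe.call({}) === obj); // true
```

Because the method closes over `t1` rather than reading `this`, extracting it or calling it with a different receiver cannot redirect it to another object.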

The remaining problem left unaddressed by this proposal is that it creates an unmet need for an analogous private method production, where "this" is analogously renamed.

# Cormac Flanagan (15 years ago)

Mark,

Thanks for clarifying this design.

One important issue to consider is the goal of classes as high-integrity, unforgeable objects. It seems the desugaring of "class Foo" to "object implements Foo" loses this ability ...

class Foo(p1, p2) { ... }

desugars to

const Foo(p1, p2) {    return object implements Foo { ... };  }

since other code can later do:

object implements Foo { ... weird behavior here ... }

Is there some way to strengthen this behavior?

(some other comments are included below.)

- Cormac

which desugars to

const Foo = Object.freeze(function Foo(p1, p2) {    return object implements Foo { ... };  });

Plus:

Object.freeze(Foo.prototype);

right?

As with ConstFunctionDeclaration, if ObjectDeclaration is not accepted

You mean ObjectExpression, IIUC.

# P T Withington (15 years ago)

These two are reversed, aren't they? The const should have the getter
rather than the value?

# Mark S. Miller (15 years ago)

On Tue, May 26, 2009 at 3:08 PM, P T Withington <ptw at pobox.com> wrote:

These two are reversed, aren't they? The const should have the getter rather than the value?

On 2009-03-30, at 22:41EDT, Mark S. Miller wrote:

    pubInstVar: {get: Object.freeze(function() {return pubInstVar;}),
                 enumerable: true},
    pubInstConst: {value: pubInstConst,
                   enumerable: true}

No, it's correct as is. With an initialized const variable, its value can't change, so we can just present the value directly as the value of a frozen data property. A let variable, on the other hand, can continue to change. Thus, in order for the property to continue to present the variable's current value we need a getter.
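The difference is easy to observe once something mutates the variable. In this sketch `makeCounter`, `count`, `label`, `increment`, and `staleCount` are hypothetical names; `staleCount` deliberately exposes the mutable variable as a data property to show what goes wrong.

```javascript
function makeCounter() {
  let count = 0;         // mutable, like a `public let`
  const label = 'ticks'; // immutable, like a `public const`
  const increment = Object.freeze(function () { count += 1; });

  return Object.freeze(Object.create(Object.prototype, {
    increment: { value: increment },
    // Getter: reads the variable each time the property is accessed.
    count: {
      get: Object.freeze(function () { return count; }),
      enumerable: true
    },
    // Data property: safe, because label can never change.
    label: { value: label, enumerable: true },
    // Deliberately wrong: snapshots count once, at creation time.
    staleCount: { value: count, enumerable: true }
  }));
}

const c = makeCounter();
c.increment();
c.increment();
console.log(c.count);      // 2
console.log(c.staleCount); // 0
console.log(c.label);      // ticks
```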

# Mark Miller (15 years ago)

On Tue, May 26, 2009 at 2:49 PM, Cormac Flanagan <cormac at cs.ucsc.edu> wrote:

Mark,

Thanks for clarifying this design.

One important issue to consider is the goal of classes as high-integrity, unforgeable objects. It seems the desugaring of "class Foo" to "object implements Foo" loses this ability ...

class Foo(p1, p2) { ... }

desugars to

const Foo(p1, p2) {    return object implements Foo { ... };  }

since other code can later do:

object implements Foo { ... weird behavior here ... }

Yes, as explained:

On Mon, Mar 30, 2009 at 7:41 PM, Mark S. Miller <erights at google.com> wrote:

The intent of the "implements" clause is that the value
of the object expression be tagged somehow with an unforgeable nominal
type, such that this value is able to pass the corresponding
guard. For now, I will take a shortcut and assume that when a
guard-value is a function, that the dynamic type-like test is approx

 isFrozenProp(guard, 'prototype') && (specimen instanceof guard)

If the function's 'prototype' property is frozen, then instanceof is
at least a monotonic test. However, it is effectively forgeable -- it
guarantees no useful property -- since anyone may create an object
that passes this test but has arbitrarily weird behavior. (Thanks to
Waldemar for emphasizing this point at our last meeting.) In order to
have a high integrity desugaring of ClassDeclaration or
ObjectExpression, we need better lower level support for some kind of
trademarking mechanism. We will need to revisit this issue, but not in
this note.

Is there some way to strengthen this behavior?

AFAICT, not using only the mechanisms found in ES5. Yes, given the aforementioned support for some kind of new primitive trademarking mechanism, as found in Gedanken www.erights.org/history/morris73.pdf or W7 mumble.net/~jar/pubs/secureos. Section 6.3 & Figure 6.6 of erights.org/talks/thesis explain trademarking in E as a step towards E's auditors erights.org/elang/kernel/auditors, wiki.erights.org/w/index.php?title=Guard-based_auditing, which we may want to consider.
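As a loose illustration of what a trademarking primitive buys, the sketch below uses `WeakSet`, which was standardized in ES2015 and was not available in ES5. It is not the mechanism of Gedanken, W7, or E, and `makeBrand`, `stamp`, `passes`, and `isPoint` are hypothetical names. The point is only that the ability to stamp stays private to the class, so an object built on `Point.prototype` by outside code fails the guard.

```javascript
// Hypothetical brand maker: the WeakSet is reachable only through the two
// closures returned here.
function makeBrand() {
  const stamped = new WeakSet();
  return Object.freeze({
    stamp(obj) { stamped.add(obj); return obj; },
    passes(specimen) { return stamped.has(specimen); }
  });
}

const { Point, isPoint } = (() => {
  const brand = makeBrand(); // the stamp never leaves this scope
  const Point = Object.freeze(function (x, y) {
    return brand.stamp(Object.freeze(Object.create(Point.prototype, {
      getX: { value: Object.freeze(function () { return x; }) },
      getY: { value: Object.freeze(function () { return y; }) }
    })));
  });
  Object.freeze(Point.prototype);
  return { Point, isPoint: brand.passes };
})();

const genuine = Point(3, 5);
const forged = Object.create(Point.prototype, {
  getX: { value: function () { return 'weird'; } }
});

console.log(forged instanceof Point); // true: instanceof is still fooled
console.log(isPoint(genuine));        // true
console.log(isPoint(forged));         // false: the forger cannot stamp
```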

which desugars to

const Foo = Object.freeze(function Foo(p1, p2) {    return object implements Foo { ... };  });

Plus:

Object.freeze(Foo.prototype);

right?

Yes, good catch!

As with ConstFunctionDeclaration, if ObjectDeclaration is not accepted

You mean ObjectExpression, IIUC.

Oops. Yes.