Block exprs as better object literals (was: Semantics and abstract syntax of lambdas)

# Mark S. Miller (16 years ago)

On Sun, Dec 21, 2008 at 6:22 AM, Dave Herman <dherman at ccs.neu.edu> wrote:

Lex Spoon wrote:

So I would be interested in a simple syntactic form like Lex's suggestion. Imagine for a moment the following idea didn't cause parsing problems (it does, but bear with me). Say we had a sequence-expression form:

{ stmt; ... ; stmt; => expr }

and then add two kinds of block literals, a block-statement form and a sequence-expression form:

^(x1,...,xn) { stmt; ... ; stmt }
^(x1,...,xn) { stmt; ... ; stmt; => expr }

  1. Tail position would only be a property of expressions, not statements [1]. That's simpler.
  2. There's no "accidental return" hazard because you have to explicitly use the => form.
  3. It doesn't break Tennent because the return position is a fixed part of the syntax, not a separate operator whose meaning gets captured by lambda.

It does make lambda-as-control-abstraction a few characters more expensive. More immediately, this notation is obviously wildly conflicting with the object-literal syntax. I probably should leave it to the syntax warriors to figure out how to spell this in unambiguous ASCII. But there's the idea.

To postpone debate about concrete lexical syntax while I make a different point, I will use "reveal" as a keyword meaning approximately what your "=>" means above; and "lambda" for your "^" above. Given

<block> ::= "{" <statement>* "}"
<statement> ::= ...
  | <declaration>
  | <expr>
  | "reveal" "(" <expr> ")"
<expr> ::= ...
  | "lambda" <paramList>? <block>
  | "let" <initList>? <block>   // let(...){...} desugars to (lambda(...){...})()

The notion of "revealed value" would replace the role of completion value from your original lambda proposal. Revealing a value is not a control transfer. It merely stores that value to become the revealed value of that block. A "reveal" in a sub-block does not affect the revealed value of the containing block, and may well be a static error. ("completion value" would, unfortunately, continue to exist for legacy compatibility of uses of "eval", but would have no other role in the language.) Your two examples above become

lambda(x1,...,xn) { stmt; ... ; stmt }           // reveals nothing, i.e., like "reveal (undefined);"
lambda(x1,...,xn) { stmt; ... ; reveal (expr) }  // reveals value of expr

The "reveal" would not need to go last, but a given block could have at most one: "reveal" "(" <expr> ")".
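The "store, don't jump" behaviour of reveal can be approximated in today's JavaScript. The following is only a rough emulation, assuming a hypothetical helper `runBlock` that hands a `reveal` function to the block body; the proposal would presumably enforce the at-most-one rule statically, whereas this sketch checks it at run time.

```javascript
// Hypothetical helper (not part of the proposal): emulates a block whose
// value is whatever it "reveals".
function runBlock(body) {
  let revealed;            // a block that reveals nothing yields undefined
  let used = false;
  const reveal = (value) => {
    // Run-time stand-in for what the proposal would reject statically.
    if (used) throw new Error("a block may reveal at most once");
    used = true;
    revealed = value;      // just a store: no control transfer happens here
  };
  body(reveal);
  return revealed;
}

const log = [];
const result = runBlock((reveal) => {
  reveal(6 * 7);
  log.push("statements after reveal still run");
});
const nothing = runBlock(() => {});
```

Here `result` is 42 even though the reveal is not the last statement, and `nothing` is undefined, mirroring "reveal (undefined)".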

At esdiscuss/2008-November/008185, Peter Michaux makes an interesting proposal that I think can be combined with the proposal above, by having "reveal" also do a job similar to Peter's "public". Block-expressions with "reveal"s could then be used instead of an enhanced object literal for class-like instance declarations. That's why I placed the mandatory "("s in the above use of reveal, so I could make more use of this keyword without ambiguity below. An alternate concrete syntax could use two keywords (perhaps "public"?) or other syntax, of course.

<declaration> ::= "var" <ident> ("=" <expr>)?
  | "const" <ident> (":" <expr>)? "=" <expr>
  | "let" <ident> (":" <expr>)? "=" <expr>
  | "function" <ident> <paramList> <block>
  | "reveal" <declaration>
  | "reveal" <ident> <paramList>? <block>
      // desugars to[1] "reveal" "const" <ident> = "lambda" <paramList> <block>
      // except that the revealed property is non-enumerable

The last two productions introduce the notion of a "revealed declaration". A given block can either

  • reveal nothing (equivalent to "reveal (undefined)"),
  • have exactly one "reveal" "(" <expr> ")",
  • or have any number of revealed declarations

If a block contains revealed declarations, then the block's revealed value is a new non-extensible object whose properties mirror these declared variable names. A revealed "const" is simply a copy of the same value. For "var", "let", and "function", the property is an accessor property without a setter, whose getter gets the named variable. Revealed "function"s and the last (method) production create non-enumerable properties. The others are enumerable. Revealed "var"s create configurable properties. The others are non-configurable. Thus, in the absence of a revealed "var", the revealed object is frozen. Redoing Peter's example[2], we get

  const Point = lambda(privX, privY) {
    let privInstVar = 2;
    const privInstConst = -2;
    reveal toString() { reveal ('<' + getX() + ',' + getY() + '>') };
    reveal getX() { reveal privX };
    reveal getY() { reveal privY };
    reveal let pubInstVar = 4;
    reveal const pubInstConst = -4;
  }

Revealed declarations always desugar to object creation + property definition + revealing the created object. So the above example would desugar to

  const Point = lambda(privX, privY) {
    let privInstVar = 2;
    const privInstConst = -2;
    const toString = lambda() { reveal ('<' + getX() + ',' + getY() + '>') };
    const getX = lambda() { reveal (privX) };
    const getY = lambda() { reveal (privY) };
    let pubInstVar = 4;
    const pubInstConst = -4;
    reveal (Object.preventExtensions(Object.create(Object.prototype, {
      toString: {value: toString},
      getX: {value: getX},
      getY: {value: getY},
      pubInstVar: {get: lambda{reveal (pubInstVar)}, enumerable: true},
      pubInstConst: {value: pubInstConst, enumerable: true}
    })))
  }
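For readers who want to poke at the resulting property attributes, here is a rough runnable approximation of that desugaring. It assumes arrow functions as a stand-in for "lambda" and a plain return as a stand-in for "reveal" (neither has the proposed TCP semantics), drops the unused private variables, and adds a hypothetical `bump` method purely so the getter-backed property can be seen to change.

```javascript
// Approximation only: arrow functions and `return` replace lambda and reveal.
const Point = (privX, privY) => {
  const toString = () => "<" + getX() + "," + getY() + ">";
  const getX = () => privX;
  const getY = () => privY;
  let pubInstVar = 4;
  const pubInstConst = -4;
  // Hypothetical method, not in the original example; it mutates pubInstVar.
  const bump = () => { pubInstVar += 1; };
  return Object.preventExtensions(Object.create(Object.prototype, {
    toString: { value: toString },          // non-enumerable by default
    getX: { value: getX },
    getY: { value: getY },
    bump: { value: bump },
    pubInstVar: { get: () => pubInstVar, enumerable: true },   // getter, no setter
    pubInstConst: { value: pubInstConst, enumerable: true },   // copied value
  }));
};

const p = Point(3, 4);
const keys = Object.keys(p);        // only the enumerable properties
const before = p.pubInstVar;
p.bump();
const after = p.pubInstVar;         // the getter tracks the live variable
const frozen = Object.isFrozen(p);  // no revealed "var", so nothing is configurable
```

Because `Object.create` defaults every attribute to false, the object ends up frozen without an explicit `Object.freeze`, which matches the claim about blocks with no revealed "var".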

Not yet addressed:

  • decent concrete lexical syntax
  • how can we name the "self" object being initialized?
  • how can we provide an alternative to Object.prototype to inherit from?
  • how can we simply express revealing a setter as well?

The last three bullets can obviously be handled manually by avoiding the revealed declaration sugar. But typical class-like use will want to combine the convenience of revealed declarations with at least self-naming and alternate parents.

There may well be a fatal flaw with this proposal before we bikeshed its concrete lexical syntax. Let's try to discuss abstract syntax, semantics, and the last three bullets above before we plummet into the concrete lexical syntax bikeshed. Thanks.

[1] Under the assumption that lambda makes frozen closures. If not, then

  reveal <ident> <paramList>? <block>

should desugar to

  reveal const <ident> = Object.freeze(lambda <paramList> <block>)

[2] I substituted "getX()" and "getY()" for Peter's "x()" and "y()". I couldn't tell whether Peter meant "x" and "y" to be accessor properties or accessor methods. If accessor properties, their uses shouldn't end in "()". If accessor methods, I find the verb-names "getX"/"getY" clearer than "x" and "y".

# Mark S. Miller (16 years ago)

On Sun, Dec 21, 2008 at 1:53 PM, Mark S. Miller <erights at google.com> wrote:

  reveal getX() { reveal privX };
  reveal getY() { reveal privY };

Oops. Should be

  reveal getX() { reveal (privX) };
  reveal getY() { reveal (privY) };

# Yuh-Ruey Chen (16 years ago)

Mark S. Miller wrote:

If a block contains revealed declarations, then the block's revealed value is a new non-extensible object whose properties mirror these declared variable names. A revealed "const" is simply a copy of the same value. For "var", "let", and "function", the property is an accessor property without a setter, whose getter gets the named variable. Revealed "function"s and the last (method) production create non-enumerable properties. The others are enumerable. Revealed "var"s create configurable properties. The others are non-configurable. Thus, in the absence of a revealed "var", the revealed object is frozen. Redoing Peter's example[2], we get

  const Point = lambda(privX, privY) {
    let privInstVar = 2;
    const privInstConst = -2;
    reveal toString() { reveal ('<' + getX() + ',' + getY() + '>') };
    reveal getX() { reveal privX };
    reveal getY() { reveal privY };
    reveal let pubInstVar = 4;
    reveal const pubInstConst = -4;
  }

Revealed declarations always desugar to object creation + property definition + revealing the created object. So the above example would desugar to

  const Point = lambda(privX, privY) {
    let privInstVar = 2;
    const privInstConst = -2;
    const toString = lambda() { reveal ('<' + getX() + ',' + getY() + '>') };
    const getX = lambda() { reveal (privX) };
    const getY = lambda() { reveal (privY) };
    let pubInstVar = 4;
    const pubInstConst = -4;
    reveal (Object.preventExtensions(Object.create(Object.prototype, {
      toString: {value: toString},
      getX: {value: getX},
      getY: {value: getY},
      pubInstVar: {get: lambda{reveal (pubInstVar)}, enumerable: true},
      pubInstConst: {value: pubInstConst, enumerable: true}
    })))
  }

TBH, even after substituting more Java-esque keywords like "public" for "reveal", this just doesn't appeal to my aesthetic sense. The user will naturally wonder why he should do |reveal (privX)| rather than |return privX|. I'm not sure what the benefit there is to trying to completely replace functions with lambdas for class methods.

# Mark S. Miller (16 years ago)

On Mon, Dec 22, 2008 at 1:35 PM, Yuh-Ruey Chen <maian330 at gmail.com> wrote:

TBH, even after substituting more Java-esque keywords like "public" for "reveal", this just doesn't appeal to my aesthetic sense. The user will naturally wonder why he should do |reveal (privX)| rather than |return privX|. I'm not sure what the benefit there is to trying to completely replace functions with lambdas for class methods.

Rather than agree or disagree with this, I'd like to separate issues. The previous message bound together two separable ideas:

  1) The notion of "blocks with revealed values" as a TCP-amenable way to introduce lambdas while dodging Waldemar's leakage hazard.
  2) Enhanced blocks with exported declarations as a better object literal, adapting ideas from Peter Michaux's esdiscuss/2008-November/008185.

Here, I'll make a revised proposal involving only #2. It will make no use of lambda or reveal, and desugar only to ES3.1 and non-controversial elements of ES-H (const and let declarations) + a placeholder "let" expression syntax for turning a block into an expression without violating TCP. Again, I focus on abstract syntax and am sloppy with the details of concrete syntax. In the absence of another "reveal" concept to harmonize with, I will also go back to use of the familiar "public".

Without "lambda", I take no position as to how one concretely turns a block into an expression without violating TCP. I would still like to see a TCP-respecting "lambda" and "let", but I'm trying to separate these issues as much as I can.

a restatement of the (abstracted) status quo

<block> ::= "{" <statement>* "}"
<statement> ::=
  | <declaration>
  | <expr>
  | <otherStatement>   # like if, ...
<declaration> ::= "var" <ident> ("=" <expr>)?
  | "const" <ident> (":" <expr>)? "=" <expr>
  | "let" <ident> (":" <expr>)? "=" <expr>
  | "function" <ident> <paramList> <block>

the new stuff

<expr> ::= ...
  | "object" ("implements" <expr>) <objectBody>
  | "let" "{" <statement>* <expr> "}"   # Evaluates to the value of the last <expr>
<objectBody> ::= "{" (<statement> | <member>)* "}"
<member> ::= "public" <declaration>
  | "public" <ident> (":" <expr>)? "=" <expr>
  | "public" <ident> <paramList> <block>

The sugared form of the previous class-like example, expressed in this language, becomes

  const Point = Object.freeze(function(privX, privY) {
    return object implements Point {
      let privInstVar = 2;
      const privInstConst = -2;
      public toString() { return ('<' + getX() + ',' + getY() + '>'); };
      public getX() { return privX; };
      public getY() { return privY; };
      public let pubInstVar = 4;
      public pubInstConst = -4;
    };
  });

If we wish, we can still introduce a further "class" syntax which desugars to a frozen function returning an object:

<declaration> ::= ... | "class" <ident> <paramList> <objectBody>

So we could write the above as

class Point(privX, privY) { let privInstVar = 2; ... }

Both of these desugar to

  const Point = Object.freeze(function(privX, privY) {
    return let {
      let privInstVar = 2;
      const privInstConst = -2;
      const toString = Object.freeze(function() { return ('<' + getX() + ',' + getY() + '>'); });
      const getX = Object.freeze(function() { return privX; });
      const getY = Object.freeze(function() { return privY; });
      let pubInstVar = 4;
      const pubInstConst = -4;
      Object.freeze(Object.create(Point.prototype, {
        toString: {value: toString},
        getX: {value: getX},
        getY: {value: getY},
        pubInstVar: {get: function{return pubInstVar;}, enumerable: true},
        pubInstConst: {value: pubInstConst, enumerable: true}
      }))
    };
  });

My previous message mentioned 4 issues it left unaddressed:

  a) decent concrete lexical syntax
  b) how can we name the "self" object being initialized?
  c) how can we provide an alternative to Object.prototype to inherit from?
  d) how can we simply express revealing a setter as well?

On #a, in the absence of "lambda", I think the above concrete sugared syntax is good, modulo the need to decide on an actual concrete syntax for turning blocks into TCP-respecting expressions.

On #b, you just say const self = object .. { .. }; The const read barrier has exactly the correct effect: An object can define functions that refer to itself during initialization, but it cannot actually use itself during initialization. Further, two objects defined in the same enclosing scope can refer to each other without any imperative nonsense to set up the cycle.
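The const declarations in current JavaScript behave in a comparable way (a read before initialization throws), so the two effects can be tried out directly. This is only an illustration, assuming frozen object literals as a stand-in for the proposed `object { ... }` form; the names `alice`, `bob`, and `eager` are hypothetical.

```javascript
// Two objects in one scope refer to each other with no imperative patch-up.
const alice = Object.freeze({
  name: "alice",
  partner: () => bob,      // bob is uninitialized here, but is not read yet
});
const bob = Object.freeze({
  name: "bob",
  partner: () => alice,
});

// Actually using the object while it is being initialized hits the barrier.
let barrierError = null;
try {
  const eager = Object.freeze({ copy: eager });   // reads eager too early
} catch (e) {
  barrierError = e;        // a ReferenceError
}
```

Deferring the reference inside a function body is what makes the cycle legal; the eager read fails before any half-built object can escape.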

On #c, I revived the earlier notion of "implements" to show how it can be done in the context of this proposal.

#d remains unresolved. I didn't like anything I came up with.

# Mark Miller (16 years ago)

2009/1/5 Mark S. Miller <erights at google.com>:

<member> ::= [...] | "public" <ident> <paramList> <block>

<declaration> ::= ... | "class" <ident> <paramList> <objectBody>

Both classes and methods should have the pleasant property of functions, that their initialization is hoisted to the beginning of the enclosing block, so there is no dead zone read barrier. This means my desugaring should be

  const Point = Object.freeze(function(privX, privY) {
    return let {
      const toString = Object.freeze(function() { return ('<' + getX() + ',' + getY() + '>'); });
      const getX = Object.freeze(function() { return privX; });
      const getY = Object.freeze(function() { return privY; });
      let privInstVar = 2;
      const privInstConst = -2;
      let pubInstVar = 4;
      const pubInstConst = -4;
      Object.freeze(Object.create(Point.prototype, {
        toString: {value: toString},
        getX: {value: getX},
        getY: {value: getY},
        pubInstVar: {get: function{return pubInstVar;}, enumerable: true},
        pubInstConst: {value: pubInstConst, enumerable: true}
      }))
    };
  });

where that desugaring as a whole is likewise hoisted.
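The difference between hoisted initialization and a dead-zone read barrier can be observed with ordinary function and const declarations in current JavaScript. The sketch below is an analogy only; `demo`, `greet`, and `greeting` are hypothetical names, and it says nothing about how the proposed "public" methods would actually be implemented.

```javascript
function demo() {
  // A function declaration is initialized at the top of its enclosing body,
  // so it is already usable on the first line.
  const hoisted = typeof greet;

  let deadZone = null;
  try {
    greeting;                // const is in its dead zone until declared below
  } catch (e) {
    deadZone = e;            // a ReferenceError
  }

  const greeting = "hello";
  function greet() { return greeting; }

  return { hoisted, deadZone, value: greet() };
}

const outcome = demo();
```

Note that hoisting only makes the function itself available early; a hoisted method that reads a const still has to wait until that const is initialized before it is called.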

Btw, if we leave out the "object" production and desugar the "class" production directly, then we also avoid the need to solve the blocks-as-expression problem, and the desugaring above simplifies to

  const Point = Object.freeze(function(privX, privY) {
    const toString = Object.freeze(function() { return ('<' + getX() + ',' + getY() + '>'); });
    const getX = Object.freeze(function() { return privX; });
    const getY = Object.freeze(function() { return privY; });
    let privInstVar = 2;
    const privInstConst = -2;
    let pubInstVar = 4;
    const pubInstConst = -4;
    return Object.freeze(Object.create(Point.prototype, {
      toString: {value: toString},
      getX: {value: getX},
      getY: {value: getY},
      pubInstVar: {get: function{return pubInstVar;}, enumerable: true},
      pubInstConst: {value: pubInstConst, enumerable: true}
    }));
  });

However, this leaves no way to enable the object to refer to itself without inventing further syntax.
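Since this simplified form uses only functions, it can be run more or less as-is. The trimmed sketch below keeps just the methods, to show that an instance created without `new` still inherits from `Point.prototype`; the variable names `pt`, `isPoint`, and `text` are illustrative, and the instance variables are omitted.

```javascript
const Point = Object.freeze(function (privX, privY) {
  const toString = Object.freeze(function () {
    return "<" + getX() + "," + getY() + ">";
  });
  const getX = Object.freeze(function () { return privX; });
  const getY = Object.freeze(function () { return privY; });
  // Point is referenced inside its own body, but only once it is called,
  // by which time the outer const has been initialized.
  return Object.freeze(Object.create(Point.prototype, {
    toString: { value: toString },
    getX: { value: getX },
    getY: { value: getY },
  }));
});

const pt = Point(3, 4);              // called as a plain function, no `new`
const isPoint = pt instanceof Point; // inherits from Point.prototype
const text = String(pt);
```

Inside the methods there is still no name for the instance itself, which is the limitation noted above.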

# Mark S. Miller (16 years ago)

On Mon, Jan 5, 2009 at 4:50 AM, Mark Miller <erights at gmail.com> wrote:

Sigh. All occurrences of

 pubInstVar: {get: function{return pubInstVar;},
              enumerable: true},

in my previous messages should be

  pubInstVar: {get: Object.freeze(function{return pubInstVar;}),
               enumerable: true},