Unifying Block and ObjectLiteral (was: Re: block-lambda revival)
Ooo that's exciting! And so, if I'm not being too presumptuous, does this mean that constructions like if, while, etc. become prefix operators that can invoke a block lambda? I've been flat out and haven't been able to look at this, so it's exciting to see some progress on it. Just in case: I hereby relinquish all copyright, trademark and patent rights I may possibly hold, to the idea of "unifying the object literal and block grammar constructions" to the TC39 group and its constituent members, so help me god. -Breton Slivka
On Jun 28, 2011, at 4:35 PM, Breton Slivka wrote:
Ooo that's exciting! And so, if I'm not being too presumptuous, does this mean that constructions like if, while, etc. become prefix operators that can invoke a block lambda?
There's no point redoing the built-in statements that way, and we cannot handle the "else" keyword or "while" in do-while loops without yet more magic. I don't think it's worth it, yet.
I've been flat out and haven't been able to look at this, so it's exciting to see some progress on it. Just in case: I hereby relinquish all copyright, trademark and patent rights I may possibly hold, to the idea of "unifying the object literal and block grammar constructions" to the TC39 group and its constituent members, so help me god.
No worries. And note that I did not unify block with object literal so much as disambiguate them. But I think there's an LR(1) ambiguity (fixable) still. More in a bit.
On Jun 28, 2011, at 3:10 PM, Brendan Eich wrote:
Block:
    { UnlabeledStatementFirstList }
    { WellLabeledStatement StatementList? }

UnlabeledStatementFirstList:
    UnlabeledStatement
    UnlabeledStatementFirstList Statement

Statement:
    UnlabeledStatement
    LabeledStatement

UnlabeledStatement:
    Block
    VariableStatement
    EmptyStatement
    ExpressionStatement
    ContinueStatement
    ReturnStatement
    LabelUsingStatement
    DebuggerStatement

LabelUsingStatement:
    IfStatement
    IterationStatement
    BreakStatement
    WithStatement
    SwitchStatement
    ThrowStatement
    TryStatement

LabeledStatement:
    Identifier : Statement

WellLabeledStatement:
    Identifier : LabelUsingStatement
    Identifier : WellLabeledStatement
...
If I have this right, then we could add a new production:
PrimaryExpression: Block
This creates an LR(1) ambiguity, a shift-reduce conflict between shifting : in
    WellLabeledStatement : Identifier.: LabelUsingStatement
    WellLabeledStatement : Identifier.: WellLabeledStatement
and reducing on : lookahead via
PropertyName : Identifier.
under PropertyAssignment under ObjectLiteral.
(The dots in the productions are yacc/bison-style cursors showing where the parse is.)
One easy fix is to "de-minimize" the PropertyAssignment grammar:
PropertyAssignment:
    PropertyName : AssignmentExpression
    get PropertyName ( ) { FunctionBody }
    set PropertyName ( PropertySetParameterList ) { FunctionBody }

PropertyName:
    IdentifierName
    StringLiteral
    NumericLiteral
to look like this:
PropertyAssignment:
    IdentifierName : AssignmentExpression
    StringLiteral : AssignmentExpression
    NumericLiteral : AssignmentExpression
    get PropertyName ( ) { FunctionBody }
    set PropertyName ( PropertySetParameterList ) { FunctionBody }

PropertyName:
    IdentifierName
    StringLiteral
    NumericLiteral
This works because the lookahead after : in this revised PropertyAssignment and in WellLabeledStatement is disjoint except for { and that loops back through the same paths, with the tie broken by a disjoint lookahead token, eventually.
Is this grammar refactoring worth the trouble? Making blocks be expressions could be well worth it if we accept block-lambdas. Just for statements as expressions, I'm not sure. Comments welcome.
On Jun 30, 2011, at 3:47 PM, Brendan Eich wrote:
Statement:
    UnlabeledStatement
    LabeledStatement

UnlabeledStatement:
    Block
    VariableStatement
    EmptyStatement
    ExpressionStatement
    ContinueStatement
    ReturnStatement
    LabelUsingStatement
    DebuggerStatement

LabelUsingStatement:
    IfStatement
    IterationStatement
    BreakStatement
    WithStatement
    SwitchStatement
    ThrowStatement
    TryStatement
I forgot to move Block from UnlabeledStatement to LabelUsingStatement. You can label a block. It's something people do now for break to label. And with the revisions below, it does not create any ambiguity.
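As an illustration of the "break to label" usage mentioned above, here is a minimal sketch in present-day JavaScript. The function name `firstNegative` and the label `search` are made up for this example and do not come from the thread.

```javascript
// Hypothetical helper: a label on a Block lets `break` leave the block early.
function firstNegative(values) {
  let found = null;
  search: {
    for (const v of values) {
      if (v < 0) {
        found = v;
        break search; // continues just after the labeled block
      }
    }
    // Reached only when the loop finishes without breaking out.
    found = "none";
  }
  return found;
}

console.log(firstNegative([3, -1, 4]));
console.log(firstNegative([1, 2]));
```

This is the kind of existing code that moving Block into LabelUsingStatement keeps valid.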
2011/6/28 Brendan Eich <brendan at mozilla.com>:
On Jun 23, 2011, at 3:27 PM, Brendan Eich wrote:
Block:
    { UnlabeledStatementFirstList }
    { WellLabeledStatement StatementList? }
If IfStatement is still defined
IfStatement :
    if ( Expression ) Statement else Statement
    if ( Expression ) Statement
then does this definition of Statement and Block change the program
if (foo) { bar(); } else {}
to be equivalent to
if (foo) { bar(); } else ({})
?
This is arguably a semantic change, but the change cannot be detectable by anything else the program could do.
On Jun 30, 2011, at 6:32 PM, Mike Samuel wrote:
2011/6/28 Brendan Eich <brendan at mozilla.com>:
On Jun 23, 2011, at 3:27 PM, Brendan Eich wrote:
Block:
    { UnlabeledStatementFirstList }
    { WellLabeledStatement StatementList? }
If IfStatement is still defined
IfStatement :
    if ( Expression ) Statement else Statement
    if ( Expression ) Statement
then does this definition of Statement and Block change the program
if (foo) { bar(); } else {}
to be equivalent to
if (foo) { bar(); } else ({})
?
No, remember I wrote "We'd still need the [lookahead ∉ {{, function}] restriction in ExpressionStatement."
This is arguably a semantic change, but the change cannot be detectable by anything else the program could do.
That's not true if the {} in statement context were an expression whose result was discarded. In such an event (without the lookahead restriction and with some other restriction to disambiguate away from block-as-statement to block-as-expression), the completion value would differ from that in ECMA-262 as it is, and eval or an embedding (javascript: URLs, e.g.) could tell.
But that empty block else clause is not a block-as-expression: it's a block-as-statement, so same semantics as in ECMA-262 as-is.
2011/6/30 Brendan Eich <brendan at mozilla.com>:
On Jun 30, 2011, at 6:32 PM, Mike Samuel wrote:
2011/6/28 Brendan Eich <brendan at mozilla.com>: No, remember I wrote "We'd still need the [lookahead ∉ {{, function}] restriction in ExpressionStatement."
Ok, then by this redefinition of Block, if (foo()) {} is not a valid program because it is not possible for a Block to not contain any statements. Neither UnlabeledStatementFirstList nor WellLabeledStatement match the empty string.
That's not true if the {} in statement context were an expression whose result was discarded. In such an event (without the lookahead restriction and with some other restriction to disambiguate away from block-as-statement to block-as-expression), the completion value would differ from that in ECMA-262 as it is, and eval or an embedding (javascript: URLs, e.g.) could tell. But that empty block else clause is not a block-as-expression: it's a block-as-statement, so same semantics as in ECMA-262 as-is.
Ah, I completely forgot the completion value. Yes. typeof eval("if (true) {}") === "object" would distinguish.
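The completion-value point can be checked in current JavaScript. The sketch below is not taken from the thread; the variable names are illustrative. It compares the value `eval` reports for an empty block in statement context with the value reported when the same braces are forced into expression context by parentheses.

```javascript
// Empty block as a statement: the if statement yields no object value.
const asStatement = eval("if (true) {}");

// The same braces in parentheses are an object literal in an expression
// statement, so the completion value is that object.
const asExpression = eval("if (true) ({})");

console.log(typeof asStatement);
console.log(typeof asExpression);
```

A grammar change that silently turned the first form into the second would therefore be observable through `eval`, which is why the lookahead restriction matters.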
On Jun 30, 2011, at 8:56 PM, Mike Samuel wrote:
2011/6/30 Brendan Eich <brendan at mozilla.com>:
On Jun 30, 2011, at 6:32 PM, Mike Samuel wrote:
2011/6/28 Brendan Eich <brendan at mozilla.com>: No, remember I wrote "We'd still need the [lookahead ∉ {{, function}] restriction in ExpressionStatement."
Ok, then by this redefinition of Block, if (foo()) {} is not a valid program because it is not possible for a Block to not contain any statements. Neither UnlabeledStatementFirstList nor WellLabeledStatement match the empty string.
Oops -- thanks. I will fix in a strawman that captures all of this.
On Jun 30, 2011, at 9:46 PM, Brendan Eich wrote:
Oops -- thanks. I will fix in a strawman that captures all of this.
Done:
What is
({ get x() { return 42; } })
?
Could it match both as an object literal with a getter
({ get x() { return 42; } })
or as a block with 3 statements?
({ get; x(); { return 42; } })
2011/7/1 Brendan Eich <brendan at mozilla.com>:
On Jul 1, 2011, at 10:51 AM, Mike Samuel wrote:
What is
({ get x() { return 42; } })
?
Could it match both as an object literal with a getter
({ get x() { return 42; } })
Only that.
or as a block with 3 statements?
({ get; x(); { return 42; } })
No, never. You're forgetting the first rule of ASI fight-club: if there's no error there is no semicolon insertion.
Fair enough.
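The semicolon-insertion rule can be observed in current JavaScript with a small sketch. The helper `parses` is hypothetical and only reports whether a source string is accepted as a function body. It does not implement the proposed grammar; it only shows that semicolons are inserted where the parse would otherwise fail at a line break, never in the middle of a line.

```javascript
// In expression context the braces parse as an object literal with a getter.
const obj = ({ get x() { return 42; } });
console.log(obj.x);

// Hypothetical helper: does `source` parse as a function body?
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Statement context, all on one line: there is no line terminator before `x`,
// so no semicolon can be inserted and the parse fails.
const oneLine = parses("{ get x() { return 42; } }");

// Statement context with line breaks: semicolons are inserted at the breaks,
// giving a block with three statements.
const withBreaks = parses("{ get\n x()\n { return 42; } }");

console.log(oneLine, withBreaks);
```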
So, a next step is to look at this in combination with other Harmony proposals. In particular, the object literal extensions.
There is one pretty obvious ambiguity introduced by the methods property shorthand.
Keep in mind that property names are permitted to be keywords. So
let ambig1 = {function () {}}; // object with method named 'function', or 0-argument lambda that returns a function object?
let ambig2 = {if (x) {5}}; // object with method named 'if', or 0-argument lambda that returns either 5 or undefined depending on the value of free variable x?
I don't think we want to lose the conciseness of method property shorthand, but I think we can resolve this one by restricting the property name to being an Identifier rather than an IdentifierName in the method property shorthand. Presumably if you really wanted to have a method property named function or if (or any other keyword) you could use a StringLiteral property name in the object literal:
let obj1 = {"function" () {}}; // object with method named 'function'
let obj2 = {"if" (x) {5}}; // object with method named 'if'
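For comparison, method shorthand with string-literal names is accepted by ECMAScript as later standardized (ES2015). The sketch below uses that standardized syntax, not the Harmony strawman under discussion, so method bodies need an explicit `return` rather than the implicit completion value shown in `{5}`. The return values are illustrative.

```javascript
// Methods whose names are keywords, written as string-literal property names.
const obj1 = { "function"() { return "called function"; } };
const obj2 = { "if"(x) { return x ? 5 : undefined; } };

console.log(obj1["function"]());
console.log(obj2["if"](true));
console.log(obj2["if"](false));
```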
However, are we going to have lookup issues with StringLiteral and NumericLiteral method property name? "if"(x) or 123() are syntactically valid call expressions so we have to see the { after the ) before we know it is an object literal and not a block lambda.
Property value short hand is also a problem:
let x=2;
let ambig3 = {x}; // is this an object literal equivalent to {x:x}, or a 0-argument block lambda that returns the value of x?
I don't see an easy way to resolve this one other than dropping property value shorthand. I'd be willing to sacrifice them in order to get block lambdas.
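The two candidate readings of `{x}` can be written out side by side in ECMAScript as later standardized (ES2015), where `{x}` in expression position is the object literal shorthand. The arrow function below is only a stand-in for the zero-argument lambda reading; it is an assumption made for illustration, not the block-lambda syntax from the strawman.

```javascript
const x = 2;

// Object literal reading: shorthand for { x: x }.
const asObject = { x };

// Lambda reading, approximated with an arrow function (stand-in only).
const asLambda = () => x;

console.log(asObject.x);
console.log(asLambda());
```

The same three characters cannot mean both without some extra token, which is the ambiguity described above.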
I don't think the ! and ~ property attribute prefixes cause any ambiguity issues:
let obj3 = {~x: 5};
let obj4 = {!z(a,b,c,d) {}};
however, it does further increase the lookahead necessary to decide between an initial method property or an initial expression statement. This might be enough to reexamine the use of these attribute prefixes. I still think we need attribute control functionality in object literals, but I think I would be willing to give up on them for the method property shorthand if that would help. Essentially say that you have to use an old-fashioned property : property assignment if you want to specify non-default method attributes:
let obj4={!z: function(a,b,c,d) {}};
On Jul 6, 2011, at 3:19 PM, Allen Wirfs-Brock wrote:
So, a next step is to look at this in combination with other Harmony proposals. In particular, the object literal extensions.
Yes, I've been dividing and conquering ;-). Thanks for the followup.
There is one pretty obvious ambiguity introduced by the methods property shorthand.
Keep in mind that property names are permitted to be keywords. So
let ambig1 = {function () {}}; //object with method named 'function' or 0 argument lambda that returns a function object?
Never the latter, anonymous function expressions are not produced by Statement.
let ambig2 = {if (x) {5}}; // object with method named 'if', or 0-argument lambda that returns either 5 or undefined depending on the value of free variable x?
Quite.
I don't think we want to lose the conciseness of method property shorthand, but I think we can resolve this one by restricting the property name to being an Identifier rather than an IdentifierName in the method property shorthand. Presumably if you really wanted to have a method property named function or if (or any other keyword) you could use a StringLiteral property name in the object literal:
let obj1 = {"function" () {}}; // object with method named 'function'
let obj2 = {"if" (x) {5}}; // object with method named 'if'
However, are we going to have lookup issues with StringLiteral and NumericLiteral method property name? "if"(x) or 123() are syntactically valid call expressions so we have to see the { after the ) before we know it is an object literal and not a block lambda.
Lookahead issues, yes.
Property value short hand is also a problem:
let x=2;
let ambig3 = {x}; // is this an object literal equivalent to {x:x}, or a 0-argument block lambda that returns the value of x?
That one is noted already in arrow function syntax.
I don't see an easy way to resolve this one other than dropping property value shorthand. I'd be willing to sacrifice them in order to get block lambdas.
Block lambdas as proposed don't have any ambiguities because they require empty || parameter lists for the zero-parameter case.
I don't think the ! and ~ property attribute prefixes cause any ambiguity issues:
let obj3 = {~x: 5};
let obj4 = {!z(a,b,c,d) {}};
however, it does further increase the lookahead necessary to decide between an initial method property or an initial expression statement. This might be enough to reexamine the use of these attribute prefixes.
The lookahead requirement is a problem for LR(1) validation, which I'm skeptical we can or should step away from.
I still think we need attribute control functionality in object literals but I think I would be willing to give up on them for the method property shorthand if that would help. Essentially say that you have to use an old-fashioned property : property assignment if you want to specify non-default method attributes:
let obj4={!z: function(a,b,c,d) {}};
Could the attribute punctuators go after the property name instead of before?
On Jun 23, 2011, at 3:27 PM, Brendan Eich wrote:
Apologies for not crediting Breton Slivka, who suggested this approach here:
esdiscuss/2011-June/014933
Here's an attempt to formalize the unification grammatically.
Block:
    { UnlabeledStatementFirstList }
    { WellLabeledStatement StatementList? }

UnlabeledStatementFirstList:
    UnlabeledStatement
    UnlabeledStatementFirstList Statement

Statement:
    UnlabeledStatement
    LabeledStatement

UnlabeledStatement:
    Block
    VariableStatement
    EmptyStatement
    ExpressionStatement
    ContinueStatement
    ReturnStatement
    LabelUsingStatement
    DebuggerStatement

LabelUsingStatement:
    IfStatement
    IterationStatement
    BreakStatement
    WithStatement
    SwitchStatement
    ThrowStatement
    TryStatement

LabeledStatement:
    Identifier : Statement

WellLabeledStatement:
    Identifier : LabelUsingStatement
    Identifier : WellLabeledStatement
(I'm using the American spelling of "labeled" and "unlabeled". Can't help myself!)
Notice the right recursion in WellLabeledStatement's rule.
The idea is to allow
and other such "not-well-labeled statements" for backward compatibility, but only at top level in SourceElement context. Not after a { that starts a Block.
After a { that starts a Block, you can have a label only if it is followed by a statement that could possibly use that label (labels may nest in such a WellLabeledStatement).
Any expression after a label that follows a { therefore must be the value part of a PropertyNameAndValueList in an ObjectLiteral.
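The rule in the last few paragraphs can be sketched as a small classifier over pre-split tokens. This is a rough illustration under stated assumptions, not a parser for the proposed grammar: the function names, the token-array input, and the simplified identifier test are all hypothetical, and only the `{ Identifier : ...` case is handled.

```javascript
// Keywords that begin a LabelUsingStatement in the grammar above.
const LABEL_USING = new Set([
  "if", "for", "while", "do", "break", "with", "switch", "throw", "try",
]);

// Simplified identifier test (assumption: ASCII only, and the only reserved
// words excluded are the label-using keywords).
function isIdentifier(token) {
  return typeof token === "string" &&
    /^[A-Za-z_$][\w$]*$/.test(token) &&
    !LABEL_USING.has(token);
}

// Hypothetical classifier: `tokens` are the tokens that follow an opening "{"
// and must start with `Identifier :`.
function classifyAfterBrace(tokens) {
  let i = 0;
  // Skip each `Identifier :` prefix; labels may nest in a WellLabeledStatement.
  while (isIdentifier(tokens[i]) && tokens[i + 1] === ":") {
    i += 2;
  }
  if (i === 0) {
    throw new Error("expected the tokens to start with `Identifier :`");
  }
  // A label-using statement after the label(s) means this is a Block.
  if (LABEL_USING.has(tokens[i])) return "block";
  // A single `name :` followed by anything else is a property value.
  return i === 2 ? "object-literal" : "neither";
}

console.log(classifyAfterBrace(["a", ":", "while", "(", "x", ")", ";", "}"]));
console.log(classifyAfterBrace(["a", ":", "1", "}"]));
```

The sketch decides on the first token after the label prefix, which mirrors the one-token lookahead the grammar is aiming for.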
This is a mostly-compatible change. Again props to crock for suggesting restrictions to label usage as a spark that kindled this fire.
If I have this right, then we could add a new production:
PrimaryExpression: Block
to allow blocks as expressions. We'd still need the [lookahead ∉ {{, function}] restriction in ExpressionStatement.
Making blocks be expressions allows us to treat them as zero-parameter block-lambdas: ({statements}) instead of the ugly ({|| statements}). The semantics would be the same as with a block-lambda: evaluation of the Block is deferred until it is called, typeof says "function", reformed completion value is implicit return value, etc. See:
strawman:block_lambda_revival
(I haven't unified the above with the block lambda revival grammar yet; one step at a time.)
Grammar nerds, please validate!