statements that could be expressions?
2011/6/1 Peter Michaux <petermichaux at gmail.com>:
Could some of JavaScript's statements also be allowed as expressions?
In Perl there is the common idiom when opening a file
open F, "< $f" or die "Can't open $f : $!";
In JavaScript could "throw" be an expression?
f() || throw 'f failed';
Could JavaScript's "if" become an expression? (I know JavaScript has the ?: operator, but this is just an example.)
Could a Block statement also be an expression like Scheme's "begin"?
The semantics of all of these are specified in terms of expression semantics. Every statement or expression is specified in terms of a triple of (one of (normal, throw, break, continue), value, label). This is how eval can take a Program production and return a value or propagate an exception. So it could be done without changing the semantics of many statements.
With existing code, it can already happen in some cases.
A block of statements that can be translated to expressions can already be turned into an expression using the comma operator.
Conditionals can translate to && and ?: operators.
Throw and return can currently be factored left, so { foo(); if (bar()) baz(); return boo(); } is equivalent to return (foo(), bar() && baz(), boo()); and a number of code minifiers take advantage of this.
Throw statements whose exception is reliably an Error instance can be factored to an expression via library code Error.prototype.raise = function () { throw this; }; or a well known global function that can throw any value.
Loops whose initializers and bodies can be translated to expressions can be translated to expressions using eval(...).
That basically leaves try/catch, break, return, continue, labeled loops or loops with declarations, and declarations themselves as the sticky points.
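The translations listed above can be sketched in today's JavaScript. This is only an illustration: raise, statementForm, expressionForm, and mustSucceed are hypothetical names that do not appear in the thread, and raise stands in for the "well known global function" mentioned above.

```javascript
// Hypothetical helper (not from the thread): a function that throws any
// value, so that a throw can appear in expression position.
function raise(value) {
  throw value;
}

// Statement form of the example.
function statementForm(foo, bar, baz, boo) {
  foo();
  if (bar()) baz();
  return boo();
}

// Expression form: the comma operator sequences the calls and &&
// replaces the if statement.
function expressionForm(foo, bar, baz, boo) {
  return (foo(), bar() && baz(), boo());
}

// Emulates the proposed `f() || throw 'f failed'`.
function mustSucceed(f) {
  return f() || raise(new Error('f failed'));
}
```

Both forms call foo, bar, and boo in the same order and call baz only when bar() is truthy. The helper costs an extra stack frame, which a real throw expression would avoid.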
On Jun 1, 2011, at 5:16 PM, Mike Samuel wrote:
The semantics of all of these are specified in terms of expression semantics. Every statement or expression is specified in terms of a triple of (one of (normal, throw, break, continue), value, label).
Alas, only statements have Completion results. Expressions are modeled as having normal completion with a value, and there's a bit of a gap in the spec where a thrown exception unwinds from within an expression to the statement layer.
For a deferred proposal to allow statements to be turned into expressions explicitly, see
Implicitly treating statements in expression context as expressions could be done but I think it smells bad on account of turning erroneous programs into working programs, and possibly as a WTF teaching moment. These may not be fatal objections, though. TC39 has not really considered making statements into expressions.
See also
(note: in Harmony for ES.next).
On Wed, Jun 1, 2011 at 5:22 PM, Brendan Eich <brendan at mozilla.com> wrote:
Implicitly treating statements in expression context as expressions could be done but I think it smells bad on account of turning erroneous programs into working programs,
Such programs throw syntax errors when interpreted so presumably not many exist where the program is of value.
and possibly as a WTF teaching moment.
I can imagine this would be a mental change but ...
These may not be fatal objections, though. TC39 has not really considered making statements into expressions.
See also
Completion reform is exactly what made me ask about this. If expressions already become values and statements all have completion values, could JavaScript more or less do away with statements and just have expressions?
Peter
On Jun 1, 2011, at 5:22 PM, Brendan Eich wrote:
On Jun 1, 2011, at 5:16 PM, Mike Samuel wrote:
The semantics of all of these are specified in terms of expression semantics. Every statement or expression is specified in terms of a triple of (one of (normal, throw, break, continue), value, label).
Alas, only statements have Completion results. Expressions are modeled as having normal completion with a value, and there's a bit of a gap in the spec where a thrown exception unwinds from within an expression to the statement layer.
We probably could rework the spec. so that expression elements also evaluated to Completions and that consumers of them had to explicitly pull out the value. That would make it possible to have a better spec. for exceptions in expressions. However, I'm not sure it's worth the spec. work.
On 06/01/11 16:07, Peter Michaux wrote:
Could some of JavaScript's statements also be allowed as expressions?
In Perl there is the common idiom when opening a file
open F, "< $f" or die "Can't open $f : $!";
In JavaScript could "throw" be an expression?
f() || throw 'f failed';
Could JavaScript's "if" become an expression? (I know JavaScript has the ?: operator, but this is just an example.)
Could a Block statement also be an expression like Scheme's "begin"?
Peter
Theoretically it's possible to allow statements as expressions, but you run into a number of practical problems:
- Precedence inversion:
a = if (b) c
So far so good. However, the substatement of an if can also be a comma expression, which causes trouble:
a = if (b) c, d
Also, what should be stored in a if b is false? You can argue between false, null, and undefined.
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
- Semicolon insertion:
return if (longcondition) { ... } else { ... }
will not do what you want. The entire if is dead code.
Waldemar
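Two of these problems can be observed in current engines. The sketch below assumes the return example was written with a line break after "return" (the line break appears to have been lost in this archive), and it uses eval only to show how a leading { is parsed. The function and variable names are illustrative.

```javascript
// Semicolon insertion: a line break after `return` ends the statement,
// so the expression on the next line is never evaluated.
function restrictedReturn() {
  return
  42;
}

// Block vs. object initializer: at the start of a statement, `{x: 1}` is
// a block containing the labeled statement `x: 1`.
const asBlock = eval('{x: 1}');

// Inside parentheses only an expression is allowed, so the same text is
// an object initializer.
const asObject = eval('({x: 1})');
```

Here restrictedReturn() yields undefined, asBlock is the number 1, and asObject is an object whose x property is 1.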
2011/6/1 Waldemar Horwat <waldemar at google.com>:
On 06/01/11 16:07, Peter Michaux wrote:
Could some of JavaScript's statements also be allowed as expressions?
In Perl there is the common idiom when opening a file
open F, "< $f" or die "Can't open $f : $!";
In JavaScript could "throw" be an expression?
f() || throw 'f failed';
Could JavaScript's "if" become an expression? (I know JavaScript has the ?: operator, but this is just an example.)
Could a Block statement also be an expression like Scheme's "begin"?
Peter
Theoretically it's possible to allow statements as expressions, but you run into a number of practical problems:
- Precedence inversion:
a = if (b) c
So far so good. However, the substatement of an if can also be a comma expression, which causes trouble:
a = if (b) c, d
Also, what should be stored in a if b is false? You can argue between false, null, and undefined.
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
- Semicolon insertion:
return if (longcondition) { ... } else { ... }
will not do what you want. The entire if is dead code.
If the goal is to allow some subset (such as the following) of statement types to be embedded in expressions: IfStatement, IterationStatement, SwitchStatement, ThrowStatement, ReturnStatement, BreakStatement, ContinueStatement, TryStatement, DebuggerStatement
then I think it can be done without too much syntactic trouble.
Modify PrimaryExpression to add the following productions "(" IfStatement ")" | "(" SwitchStatement ")" | ...
Basically an uncontextually reserved keyword that always starts a statement following an open parenthesis is treated as a statement embedded in an expression. And a semicolon is neither required nor allowed before the closing ")".
This would require changing expression semantics to be specified in the same way as statement semantics. Specifically, labels and exceptions would have to be propagated out of all expression computations early.
Semicolon insertion should continue to work unchanged.
On 06/01/11 18:01, Peter Michaux wrote:
On Wed, Jun 1, 2011 at 5:52 PM, Waldemar Horwat<waldemar at google.com> wrote:
On 06/01/11 16:07, Peter Michaux wrote:
Could some of JavaScript's statements also be allowed as expressions?
[snip]
Theoretically it's possible to allow statements as expressions, but you run into a number of practical problems:
- Precedence inversion:
a = if (b) c
So far so good. However, the substatement of an if can also be a comma expression, which causes trouble:
a = if (b) c, d
Also, what should be stored in a if b is false? You can argue between false, null, and undefined.
Indeed.
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
Ouch.
- Semicolon insertion:
return if (longcondition) { ... } else { ... }
will not do what you want. The entire if is dead code.
This one is already an issue for JavaScript programmers.
--
Ok so perhaps only some statements could be converted to expressions easily without ambiguity. Would a throw expression be possible without much grief, for example? I think it could be useful (since I wanted to do it exactly two days ago hence my post here.)
You'd have to deal with the same precedence inversion as in the if statement example above.
If you make <throw expr> bind the loosest, then there is no inversion, but you can't write things like:
a = b ? c : throw d;
because you made ?: bind tighter than throw.
If you make <throw expr> bind tighter than, say, ?:, then you run into an ambiguity with = or , operators:
a = b ? c : throw d, e;
This can parse as either:
a = b ? c : throw (d, e);
or:
(a = b ? c : throw d), e;
Waldemar
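The two readings are not just a grammar curiosity: they behave differently at run time. A rough sketch, using a hypothetical raise helper in place of the proposed throw expression (readingOne and readingTwo are also made-up names):

```javascript
// Hypothetical stand-in for a throw expression.
function raise(value) {
  throw value;
}

// Reading 1: a = b ? c : throw (d, e)
function readingOne(b, c, d, e) {
  let a;
  return (a = b ? c : raise((d, e)));
}

// Reading 2: (a = b ? c : throw d), e
function readingTwo(b, c, d, e) {
  let a;
  return ((a = b ? c : raise(d)), e);
}
```

When b is truthy, the first reading evaluates to c and the second to e. When b is falsy, the first throws e and the second throws d.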
On 06/01/11 18:04, Mike Samuel wrote:
2011/6/1 Waldemar Horwat<waldemar at google.com>:
On 06/01/11 16:07, Peter Michaux wrote:
Could some of JavaScript's statements also be allowed as expressions?
In Perl there is the common idiom when opening a file
open F, "< $f" or die "Can't open $f : $!";
In JavaScript could "throw" be an expression?
f() || throw 'f failed';
Could JavaScript's "if" become an expression? (I know JavaScript has the ?: operator, but this is just an example.)
Could a Block statement also be an expression like Scheme's "begin"?
Peter
Theoretically it's possible to allow statements as expressions, but you run into a number of practical problems:
- Precedence inversion:
a = if (b) c
So far so good. However, the substatement of an if can also be a comma expression, which causes trouble:
a = if (b) c, d
Also, what should be stored in a if b is false? You can argue between false, null, and undefined.
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
- Semicolon insertion:
return if (longcondition) { ... } else { ... }
will not do what you want. The entire if is dead code.
If the goal is to allow some subset (such as the following) of statement types to be embedded in expressions: IfStatement, IterationStatement, SwitchStatement, ThrowStatement, ReturnStatement, BreakStatement, ContinueStatement, TryStatement, DebuggerStatement
then I think it can be done without too much syntactic trouble.
Modify PrimaryExpression to add the following productions "(" IfStatement ")" | "(" SwitchStatement ")" | ...
Basically an uncontextually reserved keyword that always starts a statement following an open parenthesis is treated as a statement embedded in an expression. And a semicolon is neither required nor allowed before the closing ")".
This would require changing expression semantics to be specified in the same way as statement semantics. Specifically, labels and exceptions would have to be propagated out of all expression computations early.
Semicolon insertion should continue to work unchanged.
Yes, if you make it mandatory to parenthesize statements then this would work, except for the important case of blocks.
Waldemar
Yes, if you make it mandatory to parenthesize statements then this would work, except for the important case of blocks.
Waldemar
This might be a pretty radical (or stupid) thing to ask, but what if a block with labeled statements were semantically the same as an object with expression lambdas, or completion values assigned to its keys? Then perhaps the syntactic conflict wouldn't be a conflict at all, and a break [label] would be a call of the lambda at [parentblockobject][label]
2011/6/1 Waldemar Horwat <waldemar at google.com>:
Yes, if you make it mandatory to parenthesize statements then this would work, except for the important case of blocks.
I agree that blocks are sticky but important.
The approach below is a way to handle blocks. It's not particularly pretty, but I don't think it's syntactically ambiguous.
Add to existing primary expression the following production
"(" (lookahead in [break, continue, do, for, if, return, switch,
throw, try, while]) ExpressionBlock ")"
Define the following Expression productions
ExpressionBlock ::= EmbeddedStatement ( ";" EmbeddedStatement )*
EmbeddedStatement ::= ExpressionStatement
| BreakStatement
| ContinueStatement
| IfStatement
| IterationStatement
| ReturnStatement
| SwitchStatement
| ThrowStatement
| TryStatement
A block can be a series of semicolon separated (not terminated) statements inside parentheses. This introduces no additional hanging else risk.
Note that labelled statement is not in that list so an EcmaScript program can still be treated by editor paren matchers and jump forward/back macros as a tree of nested parenthetical and curly bracket regions. Within a group at the same level of nesting, semicolons define broad divisions, commas narrower divisions, and colons narrower still.
2011/6/1 Mike Samuel <mikesamuel at gmail.com>:
"(" (lookahead in [break, continue, do, for, if, return, switch, throw, try, while]) ExpressionBlock ")"
I'm so busy buggering up grammars that I keep forgetting "debugger" but it should show here and in the EmbeddedStatement production.
On 06/01/11 18:47, Breton Slivka wrote:
Yes, if you make it mandatory to parenthesize statements then this would work, except for the important case of blocks.
Waldemar
This might be a pretty radical (or stupid) thing to ask, but what if a block with labeled statements were semantically the same as an object with expression lambdas, or completion values assigned to its keys? Then perhaps the syntactic conflict wouldn't be a conflict at all, and a break [label] would be a call of the lambda at [parentblockobject][label] .
In your proposal what are the values of the following expressions?
({x: y})
({})
({x})
({x, y})
({x; y})
Waldemar
Did you mean to disallow an expression as the first statement in your "block"?
Waldemar
Yes. That grammar is a subset of the grammar that results from replacing the 11.1.6 ' PrimaryExpression : "(" Expression ")" ' production with
"(" GroupElement GroupElements ")"
where GroupElement is defined as any Statement except for Block and EmptyStatement without a terminal semicolon, and GroupElements is defined thus
GroupElements : empty
| ";" GroupElement GroupElements
but I thought it was easier to reason about ambiguity and locality of spec changes for that subset.
2011/6/2 Waldemar Horwat <waldemar at google.com>:
On 06/02/11 13:43, Mike Samuel wrote:
Yes. That grammar is a subset of the grammar that results from replacing the 11.1.6 ' PrimaryExpression : "(" Expression ")" ' production with
"(" GroupElement GroupElements ")"
where GroupElement is defined as any Statement except for Block and EmptyStatement without a terminal semicolon, and GroupElements is defined thus
GroupElements : empty | ";" GroupElement GroupElements
but I thought it was easier to reason about ambiguity and locality of spec changes for that subset.
2011/6/2 Waldemar Horwat<waldemar at google.com>:
Did you mean to disallow an expression as the first statement in your "block"?
There's no reason to disallow expression statements from the beginning, and it's really odd that you can write
(if (a) b; x = 2; c)
but not:
(x = 2; if (a) b; c)
There are other issues, though, with expression statements that begin with the word "function", but those happen regardless of whether the expression statement is first. For those, are you declaring a named function (in what scope?) or returning a function value?
Waldemar
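Neither parenthesized form parses today. The closest current equivalent is an immediately invoked function, sketched below under the assumption (not settled in this thread) that a group evaluates to the value of its last element. The names groupIfFirst and groupAssignFirst are hypothetical, and b is passed as a callback only so that its evaluation can be observed.

```javascript
// Emulates the proposed (if (a) b; x = 2; c).
function groupIfFirst(a, b, c) {
  let x;
  const value = (() => { if (a) b(); x = 2; return c; })();
  return { value, x };
}

// Emulates the proposed (x = 2; if (a) b; c).
function groupAssignFirst(a, b, c) {
  let x;
  const value = (() => { x = 2; if (a) b(); return c; })();
  return { value, x };
}
```

Unlike the proposal, a return, break, or continue inside the function body cannot affect the enclosing code, which is one reason the emulation is only an approximation.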
2011/6/2 Waldemar Horwat <waldemar at google.com>:
On 06/02/11 13:43, Mike Samuel wrote:
Yes. That grammar is a subset of the grammar that results from replacing the 11.1.6 ' PrimaryExpression : "(" Expression ")" ' production with
"(" GroupElement GroupElements ")"
where GroupElement is defined as any Statement except for Block and EmptyStatement without a terminal semicolon, and GroupElements is defined thus
GroupElements : empty | ";" GroupElement GroupElements
but I thought it was easier to reason about ambiguity and locality of spec changes for that subset.
2011/6/2 Waldemar Horwat<waldemar at google.com>:
Did you mean to disallow an expression as the first statement in your "block"?
There's no reason to disallow expression statements from the beginning, and it's really odd that you can write
(if (a) b; x = 2; c)
but not:
(x = 2; if (a) b; c)
There are other issues, though, with expression statements that begin with the word "function", but those happen regardless of whether the expression statement is first. For those, are you declaring a named function (in what scope?) or returning a function value?
Quite right. I knew I had left out all declarations in my first pass.
2011/6/2 Mike Samuel <mikesamuel at gmail.com>:
2011/6/2 Waldemar Horwat <waldemar at google.com>:
On 06/02/11 13:43, Mike Samuel wrote:
Yes. That grammar is a subset of the grammar that results from replacing the 11.1.6 ' PrimaryExpression : "(" Expression ")" ' production with
"(" GroupElement GroupElements ")"
where GroupElement is defined as any Statement except for Block and EmptyStatement without a terminal semicolon, and GroupElements is defined thus
GroupElements : empty | ";" GroupElement GroupElements
but I thought it was easier to reason about ambiguity and locality of spec changes for that subset.
2011/6/2 Waldemar Horwat<waldemar at google.com>:
Did you mean to disallow an expression as the first statement in your "block"?
There's no reason to disallow expression statements from the beginning, and it's really odd that you can write
(if (a) b; x = 2; c)
but not:
(x = 2; if (a) b; c)
There are other issues, though, with expression statements that begin with the word "function", but those happen regardless of whether the expression statement is first. For those, are you declaring a named function (in what scope?) or returning a function value?
Quite right. I knew I had left out all declarations in my first pass.
Actually, there's no ambiguity. FunctionDeclaration is not part of Statement. It's part of SourceElement.
On Jun 1, 2011, at 5:52 PM, Waldemar Horwat wrote:
On 06/01/11 16:07, Peter Michaux wrote:
Could some of JavaScript's statements also be allowed as expressions?
In Perl there is the common idiom when opening a file
open F, "< $f" or die "Can't open $f : $!";
In JavaScript could "throw" be an expression?
f() || throw 'f failed';
Could JavaScript's "if" become an expression? (I know JavaScript has the ?: operator, but this is just an example.)
Could a Block statement also be an expression like Scheme's "begin"?
Peter
Theoretically it's possible to allow statements as expressions, but you run into a number of practical problems:
- Precedence inversion:
a = if (b) c
So far so good. However, the substatement of an if can also be a comma expression, which causes trouble:
a = if (b) c, d
The problem here, I think, is comma expression. But commas are useful instead of semicolons for sequencing sub-expressions.
In this light, could we avoid precedence inversion and block as expression by allowing many statements as AssignmentExpression-precedence expressions?
AssignmentExpression : ... KeywordStatement
where KeywordStatement is as at strawman:paren_free.
Also, what should be stored in a if b is false? You can argue between false, null, and undefined.
harmony:completion_reform favors undefined. And that is falsy enough for the common cases.
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
strawman:arrow_function_syntax based on feedback from Jorge here on the list mentions a block that cannot start with a label. Two-token lookahead restriction, why not?
- Semicolon insertion:
return if (longcondition) { ... } else { ... }
will not do what you want. The entire if is dead code.
This is a bug with return's restricted expression-returning production anyway. We talked about fixing it with a dead-code error but that is a bit of a higher bar for implementors. The last time we talked about this was 2008, IIRC. Time to revisit?
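The completion values under discussion are already observable through eval, which returns the completion value of the program it evaluates. A small sketch (the variable names are illustrative):

```javascript
// An if statement whose branch runs completes with the branch's value.
const taken = eval('if (true) "c"');

// When the condition is false and there is no else, the statement
// produces no value of its own, and eval reports undefined.
const notTaken = eval('if (false) "c"');

// A block completes with the value of its last value-producing statement.
const blockValue = eval('{ 1; 2; 3 }');
```

Here taken is "c", notTaken is undefined, and blockValue is 3, which matches the preference for undefined stated above.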
On Jun 2, 2011, at 8:12 PM, Brendan Eich wrote:
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
strawman:arrow_function_syntax based on feedback from Jorge here on the list mentions a block that cannot start with a label. Two-token lookahead restriction, why not?
Care is required looking after Identifier to check for : making a label -- lexer must tokenize / as division operator and leave it to be consumed if instead of "{ Identifier :" (which would make an object literal in this strawman) we have "{ Identifier /".
On Fri, Jun 3, 2011 at 6:19 AM, Waldemar Horwat <waldemar at google.com> wrote:
On 06/01/11 18:47, Breton Slivka wrote:
Yes, if you make it mandatory to parenthesize statements then this would work, except for the important case of blocks.
Waldemar
This might be a pretty radical (or stupid) thing to ask, but what if a block with labeled statements were semantically the same as an object with expression lambdas, or completion values assigned to its keys? Then perhaps the syntactic conflict wouldn't be a conflict at all, and a break [label] would be a call of the lambda at [parentblockobject][label] .
In your proposal what are the values of the following expressions?
Well, to be backward compatible, blocks should be evaluated as they are read. This is no different from the values in an object. So, thinking carefully through these:
({x: y})  // an object with a key, x, that contains the lambda (->y)
({})      // the empty object/lambda
({x})     // the lambda (->x)
({x, y})  // the lambda (-> x,y)
({x; y})  // a lambda equivalent to (-> x,y)
In addition:
({|x| xx}) // the lambda x -> xx
This would turn an object into something that is callable like a function, assuming the duties that were normally the specialty of a block. An object with no properties can still contain an expression or list of expressions, which are callable. When an object has properties, as in the first example, normal property access retrieves the "value" of the expression/lambda contained on that key, so property access comes with potentially lazy evaluation. The keyword "break [label]" occurring within the same object would be an explicit flush of the stack followed by a call of the lambda at that key: essentially a tail call.
Though I will concede now that there's likely a lot of gotchas with this idea that I haven't really thought through very clearly, like the nature of a value that is callable like a lambda.
On 06/02/11 20:12, Brendan Eich wrote:
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
strawman:arrow_function_syntax based on feedback from Jorge here on the list mentions a block that cannot start with a label. Two-token lookahead restriction, why not?
Don't you also want to allow (either now or in the future) for the shorthand {x, y} to mean {x:x, y:y}? There are also getters, setters, and various other stuff that can be put into object initializers now or in the future, so there are more possibilities than just an identifier followed by a colon.
Waldemar
On Jun 3, 2011, at 3:11 PM, Waldemar Horwat wrote:
On 06/02/11 20:12, Brendan Eich wrote:
- Conflict between blocks and statements:
a = {x: while (x) { ... break x;}}
is either an object initializer or a block containing a labeled while statement.
strawman:arrow_function_syntax based on feedback from Jorge here on the list mentions a block that cannot start with a label. Two-token lookahead restriction, why not?
Don't you also want to allow (either now or in the future) for the shorthand {x, y} to mean {x:x, y:y}? There are also getters, setters, and various other stuff that can be put into object initializers now or in the future, so there are more possibilities than just an identifier followed by a colon.
True, and {x, y} is noted somewhere (or was in a past version) as a problem. strawman:object_initialiser_shorthand did not get promoted yet, and I forgot to keep this pot boiling.
The accessors are not ambiguous, but the proposed !, ~, and # property prefixes for writable, enumerable, and configurable false-setting do make trouble.
Perhaps Breton's more radical idea of unifying objects and block (lambdas) deserves a look, instead of trying to separate syntaxes that start with { at this late date.
On Jun 3, 2011, at 4:04 PM, Brendan Eich wrote:
On Jun 3, 2011, at 3:11 PM, Waldemar Horwat wrote:
Don't you also want to allow (either now or in the future) for the shorthand {x, y} to mean {x:x, y:y}? There are also getters, setters, and various other stuff that can be put into object initializers now or in the future, so there are more possibilities than just an identifier followed by a colon.
True, and {x, y} is noted somewhere (or was in a past version) as a problem. strawman:object_initialiser_shorthand did not get promoted yet, and I forgot to keep this pot boiling.
The accessors are not ambiguous, but the proposed !, ~, and # property prefixes for writable, enumerable, and configurable false-setting do make trouble.
I updated strawman:arrow_function_syntax to restrict [lookahead(2) ∉ { Identifier ":", "get" Identifier, "set" Identifier }] StatementList.
If I could demonstrate my idea working in Narcissus (or your parser of choice), would that be helpful or useful to anyone?
I was thinking about the ambiguity of {x,y} with relation to the key/value shortcut, and it seems that there's a lot of ambiguities around the { symbol that are causing some problems. My idea basically boils down to embracing the ambiguities as special cases of the same underlying semantic structure.
To put this another way, here is my thought exercise: pretend that we're scientifically observing the state of JavaScript today, and that we operate under the assumption that all structures that begin with { and end with } compile and evaluate into the same type of semantic structure (instead of multiple different kinds: function bodies, objects, blocks with labels, and what have you). What kind of structure is it that we are observing? It would seem that, as the language stands, we get access to a subset of this structure's capabilities. In terms of moving forward, what kind of properties can we extrapolate from what we know about this structure type already? Looking backwards, what sort of properties does this structure have currently that would produce the behaviour we see today?
There needs to be a lot more practical work done on this idea to make it particularly compelling, but as I said, I'm willing to tinker with getting it working in an existing javascript parser/evaluator if anyone on this list thinks it is worth the time to investigate.
The latest Narcissus has a few ES.next strawmen incorporated. Narcissus is very cool and easy to understand (most of it anyway), though it's not auto-generated from any type of BNF grammar. Pegjs does autogenerate from a slightly annotated BNF (see javascript.pegjs under examples on github), though it does not transpile to any target backend like ES5. There may be others out there that would make your life easier, e.g. traceur or ometa. Just FYI.