Proposal: paren-free function calls and definitions

# Claus Reinke (14 years ago)

Dear all,

following the recent suggestion here that GitHub might be a more suitable forum for kick-starting proposals than this list, I've put together a proposal draft on simple paren-free function calls and definitions:

clausreinke/jstr/tree/master/es-discuss

You can find there the proposal text, with motivation, grammar changes, semantics (entirely via desugaring to existing syntax), and related issues, as well as a prototype implementation, using jstr's nascent grammar-based ES-to-ES rewriting:

tailnests.txt: proposal text

tailnests-examples.js: some small illustrative examples

tailnests.html: if you clone the jstr repo, this is a simplistic
    interface to a prototype implementation (one textarea
    for source input, one textarea for desugared output)

../es5.js: near ES5 grammar, with proposed extensions
    (look for rules guarded with LANGUAGE.tailnests)

The proposal concerns the old topic of removing some syntactic obstacles from JavaScript's functional core.

Over the years, this list has seen numerous more or less ambitious proposals and discussions on this topic, and every single one of them seems to have got stuck at some point.

Therefore, this proposal deliberately focuses on the less controversial aspects of the problem. The plan is to work out a minimal solution that helps, for instance, with nested callback chains and with monadic programming, without raising serious concerns about semantic changes. In other words, the goal is to make progress, however minimal.

The base proposal focuses on definitions of simple expression-returning functions and calls with primitive or function expression arguments. Paren-free left-associative function application is done as a simple generalization of existing grammar, while paren-free right-associative function application needs an infix operator. All grammar changes are localized, new rules are controlled by small lookahead, and ASI changes are flagged as parser warnings.
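As a rough illustration of what "expression-returning function definitions" could mean, the sketch below pairs the proposed form (shown only in comments, since it is not valid JavaScript today) with an ES5 equivalent. The desugaring `function f(x) => e` to `function f(x) { return e; }` is an assumption based on the parses discussed later in this thread, not a quotation from the proposal text, and the names `inc` and `doubled` are hypothetical.

```javascript
// Proposed form (assumed, not valid in current JavaScript):
//   function inc(x) => x + 1
// Assumed desugaring to existing syntax:
function inc(x) { return x + 1; }

// Proposed form with a function expression argument (assumed):
//   [1, 2, 3].map(function (x) => x * 2)
// Assumed desugaring:
var doubled = [1, 2, 3].map(function (x) { return x * 2; });

console.log(inc(1));   // 2
console.log(doubled);  // [ 2, 4, 6 ]
```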

If you find the base proposal suitable, the additional notes make two suggestions for removing the remaining syntactic issues in future language versions. Separating out the syntactic issues and leaving the controversial semantic issues for other proposals, should help to make progress on the former - if it also gets the latter unstuck, all the better.

Looking forward to your feedback, Claus

PS. I assume that proposal discussion is still going to take place here on this list.

# Brendan Eich (14 years ago)

On Jun 19, 2011, at 3:17 PM, Claus Reinke wrote:

clausreinke/jstr/tree/master/es-discuss

You can find there the proposal text, with motivation, grammar changes, semantics (entirely via desugaring to existing syntax), and related issues, as well as a prototype implementation, using jstr's nascent grammar-based ES-to-ES rewriting:

tailnests.txt: proposal text

Quick reply to point out that:

FunctionExpression :
    function Identifier_opt ( FormalParameterList_opt ) { FunctionBody }
    function Identifier_opt ( FormalParameterList_opt ) => AssignmentExpression

inverts precedence, as MemberExpression : FunctionExpression but now FunctionExpression can end with a low-precedence expression (AssignmentExpression).

For example, the last line in

function f(x) => x;
z = a + function (b) => b ? f : x++(1);

parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).

# Brendan Eich (14 years ago)

On Jun 19, 2011, at 4:04 PM, Brendan Eich wrote:

On Jun 19, 2011, at 3:17 PM, Claus Reinke wrote:

clausreinke/jstr/tree/master/es-discuss

You can find there the proposal text, with motivation, grammar changes, semantics (entirely via desugaring to existing syntax), and related issues, as well as a prototype implementation, using jstr's nascent grammar-based ES-to-ES rewriting:

tailnests.txt: proposal text

Quick reply to point out that:

FunctionExpression :
    function Identifier_opt ( FormalParameterList_opt ) { FunctionBody }
    function Identifier_opt ( FormalParameterList_opt ) => AssignmentExpression

inverts precedence, as MemberExpression : FunctionExpression but now FunctionExpression can end with a low-precedence expression (AssignmentExpression).

For example, the last line in

function f(x) => x;
z = a + function (b) => b ? f : x++(1);

parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).

This is not just awkward, of course -- the grammar you propose is ambiguous. The precedence inversion means there are two ways to parse

z = a + function (b) => b ? c : d;

Either as

z = (a + (function (b) => b)) ? c : d;

or as

z = a + (function (b) => (b ? c : d));

And we've been around this block before. Ambiguous grammars are future-hostile.
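The two readings are not just different trees; they compute different values. The sketch below rewrites each reading with ordinary ES5 function expressions, assuming the expression body `=> e` corresponds to `{ return e; }`. The concrete values of `a`, `c`, and `d` are made up for illustration.

```javascript
var a = 1, c = "c", d = "d";

// Reading 1: z = (a + (function (b) => b)) ? c : d;
// The function is an operand of +, and the conditional sits outside it.
var z1 = (a + (function (b) { return b; })) ? c : d;

// Reading 2: z = a + (function (b) => (b ? c : d));
// The conditional is swallowed into the function body.
var z2 = a + (function (b) { return b ? c : d; });

console.log(z1);  // "c", because 1 + <function> is a non-empty string, which is truthy
console.log(z2);  // "1" followed by the source text of the function
```

A grammar that admits both derivations for the same token sequence leaves the choice to something outside the grammar, which is the objection raised here.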

# Claus Reinke (14 years ago)

This is not just awkward, of course -- the grammar you propose is ambiguous.

No ambiguity intended - function bodies extend as far to the right as possible (see Note 1).

The precedence inversion means there are two ways to parse

z = a + function (b) => b ? c : d;
..
z = a + (function (b) => (b ? c : d));

This is the intended parse, and also the one used by the prototype (you can change 'opts ="pu"' to 'opts = "pua"' to get the ast). To get the other parse, one would need explicit parens.

The idea is to get the longest parse paren-free, with parens as a means of finer control for limiting function bodies. That does imply that things to the right of a function tend to get swallowed up in the function body.

As for your other example,

function f(x) => x;
z = a + function (b) => b ? f : x++(1);

parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).

I'm not following yet - what parse would you like to see instead of this one? If you wanted the function body to end earlier, you can get that by adding parens, right?

Claus

# Brendan Eich (14 years ago)

On Jun 20, 2011, at 1:49 AM, Claus Reinke wrote:

This is not just awkward, of course -- the grammar you propose is ambiguous.

No ambiguity intended - function bodies extend as far to the right as possible (see Note 1).

It doesn't matter what you intend. There are multiple ways to parse, at many precedence levels, due to the precedence inversion. The grammar is ambiguous.

ECMA-262 needs an unambiguous formal grammar in which there is always only one way to parse. That grammar should be validated mechanically -- no Notes apply, just LR(1).

Otherwise as Waldemar has pointed out on this list several times, you can get into trouble with "negative match" rules as the language evolves: you add something that disambiguates differently and changes the meaning of existing code -- which means you cannot add that something. It can be very hard to see the effects of a change; the problem is not local to a production or set of productions.

This is the future-hostility problem of ambiguous grammars.

The precedence inversion means there are two ways to parse

z = a + function (b) => b ? c : d;
..
z = a + (function (b) => (b ? c : d));

This is the intended parse, and also the one used by the prototype (you can change 'opts ="pu"' to 'opts = "pua"' to get the ast). To get the other parse, one would need explicit parens.

Yes, I know. BTW, SpiderMonkey and Rhino (I believe) implement "expression closures", the same thing you're proposing but without the => (it's not necessary). They have the same ambiguity problem.

function f(x) => x;
z = a + function (b) => b ? f : x++(1);

parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).

I'm not following yet - what parse would you like to see instead of this one? If you wanted the function body to end earlier, you can get that by adding parens, right?

The awkwardness here is the lack of mandatory parentheses around the entire function expression, ending before the application via (1).

I've factored the grammar differently in strawman:block_lambda_revival to avoid precedence inversion, and I've done some preliminary validation work to make sure there are no ambiguities. See the InitialValue nonterminal and the new assignment statement forms.

Another approach, used in strawman:arrow_function_syntax -- put the new function expression form at AssignmentExpression level in the grammar, do not make it a MemberExpression. This does not invert precedence.
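For comparison, arrow functions as later standardized in ES2015 sit at the AssignmentExpression level, so the effect of that placement can be observed in any current engine. This is a sketch using the standardized syntax, which is not necessarily identical to the strawman referenced above. The helper `parses` is hypothetical and simply asks the engine whether a source string is syntactically valid.

```javascript
// Hypothetical helper: true if the engine accepts `src` as a function body.
function parses(src) {
  try {
    new Function(src);  // compiles the source without running it
    return true;
  } catch (e) {
    if (e && e.name === "SyntaxError") return false;
    throw e;
  }
}

// An arrow function cannot appear as a bare operand of +,
// because it is not a MemberExpression.
var bare = parses("var z = a + b => b ? c : d;");

// With explicit parentheses there is exactly one reading.
var wrapped = parses("var z = a + (b => b ? c : d);");

// The body still extends as far right as possible, here over the whole conditional.
var f = b => b ? "c" : "d";

console.log(bare, wrapped, f(1), f(0));  // false true c d
```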

# Claus Reinke (14 years ago)

No ambiguity intended - function bodies extend as far to the right as possible (see Note 1).

It doesn't matter what you intend. There are multiple ways to parse, at many precedence levels, due to the precedence inversion. The grammar is ambiguous.

Nitpick: I used ambiguity resolution mechanisms (notes in the text and peg non-backtracking choice in the implementation) that are not suitable for the ES spec.

# Brendan Eich (14 years ago)

On Jun 20, 2011, at 11:19 AM, Claus Reinke wrote:

No ambiguity intended - function bodies extend as far to the right as possible (see Note 1).

It doesn't matter what you intend. There are multiple ways to parse, at many precedence levels, due to the precedence inversion. The grammar is ambiguous.

Nitpick: I used ambiguity resolution mechanisms (notes in the text and peg non-backtracking choice in the implementation) that are not suitable for the ES spec.

I know, you've been clear about this. But the problem remains how ever to specify something like what you want in the standard, unless you do move the expression-body form to AssignmentExpression level.

ECMA-262 needs an unambiguous formal grammar in which there is always only one way to parse. That grammar should be validated mechanically -- no Notes apply, just LR(1).

Btw, does the spec state that the grammar is supposed to be LR(1)? I missed that detail when reading, only got it from this list.

I'm not sure why the spec does not contain the string "LR(1)" (according to Adobe Reader). Waldemar may know the history.

While we're on the topic: isn't the ASI spec a big counter-example to this recommended practice? I've yet to see an ASI-supporting JavaScript parser that doesn't try to translate the negative-match ASI spec ("if an offending token is not allowed by any production") into a constructive description ("if a semicolon would help here").

ASI is an error correction algorithm, plus the restricted productions. Yes, it has the negative match hazard. Big time. The point is: no more; ASI, we are stuck with, but do not add more like it.
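Two small cases in existing JavaScript show both halves of that description: a restricted production, and the negative-match rule that inserts a semicolon only when the next token cannot continue the statement. The names `restricted`, `id`, and `y` are made up for illustration.

```javascript
// Restricted production: no line terminator is allowed between `return`
// and its expression, so a semicolon is inserted right after `return`.
function restricted() {
  return
  42;
}

// Negative match: `(` is allowed after `id` (it forms a call), so the
// token is not "offending" and no semicolon is inserted at the line break.
var id = function (v) { return v; };
var y = id
(7);

console.log(restricted());  // undefined
console.log(y);             // 7
```

The second case is the hazard in miniature: whether a line break ends a statement depends on what the rest of the grammar happens to allow, so any new production that accepts another continuation token can change the meaning of code that already exists.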