Proposal: paren-free function calls and definitions

# Claus Reinke (14 years ago)

Dear all,

following the recent suggestion here that github might be a
more suitable forum for kick-starting proposals than this list,
I've put together a proposal draft on simple paren-free function
calls and definitions:

https://github.com/clausreinke/jstr/tree/master/es-discuss

You can find there the proposal text, with motivation, grammar 
changes, semantics (entirely via desugaring to existing syntax), 
and related issues, as well as a prototype implementation, using 
jstr's nascent grammar-based ES-to-ES rewriting:

    tailnests.txt: proposal text

    tailnests-examples.js: some small illustrative examples

    tailnests.html: if you clone the jstr repo, this is a simplistic
        interface to a prototype implementation (one textarea
        for source input, one textarea for desugared output)

    ../es5.js: near ES5 grammar, with proposed extensions
        (look for rules guarded by LANGUAGE.tailnests)

The proposal concerns the old topic of removing some 
syntactic obstacles from Javascript's functional core. 

Over the years, this list has seen numerous more or less 
ambitious proposals and discussions on this topic, and every 
single one of them seems to have got stuck at some point. 

Therefore, this proposal deliberately focuses on the less 
controversial aspects of the problem. The plan is to work 
out a minimal solution that helps, for instance with nested 
callback chains, and with monadic programming, without 
raising serious concerns about semantic changes. In other
words, the goal is to make progress, however minimal.

The base proposal focuses on definitions of simple expression-
returning functions and calls with primitive or function
expression arguments. Paren-free left-associative function 
application is done as a simple generalization of existing
grammar, while paren-free right-associative function 
application needs an infix operator. All grammar changes
are localized, new rules are controlled by small lookahead,
and ASI changes are flagged as parser warnings.

If you find the base proposal suitable, the additional notes
make two suggestions for removing the remaining syntactic
issues in future language versions. Separating out the 
syntactic issues and leaving the controversial semantic 
issues for other proposals, should help to make progress 
on the former - if it also gets the latter unstuck, all the better.

Looking forward to your feedback,
Claus

PS. I assume that proposal discussion is still going to take
    place here on this list.

# Brendan Eich (14 years ago)

On Jun 19, 2011, at 3:17 PM, Claus Reinke wrote:

> https://github.com/clausreinke/jstr/tree/master/es-discuss
> 
> You can find there the proposal text, with motivation, grammar changes, semantics (entirely via desugaring to existing syntax), and related issues, as well as a prototype implementation, using jstr's nascent grammar-based ES-to-ES rewriting:
> 
>   tailnests.txt: proposal text


Quick reply to point out that:
    FunctionExpression :
      function Identifier_opt ( FormalParameterList_opt ) { FunctionBody }
+     function Identifier_opt ( FormalParameterList_opt ) => AssignmentExpression

inverts precedence, as MemberExpression : FunctionExpression but now FunctionExpression can end with a low-precedence expression (AssignmentExpression).

For example, the last line in

  function f(x) => x;
  z = a + function (b) => b ? f : x++(1);

parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).

/be



# Brendan Eich (14 years ago)

On Jun 19, 2011, at 4:04 PM, Brendan Eich wrote:

> On Jun 19, 2011, at 3:17 PM, Claus Reinke wrote:
> 
>> https://github.com/clausreinke/jstr/tree/master/es-discuss
>> 
>> You can find there the proposal text, with motivation, grammar changes, semantics (entirely via desugaring to existing syntax), and related issues, as well as a prototype implementation, using jstr's nascent grammar-based ES-to-ES rewriting:
>> 
>>   tailnests.txt: proposal text
> 
> 
> Quick reply to point out that:
>     FunctionExpression :
>       function Identifier_opt ( FormalParameterList_opt ) { FunctionBody }
> +     function Identifier_opt ( FormalParameterList_opt ) => AssignmentExpression
> 
> inverts precedence, as MemberExpression : FunctionExpression but now FunctionExpression can end with a low-precedence expression (AssignmentExpression).
> 
> For example, the last line in
> 
>   function f(x) => x;
>   z = a + function (b) => b ? f : x++(1);
> 
> parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).

This is not just awkward, of course -- the grammar you propose is ambiguous. The precedence inversion means there are two ways to parse

  z = a + function (b) => b ? c : d;

Either as

  z = (a + (function (b) => b)) ? c : d;

or as

  z = a + (function (b) => (b ? c : d));

And we've been around this block before. Ambiguous grammars are future-hostile.

/be
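
To see why the two readings matter, here is a minimal sketch that desugars each one to ES5 function expressions, in the spirit of the proposal's desugaring-based semantics. The values bound to `a`, `c`, and `d` are hypothetical, chosen only so that the two results differ.

```javascript
// Hypothetical bindings, for illustration only.
var a = 1, c = "left", d = "right";

// Reading 1: z = (a + (function (b) => b)) ? c : d;
// The sum is a non-empty string (number + function source text), so it is truthy.
var z1 = (a + (function (b) { return b; })) ? c : d;

// Reading 2: z = a + (function (b) => (b ? c : d));
// The conditional sits inside the function body, so z2 is "1" followed by
// the function's source text.
var z2 = a + (function (b) { return b ? c : d; });

console.log(z1);        // left
console.log(typeof z2); // string
```

The same token sequence yields either the value of `c` or a concatenated string, depending on which parse the grammar picks.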

# Claus Reinke (14 years ago)

> This is not just awkward, of course -- the grammar you propose is
> ambiguous.

No ambiguity intended - function bodies extend as far to the right
as possible (see Note 1).

> The precedence inversion means there are two ways to parse
>
>  z = a + function (b) => b ? c : d;
>..
>  z = a + (function (b) => (b ? c : d));

This is the intended parse, and also the one used by the prototype
(you can change 'opts ="pu"' to 'opts = "pua"' to get the ast). To
get the other parse, one would need explicit parens.

The idea is to get the longest parse paren-free, with parens as a
means to get more fine-control to limit function bodies. That does
imply that things to the right of a function tend to get swallowed
up in the function body.

As for your other example,

>  function f(x) => x;
>  z = a + function (b) => b ? f : x++(1);
>
>parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).

I'm not following yet - what parse would you like to see instead
of this one? If you wanted the function body to end earlier, you
can get that by adding parens, right?

Claus

# Brendan Eich (14 years ago)

On Jun 20, 2011, at 1:49 AM, Claus Reinke wrote:

>> This is not just awkward, of course -- the grammar you propose is
>> ambiguous.
> 
> No ambiguity intended - function bodies extend as far to the right
> as possible (see Note 1).

It doesn't matter what you intend. There are multiple ways to parse, at many precedence levels, due to the precedence inversion. The grammar is ambiguous.

ECMA-262 needs an unambiguous formal grammar in which there is always only one way to parse. That grammar should be validated mechanically -- no Notes apply, just LR(1).

Otherwise as Waldemar has pointed out on this list several times, you can get into trouble with "negative match" rules as the language evolves: you add something that disambiguates differently and changes the meaning of existing code -- which means you cannot add that something. It can be very hard to see the effects of a change; the problem is not local to a production or set of productions.

This is the future-hostility problem of ambiguous grammars.

>> The precedence inversion means there are two ways to parse
>> 
>> z = a + function (b) => b ? c : d;
>> ..
>> z = a + (function (b) => (b ? c : d));
> 
> This is the intended parse, and also the one used by the prototype
> (you can change 'opts ="pu"' to 'opts = "pua"' to get the ast). To
> get the other parse, one would need explicit parens.

Yes, I know. BTW, SpiderMonkey and Rhino (I believe) implement "expression closures", the same thing you're proposing but without the => (it's not necessary). They have the same ambiguity problem.

>> function f(x) => x;
>> z = a + function (b) => b ? f : x++(1);
>> 
>> parses awkwardly, as ( z = ( a + ( function (b) => ( b ? f : ( x ++ ) ) ) (1) ) ).
> 
> I'm not following yet - what parse would you like to see instead
> of this one? If you wanted the function body to end earlier, you
> can get that by adding parens, right?

The awkwardness here is the lack of mandatory parentheses around the entire function expression, ending before the application via (1).

I've factored the grammar differently in http://wiki.ecmascript.org/doku.php?id=strawman:block_lambda_revival to avoid precedence inversion, and I've done some preliminary validation work to make sure there are no ambiguities. See the InitialValue nonterminal and the new assignment statement forms.

Another approach, used in http://wiki.ecmascript.org/doku.php?id=strawman:arrow_function_syntax -- put the new function expression form at AssignmentExpression level in the grammar, do not make it a MemberExpression. This does not invert precedence.

/be
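
The effect of the second approach can be observed in engines that implement ES2015 arrow functions, which were standardized after this thread and are grammatically an AssignmentExpression form. The sketch below is only an illustration: `parses` is a hypothetical helper that checks whether a source string is accepted by the `Function` constructor, and the variable names are not taken from either strawman.

```javascript
// Hypothetical helper: true if `src` parses as a function body,
// false if the engine rejects it with a SyntaxError.
function parses(src) {
  try {
    new Function(src);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// An arrow function cannot appear as the right operand of `+` without parentheses,
// so the ambiguous form from earlier in the thread is simply rejected.
var unparenthesized = parses("var z = a + (b) => b ? c : d;");

// With explicit parentheses there is exactly one parse.
var parenthesized = parses("var z = a + ((b) => b ? c : d);");

// Applying the function requires parentheses around the whole expression.
var applied = parses("var z = ((b) => b)(1);");

console.log(unparenthesized); // false
console.log(parenthesized);   // true
console.log(applied);         // true
```

Because the low-precedence body can only appear where an AssignmentExpression is allowed, the parenthesization that the thread calls awkward becomes mandatory rather than optional.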

# Claus Reinke (14 years ago)

>> No ambiguity intended - function bodies extend as far to the right
>> as possible (see Note 1).
>
> It doesn't matter what you intend. There are multiple ways to parse,
> at many precedence levels, due to the precedence inversion. The
> grammar is ambiguous.

Nitpick: I used ambiguity resolution mechanisms (notes in the text
and peg non-backtracking choice in the implementation) that are
not suitable for the ES spec.

> Another approach, used in 
> http://wiki.ecmascript.org/doku.php?id=strawman:arrow_function_syntax -- 
> put the new function expression form at AssignmentExpression
> level in the grammar, do not make it a MemberExpression. This
> does not invert precedence.

Thanks - I can't believe I missed this! Obvious, in hindsight: low
priority infix ops should go at the appropriate level in ES' explicitly
unfolded priority tower. I guess I was thinking of establishing an
implicit set of parens, hence different priority level..

Anyway, I will rewrite proposal and prototype, after having
another look and think.

> ECMA-262 needs an unambiguous formal grammar in which
> there is always only one way to parse. That grammar should
> be validated mechanically -- no Notes apply, just LR(1).

Btw, does the spec state that the grammar is supposed to be
LR(1)? I missed that detail when reading, only got it from this list.

> Otherwise as Waldemar has pointed out on this list several times,
> you can get into trouble with "negative match" rules as the
> language evolves: you add something that disambiguates
> differently and changes the meaning of existing code -- which
> means you cannot add that something. It can be very hard to
> see the effects of a change; the problem is not local to a production
> or set of productions.

Yes, I tend to make the fixed lookahead explicit, so that ordered
choice makes no difference (positive guards instead of negative
match; also good for efficiency and parse error messages). Don't
know why I fell into the rely-on-peg trap here.

While we're on the topic: isn't the ASI spec a big counter-example
to this recommended practice? I've yet to see an ASI-supporting
Javascript parser that doesn't try to translate the negative-match
ASI spec ("if an offending token is not allowed by any production")
into a constructive description ("if a semicolon would help here").

Or a proof that these in-grammar renderings are equivalent to
the ASI spec (I much prefer the positive, in-grammar rendering,
but it does not follow the spec text).

Thanks for the constructive comments,
Claus

# Brendan Eich (14 years ago)

On Jun 20, 2011, at 11:19 AM, Claus Reinke wrote:

>>> No ambiguity intended - function bodies extend as far to the right
>>> as possible (see Note 1).
>> 
>> It doesn't matter what you intend. There are multiple ways to parse,
>> at many precedence levels, due to the precedence inversion. The
>> grammar is ambiguous.
> 
> Nitpick: I used ambiguity resolution mechanisms (notes in the text
> and peg non-backtracking choice in the implementation) that are
> not suitable for the ES spec.

I know, you've been clear about this. But the problem remains of how to specify something like what you want in the standard, unless you do move the expression-body form to AssignmentExpression level.

>> ECMA-262 needs an unambiguous formal grammar in which
>> there is always only one way to parse. That grammar should
>> be validated mechanically -- no Notes apply, just LR(1).
> 
> Btw, does the spec state that the grammar is supposed to be
> LR(1)? I missed that detail when reading, only got it from this list.

I'm not sure why the spec does not contain the string "LR(1)" (according to Adobe Reader). Waldemar may know the history.

> While we're on the topic: isn't the ASI spec a big counter-example
> to this recommended practice? I've yet to see an ASI-supporting
> Javascript parser that doesn't try to translate the negative-match
> ASI spec ("if an offending token is not allowed by any production")
> into a constructive description ("if a semicolon would help here").

ASI is an error correction algorithm, plus the restricted productions. Yes, it has the negative match hazard. Big time. The point is: no more; ASI, we are stuck with, but do not add more like it.

/be
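
Both halves of that description show up in ordinary ES5 code. In this sketch (the function names are illustrative), no semicolon is inserted before `(21)` because the token is allowed by the grammar at that point, while the restricted production for `return` forces one after the line break.

```javascript
function twice(x) { return x * 2; }

// No ASI here: "(" is a legal continuation after `twice`,
// so the two lines parse as the single call twice(21).
var continued = twice
(21);

// Restricted production: a line break directly after `return`
// terminates the statement, so the 42 on the next line is never returned.
function restricted() {
  return
  42;
}

console.log(continued);    // 42
console.log(restricted()); // undefined
```

The first case is the negative-match hazard in miniature: whether a semicolon appears depends on what the rest of the grammar happens to allow after the line break.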