[syntax] arrow function notation is too "greedy"

# Andrew Fedoniouk (12 years ago)

I see [1] that arrow function syntax has reached specification status.

The problem with this construction: it's the only feature in ES syntax so far that requires the compiler to use a full AST implementation at the compilation phase.

Consider this:

var val1 = 1;
var val2 = 2;
var sum1 = (val1, val2) + 1; // integer
var sum2 = (val1, val2) => val1 + val2 ; // function

var sum3 = (val1, true) ; // true

Since the comma is a valid operator in ES, generating code for the expression (val1, val2) is only possible once the token past ')' has been seen. That requires, as I said, full-blown AST analysis: the list inside the '(' ')' parentheses has to be parsed and stored, and only the next token determines how that list will be interpreted.
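
For illustration, a toy hand-rolled parser sketch (hypothetical code, not from any real engine) of exactly that buffering: the contents of '( ... )' are collected into a neutral list, and only the token peeked past ')' decides what the list was.

// Toy parser fragment (hypothetical; the token stream is a plain array).
// Grammar handled: identifiers separated by commas inside '()',
// optionally followed by "=> <identifier>".
function parseParenForm(tokens) {
  let i = 0;
  const expect = (t) => {
    if (tokens[i] !== t) throw new SyntaxError("expected " + t);
    i++;
  };
  expect("(");
  const items = [];                 // buffer the covered list
  while (tokens[i] !== ")") {
    items.push(tokens[i++]);
    if (tokens[i] === ",") i++;
  }
  expect(")");
  if (tokens[i] === "=>") {         // the deciding token, past ')'
    i++;
    return { type: "ArrowFunction", params: items, body: tokens[i] };
  }
  return { type: "CommaExpression", operands: items };
}

// (val1, val2) + 1   ->  comma expression
parseParenForm(["(", "val1", ",", "val2", ")", "+", "1"]);
// (val1, val2) => x  ->  arrow function
parseParenForm(["(", "val1", ",", "val2", ")", "=>", "x"]);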

So far, almost all other syntax features in ES can be parsed with a single token of lookahead. That makes possible a quite fast source -> bytecode translation scheme, which makes great sense for embeddable languages like ES.

Is that arrow syntax still debatable?

Asking because the Ruby-ish syntax [2] is significantly better in this respect.

And in TIScript [3] I am using a similar approach but with the ':' symbol:

':' <param-list> ':' <expression>
':' <param-list> '{' <statement-list> '}'

Example, a descending sort:

arr.sort( :a,b: a - b );

Too late?

# Allen Wirfs-Brock (12 years ago)

On Jul 10, 2013, at 9:39 PM, Andrew Fedoniouk wrote:

I see [1] that arrow function syntax has reached specification status.

so, you should probably look at the relevant parts of the draft specification.

The problem with this construction: it's the only feature in ES syntax so far that

no, it also occurs between object literals and destructuring assignment object patterns.
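
A minimal illustration of that second ambiguity (my example, not Allen's): the same braced text is an object literal in one position and a destructuring pattern in another, and only the surrounding context decides which.

var obj = { a: 1, b: 2 };   // object literal: constructs an object
var a, b;
({ a, b } = obj);           // object pattern: pulls a and b out of obj
// a === 1, b === 2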

requires the compiler to use a full AST implementation at the compilation phase.

You don't necessarily have to build a full AST. In some cases you will need to at least do multiple parses of the same text.

Is that arrow syntax still debatable?

Not really. This was known when the arrow syntax was chosen.

Too late?

Yes, but primarily in the sense that such issues were already considered.

# Andrew Fedoniouk (12 years ago)

On Wed, Jul 10, 2013 at 10:46 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

no, it also occurs between object literals and destructuring assignment object patterns.

So that "fat arrow" and peculiar destructing assignment are the only features that trigger such spike in syntax complexity?

Will the language survive without these two?

You don't necessarily have to build a full AST. In some cases you will need to at least do multiple parses of the same text.

Hmm... I see that whenever '(' is encountered you will need that double parsing almost always.

Is that arrow syntax still debatable?

A pity.

Yes, but primarily in the sense that such issues were already considered.

Did we consider the increase in greenhouse gas emissions that will result from double parsing code that is currently parsed strictly once?

(Consider that rhetorical question above as just a reminder that the ES6 parser will be used for existing web code once it is deployed on pretty much every connected machine and device.)

# Brendan Eich (12 years ago)

Andrew Fedoniouk wrote:

So that "fat arrow" and peculiar destructing assignment are the only features that trigger such spike in syntax complexity?

No, JS has always (since day 1) had issues with for loops (both kinds), where any translation must reorder parts, sometimes across arbitrary sub-statement bodies.
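
A sketch of that reordering (my hand-desugaring, glossing over let-per-iteration bindings and continue): the update clause appears before the body in the source text, yet the emitted code must run it after the body.

function work(n) { /* body of the loop, elided */ }

for (var i = 0; i < 3; i++) work(i);

// the same loop, hand-desugared:
var i = 0;                  // init emitted first
while (i < 3) {             // condition checked at the top of each pass
  work(i);                  // body
  i++;                      // update clause emitted *after* the body
}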

Will the language survive without these two?

They're in implementations already and being codified in ES6. I suggest you are tilting at a windmill here.

# Jason Orendorff (12 years ago)

On Thu, Jul 11, 2013 at 2:00 AM, Andrew Fedoniouk <news at terrainformatica.com> wrote:

Did we consider the increase in greenhouse gas emissions that will result from double parsing code that is currently parsed strictly once?

(Consider that rhetorical question above as just a reminder that the ES6 parser will be used for existing web code once it is deployed on pretty much every connected machine and device.)

Hmm. Well, there is a lot of back and forth in those lines, so I'm not sure what to make of them. Either the new features are wasteful or they're not. Pick a position and quantify it. Let's have it out.

I think any argument from energy savings is equivalent to a performance argument, and in Firefox we would not have landed these features if they regressed any of the benchmarks we watch. Perhaps it is relevant that in Firefox, the cost is only paid when an arrow function or a destructuring assignment is actually used.

In the case of destructuring assignment, we don't rewind and do a second parse, even if it does turn out that you're doing destructuring assignment. We just have to re-interpret the AST for the left-hand side that we just parsed as a left-hand side.
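
A hypothetical sketch of that re-interpretation step, using ESTree-like node shapes purely for illustration (this is not SpiderMonkey's actual code): the text to the left of '=' has already been parsed as an ordinary expression node, and on seeing '=' that same node is walked and either converted into a pattern or rejected.

function toAssignmentTarget(node) {
  switch (node.type) {
    case "Identifier":
    case "MemberExpression":
      return node;                              // already a valid target
    case "ObjectExpression":                    // {a: x} = ...
      return {
        type: "ObjectPattern",
        properties: node.properties.map(function (p) {
          return Object.assign({}, p, { value: toAssignmentTarget(p.value) });
        }),
      };
    case "ArrayExpression":                     // [x, y] = ...
      return { type: "ArrayPattern", elements: node.elements.map(toAssignmentTarget) };
    default:
      throw new SyntaxError("invalid destructuring target: " + node.type);
  }
}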

Yet another situation where the meaning of an expression changes after it's parsed is introduced by default arguments, in combination with strict mode.

function f(a = function() {...}) {
    "use strict";
    return a;
}

As I understand it, the function in the default-argument-expression is strict, but our parser doesn't know it until we hit the "use strict" directive. In Firefox, this was a bigger pain for us than arrow functions or destructuring assignment. We rewind and reparse f.

Note that a similar shift in meaning occurs in ES1-5 after you've parsed x.y and then you see ( or = or ++. When we reach any of those tokens, we have to go back and check that the left-hand side is valid, and this has nothing to do with any new ES6 features. (Admittedly, unlike the new features, this is something you can implement using finite lookahead. SpiderMonkey must have done it that way, once, before my time. Seems like it'd be messy.)
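
For example (not tied to any particular engine's internals), this is the check that rejects the last line below: only when '=' or '++' shows up does the parser go back and ask whether what it already parsed is a legal assignment target.

var x = { y: 0 }, y = 0;
x.y = 1;        // fine: a MemberExpression is a valid assignment target
x.y++;          // the same validity check fires when '++' follows
// x + y = 1;   // SyntaxError: 'x + y' is not a valid assignment target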

# André Bargull (12 years ago)

In the case of destructuring assignment, we don't rewind and do a second parse, even if it does turn out that you're doing destructuring assignment. We just have to re-interpret the AST for the left-hand side that we just parsed as a left-hand side.

And it almost works. ;-) But no worries, it will only get worse when you add destructuring with defaults or the CoverInitialisedName production to the mix. :-/ Like this: ({a} = {}) => {} or this: ({a = {}} = {}) => {} ...
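
To spell out why the second of those is nasty (my reading of André's examples): taken as an expression, { a = {} } is not a valid object literal at all; it only becomes legal retroactively, once the trailing '=>' reveals it was a parameter pattern with defaults.

var f = ({ a = {} } = {}) => a;   // a pattern default inside a parameter default
f();            // {}  -- both defaults kick in
f({ a: 1 });    // 1   -- 'a' was supplied, defaults skipped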

# Brendan Eich (12 years ago)

Jason Orendorff wrote:

Note that a similar shift in meaning occurs in ES1-5 after you've parsed x.y and then you see ( or = or ++. When we reach any of those tokens, we have to go back and check that the left-hand side is valid, and this has nothing to do with any new ES6 features. (Admittedly, unlike the new features, this is something you can implement using finite lookahead. SpiderMonkey must have done it that way, once, before my time. Seems like it'd be messy.)

Very messy, and primordial JS (the "Mocha" interpreter) violated ECMA-262 Edition 1 on order of evaluation, e.g., o[x] = ++x where the RHS ++x must be evaluated after the Reference o[x] is evaluated. Fixing this prior to compiling to a parse tree required code buffering and reordering, just as the for loops (both of 'em) can.
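
For anyone following along, the observable consequence of that ordering rule (a small example of my own, per the left-to-right evaluation ECMA-262 requires for assignment):

var o = {};
var x = 0;
o[x] = ++x;     // the Reference o[0] is computed before ++x runs
// afterwards: o is { 0: 1 } and x is 1 --
// the *old* x chose the slot, the *new* x supplied the value.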

Again, this goes back to the dawn of JS. We're not making it any worse. I don't see finite lookahead helping in general for any of these cases, BTW.

# Andrew Fedoniouk (12 years ago)

On Thu, Jul 11, 2013 at 5:08 PM, Brendan Eich <brendan at mozilla.com> wrote:

Very messy, and primordial JS (the "Mocha" interpreter) violated ECMA-262 Edition 1 on order of evaluation, e.g., o[x] = ++x where the RHS ++x must be evaluated after the Reference o[x] is evaluated. Fixing this prior to compiling to a parse tree required code buffering and reordering, just as the for loops (both of 'em) can.

Again, this goes back to the dawn of JS. We're not making it any worse. I don't see finite lookahead helping in general for any of these cases, BTW.

Compiling this

arr[x] = ++x;

does not require complete AST support as far as I can tell.

Consider this:

Some hypothetical bytecode:

PUSH arr;
PUSH x;
--
<RHS code> -> VAL in accumulator register
--
SETVI;  //  stack[TOP-1] <- arr
        //  stack[TOP]   <- x, index, initial value of x
        //  accum        <- x, final RHS value, x after increment
DROP2;
// accum here contains x value after increment - result
// of the assignment expression.

So that can be compiled for a stack machine strictly in the order it is defined.

Or do you mean something else here?

# Brendan Eich (12 years ago)

Andrew Fedoniouk wrote:

So that can be compiled for a stack machine strictly in the order it is defined.

Or do you mean something else here?

You're right, that case can be handled, but the for loops and the left-hand side revisions remain.

# Brendan Eich (12 years ago)

Brendan Eich wrote:

Andrew Fedoniouk wrote:

So that can be compiled for a stack machine strictly in the order it is defined.

Or do you mean something else here?

You're right, that case can be handled, but the for loops and the left-hand side revisions remain.

Sorry, callee revisions.

Depending on how you generate code, you can do a lot in one pass, but the for loops require reorder buffers, and using too slow a target machine just to get a one-pass compile can lose if the code is hot.

But the point remains: the language has never had single-pass, no-reorder codegen, from its birth in 1995 onward.