[syntax] arrow function notation is too "greedy"

# Andrew Fedoniouk (12 years ago)

I see [1] that arrow function syntax has reached specification status.

The problem with this construction: it's the only feature in ES syntax so far that requires the compiler to use a full AST implementation at the compilation phase.

Consider this:

var val1 = 1;
var val2 = 2;
var sum1 = (val1, val2) + 1; // integer
var sum2 = (val1, val2) => val1 + val2 ; // function

var sum3 = (val1, true) ; // true

Since the comma is a valid operator in ES, generating code for the expression (val1, val2) is only possible once the token past ')' has been seen. That requires, as I said, full-blown AST analysis: the list inside the '(' ')' parentheses has to be parsed and stored, and only the next token determines how that list will be interpreted.
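
For illustration, a toy hand-rolled parser sketch (hypothetical code, not from any real engine) of exactly that buffering: the contents of '( ... )' are collected into a neutral list, and only the token peeked past ')' decides what the list was.

// Toy parser fragment (hypothetical; the token stream is a plain array).
// Grammar handled: identifiers separated by commas inside '()',
// optionally followed by "=> <identifier>".
function parseParenForm(tokens) {
  let i = 0;
  const expect = (t) => {
    if (tokens[i] !== t) throw new SyntaxError("expected " + t);
    i++;
  };
  expect("(");
  const items = [];                 // buffer the covered list
  while (tokens[i] !== ")") {
    items.push(tokens[i++]);
    if (tokens[i] === ",") i++;
  }
  expect(")");
  if (tokens[i] === "=>") {         // the deciding token, past ')'
    i++;
    return { type: "ArrowFunction", params: items, body: tokens[i] };
  }
  return { type: "CommaExpression", operands: items };
}

// (val1, val2) + 1   ->  comma expression
parseParenForm(["(", "val1", ",", "val2", ")", "+", "1"]);
// (val1, val2) => x  ->  arrow function
parseParenForm(["(", "val1", ",", "val2", ")", "=>", "x"]);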

So far, almost all other syntax features in ES can be parsed with a single token of lookahead. That makes possible a quite fast source -> bytecode translation scheme, which makes great sense for embeddable languages like ES.

Is that arrow syntax still debatable?

Asking because the Ruby-ish syntax [2] is significantly better in this respect.

And in TIScript [3] I am using a similar approach but with the ':' symbol:

':' <param-list> ':' <expression>
':' <param-list> '{' <statement-list> '}'

Example, a descending sort:

arr.sort( :a,b: a - b );

Too late?

# Allen Wirfs-Brock (12 years ago)

On Jul 10, 2013, at 9:39 PM, Andrew Fedoniouk wrote:

I see [1] that arrow function syntax has reached specification status.

so, you should probably look at the relevant parts of the draft specification.

The problem with this construction: it's the only feature in ES syntax so far that

no, it also occurs between object literals and destructuring assignment object patterns.
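
A minimal illustration of that second ambiguity (my example, not Allen's): the same braced text is an object literal in one position and a destructuring pattern in another, and only the surrounding context decides which.

var obj = { a: 1, b: 2 };   // object literal: constructs an object
var a, b;
({ a, b } = obj);           // object pattern: pulls a and b out of obj
// a === 1, b === 2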

requires the compiler to use a full AST implementation at the compilation phase.

You don't necessarily have to build a full AST. In some cases you will need to at least do multiple parses of the same text.

Is that arrow syntax still debatable?

Not really. This was known when the arrow syntax was chosen.

Too late?

Yes, but primarily in the sense that such issues were already considered.

# Andrew Fedoniouk (12 years ago)

On Wed, Jul 10, 2013 at 10:46 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

no, it also occurs between object literals and destructuring assignment object patterns.

So that "fat arrow" and peculiar destructing assignment are the only features that trigger such spike in syntax complexity?

Will the language survive without these two?

You don't necessarily have to build a full AST. In some cases you will need to at least do multiple parses of the same text.

Hmm... I see that whenever '(' is encountered you will need that double parsing almost always.

Is that arrow syntax still debatable?

A pity.

Yes, but primarily in the sense that such issues were already considered.

Did we consider the increase in greenhouse gas emissions that will result from double parsing code that is currently parsed strictly once?

(Consider that rhetorical question above as just a reminder that the ES6 parser will be used for existing web code once it is deployed on pretty much every connected machine and device.)

# Brendan Eich (12 years ago)

Andrew Fedoniouk wrote:

So that "fat arrow" and peculiar destructing assignment are the only features that trigger such spike in syntax complexity?

No, JS has always (since day 1) had issues with for loops (both kinds), where any translation must reorder parts, sometimes across arbitrary sub-statement bodies.
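
A sketch of that reordering (my hand-desugaring, glossing over let-per-iteration bindings and continue): the update clause appears before the body in the source text, yet the emitted code must run it after the body.

function work(n) { /* body of the loop, elided */ }

for (var i = 0; i < 3; i++) work(i);

// the same loop, hand-desugared:
var i = 0;                  // init emitted first
while (i < 3) {             // condition checked at the top of each pass
  work(i);                  // body
  i++;                      // update clause emitted *after* the body
}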

Will the language survive without these two?

They're in implementations already and being codified in ES6. I suggest you are tilting at a windmill here.

# Jason Orendorff (12 years ago)

On Thu, Jul 11, 2013 at 2:00 AM, Andrew Fedoniouk <news at terrainformatica.com> wrote:

Did we consider the increase in greenhouse gas emissions that will result from double parsing code that is currently parsed strictly once?

(Consider that rhetorical question above as just a reminder that the ES6 parser will be used for existing web code once it is deployed on pretty much every connected machine and device.)

Hmm. Well, there is a lot of back and forth in those lines, so I'm not sure what to make of them. Either the new features are wasteful or they're not. Pick a position and quantify it. Let's have it out.

I think any argument from energy savings is equivalent to a performance argument, and in Firefox we would not have landed these features if they regressed any of the benchmarks we watch. Perhaps it is relevant that in Firefox, the cost is only paid when an arrow function or a destructuring assignment is actually used.

In the case of destructuring assignment, we don't rewind and do a second parse, even if it does turn out that you're doing destructuring assignment. We just have to re-interpret the AST for the left-hand side that we just parsed as a left-hand side.
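
A hypothetical sketch of that re-interpretation step, using ESTree-like node shapes purely for illustration (this is not SpiderMonkey's actual code): the text to the left of '=' has already been parsed as an ordinary expression node, and on seeing '=' that same node is walked and either converted into a pattern or rejected.

function toAssignmentTarget(node) {
  switch (node.type) {
    case "Identifier":
    case "MemberExpression":
      return node;                              // already a valid target
    case "ObjectExpression":                    // {a: x} = ...
      return {
        type: "ObjectPattern",
        properties: node.properties.map(function (p) {
          return Object.assign({}, p, { value: toAssignmentTarget(p.value) });
        }),
      };
    case "ArrayExpression":                     // [x, y] = ...
      return { type: "ArrayPattern", elements: node.elements.map(toAssignmentTarget) };
    default:
      throw new SyntaxError("invalid destructuring target: " + node.type);
  }
}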

Yet another situation where the meaning of an expression changes after it's parsed is introduced by default arguments, in combination with strict mode.

function f(a = function() {...}) {
    "use strict";
    return a;
}

As I understand it, the function in the default-argument-expression is strict, but our parser doesn't know it until we hit the "use strict" directive. In Firefox, this was a bigger pain for us than arrow functions or destructuring assignment. We rewind and reparse f.

Note that a similar shift in meaning occurs in ES1-5 after you've parsed x.y and then you see ( or = or ++. When we reach any of those tokens, we have to go back and check that the left-hand side is valid, and this has nothing to do with any new ES6 features. (Admittedly, unlike the new features, this is something you can implement using finite lookahead. SpiderMonkey must have done it that way, once, before my time. Seems like it'd be messy.)
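
For example (not tied to any particular engine's internals), this is the check that rejects the last line below: only when '=' or '++' shows up does the parser go back and ask whether what it already parsed is a legal assignment target.

var x = { y: 0 }, y = 0;
x.y = 1;        // fine: a MemberExpression is a valid assignment target
x.y++;          // the same validity check fires when '++' follows
// x + y = 1;   // SyntaxError: 'x + y' is not a valid assignment target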

# André Bargull (12 years ago)

In the case of destructuring assignment, we don't rewind and do a second parse, even if it does turn out that you're doing destructuring assignment. We just have to re-interpret the AST for the left-hand side that we just parsed as a left-hand side.

And it almost works. ;-) But no worries, it will only get worse when you add destructuring with defaults or the CoverInitialisedName production to the mix. :-/ Like this: ({a} = {}) => {} or this: ({a = {}} = {}) => {} ...
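
To spell out why the second of those is nasty (my reading of André's examples): taken as an expression, { a = {} } is not a valid object literal at all; it only becomes legal retroactively, once the trailing '=>' reveals it was a parameter pattern with defaults.

var f = ({ a = {} } = {}) => a;   // a pattern default inside a parameter default
f();            // {}  -- both defaults kick in
f({ a: 1 });    // 1   -- 'a' was supplied, defaults skipped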

# Brendan Eich (12 years ago)

Jason Orendorff wrote:

Note that a similar shift in meaning occurs in ES1-5 after you've parsed x.y and then you see ( or = or ++. When we reach any of those tokens, we have to go back and check that the left-hand side is valid, and this has nothing to do with any new ES6 features. (Admittedly, unlike the new features, this is something you can implement using finite lookahead. SpiderMonkey must have done it that way, once, before my time. Seems like it'd be messy.)

Very messy, and primordial JS (the "Mocha" interpreter) violated ECMA-262 Edition 1 on order of evaluation, e.g., o[x] = ++x where the RHS ++x must be evaluated after the Reference o[x] is evaluated. Fixing this prior to compiling to a parse tree required code buffering and reordering, just as the for loops (both of 'em) can.
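
For anyone following along, the observable consequence of that ordering rule (a small example of my own, per the left-to-right evaluation ECMA-262 requires for assignment):

var o = {};
var x = 0;
o[x] = ++x;     // the Reference o[0] is computed before ++x runs
// afterwards: o is { 0: 1 } and x is 1 --
// the *old* x chose the slot, the *new* x supplied the value.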

Again, this goes back to the dawn of JS. We're not making it any worse. I don't see finite lookahead helping in general for any of these cases, BTW.

# Andrew Fedoniouk (12 years ago)

On Thu, Jul 11, 2013 at 5:08 PM, Brendan Eich <brendan at mozilla.com> wrote:

Very messy, and primordial JS (the "Mocha" interpreter) violated ECMA-262 Edition 1 on order of evaluation, e.g., o[x] = ++x where the RHS ++x must be evaluated after the Reference o[x] is evaluated. Fixing this prior to compiling to a parse tree required code buffering and reordering, just as the for loops (both of 'em) can.

Again, this goes back to the dawn of JS. We're not making it any worse. I don't see finite lookahead helping in general for any of these cases, BTW.

Compiling this

arr[x] = ++x;

does not require complete AST support as far as I can tell.

Consider this:

Some hypothetical bytecode:

PUSH arr;
PUSH x;
--
<RHS code> -> VAL in accumulator register
--
SETVI;  //  stack[TOP-1] <- arr
        //  stack[TOP]   <- x, index, initial value of x
        //  accum        <- x, final RHS value, x after increment
DROP2;
// accum here contains x value after increment - result
// of the assignment expression.

So that can be compiled for a stack machine strictly in the order it is defined.

Or do you mean something else here?

# Brendan Eich (12 years ago)

Andrew Fedoniouk wrote:

So that can be compiled for a stack machine strictly in the order it is defined.

Or do you mean something else here?

You're right, that case can be handled, but the for loops and the left-hand side revisions remain.

# Brendan Eich (12 years ago)

Brendan Eich wrote:

Andrew Fedoniouk wrote:

So that can be compiled for a stack machine strictly in the order it is defined.

Or do you mean something else here?

You're right, that case can be handled, but the for loops and the left-hand side revisions remain.

Sorry, callee revisions.

Depending on how you generate code, you can do a lot in one pass, but the for loops require reorder buffers, and using too slow a target machine just to get a one-pass compile can lose if the code is hot.

But the point remains: the language has never had single-pass, no-reorder codegen, from its birth in 1995 onward.