Automatic Semicolon Insertion: value vs cost; predictability and control; alternatives

# Claus Reinke (13 years ago)

The idea of ASI seems to be to reduce syntactic clutter, possibly making programs more readable, which is a laudable goal. But if the reduction in symbol noise comes at the cost of a rise in complexity of error-prone interpretation, that actually reduces readability. And few things frustrate programmers more than not being able to predict or control what happens to their code (subjectively, not objectively: preferably without detailed spec).

That is by no means a new problem [2,3], but a solution seems hard to come by, as summarized by Brendan in [3]:

Just the emotion around ASI makes me want to reach for
greater clarity and (if possible) improvements down the line.
But yeah, it's low priority and the risk for reward looks high.

Javascript is not the only language with some form of semicolon insertion, and programmer satisfaction with this feature seems to vary widely across languages, suggesting that implementations of ASI differ as well.

These differences are important: experiences with related, but different, features in different languages should inform, not bias, discussion of ASI in Javascript, and different tools that achieve similar goals might offer additional design options.

So, I thought it might be helpful to contrast Javascript's approach with one that does not seem to stir up so much negative emotion.

// Javascript semicolon insertion (ASI)

There are several aspects of Javascript ASI that I find worrying:

  • ASI is triggered by linebreaks
  • ASI depends on error correction
  • ASI depends on restricted productions

Taken together, this reduces both predictability of and control over semicolon insertion:

  • there is no rule-of-thumb understanding (programmers have to look up or memorize all restricted productions, and notice all errors in their code, to the extent that both of these control ASI; if they miss an error that triggers ASI, the code they are looking at is not the same code that the JS implementation sees, hindering debugging)

  • there is little programmer control (programmers can add linebreaks, but that alone isn't sufficient; only combining linebreaks with restricted productions or correctable errors will result in ASI; programmers sometimes add linebreaks, not knowing that or when this will invoke ASI, assuming that linebreaks are just whitespace; programmers sometimes omit semicolons, erroneously assuming that ASI will fix it)
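A minimal sketch of the two triggers named above, error correction and a restricted production. The function names are made up for illustration; the behaviour shown is what the ES5 rules (section 7.9) prescribe.

```javascript
// Error correction: `var b` cannot continue `var a = 1`, and a line break
// separates them, so a semicolon is inserted before the offending token.
function errorCorrection() {
  var a = 1
  var b = 2
  return a + b
}

// Restricted production: no line terminator is allowed between `return`
// and its expression, so the statement ends at the line break.
function restrictedReturn() {
  return
  "never returned"
}

console.log(errorCorrection());  // 3
console.log(restrictedReturn()); // undefined
```

Both functions contain a line break in a similar-looking position, yet only the grammar decides whether it matters, which is the predictability problem described in the first bullet.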

[Btw, it would be great if one could write something like /*OPTIONS: warn-ASI */ and get parser warnings whenever ASI kicks in (during development). It might be useful for ES to standardize the idea of such pragmas (passing hints and options to tools, in source comments), since they are already in use (eg, jslint).]

// Haskell semicolon insertion (HSI)

For comparison, if we take Haskell's semicolon insertion (HSI), and throw out all the special cases not needed for Javascript, the rules are simple, predictable, and fully under programmer control (rather than grammar author control):

1 semicolon insertion happens for syntax involving blocks
    (always preceded by some keyword):
    <keyword> { .. ; .. ; .. }

2 if the opening brace following such a <keyword> is
    omitted, the start-column of the next token establishes
    a baseline for automatic semicolon/end brace insertion

3 following lines beginning with a non-white token that is
    - indented more: continue the preceding statement
    - indented equally: start a new statement in the block
    - indented less: end the block

That is it. HSI has the following useful properties:

- if programmers use a construct that uses braces and
    semicolons, they don't have to look in the grammar
    for details, they know that HSI will be possible (1)

- if programmers do not want HSI, all they need to do
    is make their braces and semicolons explicit (2);
    (HSI will not insert additional semicolons if all
        braces are explicit)

- programmers can use linebreaks to clean up their code;
    the _combination_ of linebreak and relative indentation
    (more/equal/less) of the next line controls HSI (3)

- only omitted braces and indentation control HSI (2,3);
    in particular, semicolons are not inserted to correct
    errors, and HSI behaves uniformly for all blocks and
    all kinds of statement (again, no need to consult the
    grammar for restricted productions)

- since semicolon insertion is controlled by indentation,
    not error correction, it does not limit grammar design
    (no new ambiguities due to interaction with HSI)
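Rules 1 to 3 can be sketched as a toy layout pass over the lines of one block. This is only an illustration under strong assumptions (spaces only, a single un-nested block, no multi-line strings or comments); `layoutBlock` is a hypothetical name, and the real Haskell algorithm in [1] handles far more.

```javascript
// Toy layout pass for ONE block whose opening brace was omitted.
// Hypothetical helper: no tabs, no nesting, no comments or multi-line strings.
function layoutBlock(lines) {
  const indentOf = (line) => line.length - line.trimStart().length;
  const nonBlank = lines.filter((line) => line.trim() !== "");
  if (nonBlank.length === 0) return { block: "{ }", rest: [] };

  const baseline = indentOf(nonBlank[0]); // rule 2: first token sets the baseline
  const statements = [];
  let consumed = 0;
  for (const line of nonBlank) {
    const indent = indentOf(line);
    if (indent < baseline) break;            // indented less: end the block
    if (indent === baseline) {
      statements.push(line.trim());          // indented equally: new statement
    } else {
      // indented more: continue the preceding statement
      statements[statements.length - 1] += " " + line.trim();
    }
    consumed += 1;
  }
  return {
    block: "{ " + statements.join("; ") + " }",
    rest: nonBlank.slice(consumed), // lines after the implicit closing brace
  };
}

const result = layoutBlock([
  "    total = price",
  "        + tax",
  "    print(total)",
  "next()",
]);
console.log(result.block); // { total = price + tax; print(total) }
console.log(result.rest);  // the single line "next()" that ended the block
```

Nothing in the sketch looks at the tokens themselves: only relative indentation decides where semicolons and the closing brace go.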

The differences between the two approaches to semicolon insertion are substantial (programmer control/systematic predictability vs grammar author control/memorization of special cases and attempted error correction).

When I first encountered HSI, I wrote all my {;} explicitly, because it was always presented as a "layout rule" and I didn't feel comfortable with that.

Predictability (both in reading and in writing code) and the reduction of syntax noise soon won me over. Still, it is useful to have the option of no HSI interference, if one generates code with a simple tool, if one wants to make all inserted {;} explicit, or when whitespace is messed with (emails). Also, explicit and implicit style can be combined.

The system has been working remarkably well in practice (the main no-nos are mixing tabs and spaces, or tools that meddle with whitespace). HSI reinforces the common practice of indenting nested blocks, while ASI seems to have no such intuitive guidelines.

Many, though not all, examples of programmers getting into trouble with Javascript's ASI run against indentation expectations (nesting return value on next line and still getting ASI, having separate lines with same indent merged because ASI does not kick in).

I'm throwing this alternative in the ring because I've seen discussions on seemingly unrelated spec issues where suddenly, people would say they'd have to check whether some idea works out with restricted productions (often, suggested new syntax turns out to be ambiguous when combined with ASI).

Also, some of the pessimism surrounding ASI reform [3] stems from the limitations of current spec tools, such as restricted productions, so looking at other ways to insert semicolons might help.

It is interesting that even the ES5 spec has no convincing ASI examples, only clarifying examples (7.9.2). And blog posts seem to be more about trouble with ASI than about usefulness of ASI [2,4,5]. So ASI as it stands in Javascript now does not only make life harder for programmers but for spec writers (and readers), too.

It would be useful to know examples of ASI working well for someone. Then one could check whether the benefits could be achieved by alternate rules, while reducing the danger that programmer and compiler have different interpretations of the same code.

HSI would need some tweaking to be suitable for Javascript coding styles (though some tweaks could be copied from Haskell, I just omitted them to bring out the core ideas).

Still, such a variant might work better and be easier to understand than the current ASI. Equally important, a transition might be doable as incremental improvements rather than a radically different system. For instance, one could weaken the no-line-break-here token to consider line-break plus indentation. One might drop the error-correction bits if indentation provides alternative control.

In the spirit of refactoring languages in small steps, improving ASI might be more manageable than removing it (and if ASI is worth doing, it is worth doing it well).

Claus

[1] www.haskell.org/onlinereport/haskell2010/haskellch2.html#x7-210002.7
[2] lucumr.pocoo.org/2011/2/6/automatic-semicolon-insertion
[3] old.nabble.com/Rationalizing-ASI-(was%3A-simple-shorter-function-syntax)-td29256435.html
[4] asi.qfox.nl (ASI certification)
[5] inimino.org/~inimino/blog/javascript_semicolons

# Brendan Eich (13 years ago)

On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote:

  • there is no rule-of-thumb understanding (programmers have to look up or memorize all restricted productions,

Here is a quibble: there is a rule, or set of rules enumerated by restricted productions.

So indeed, viewed production by production, there are too many rules to memorize, whether concrete or abstracted only a little bit (e.g., either break or continue with label is restricted to have [no LineTerminator here] between the keyword and the label).

However, one can abstract further by thinking about all the goto-like forms being restricted (continue, break, return, and for unclear reasons since the expression is not optional, throw).

This does not cover postfix ++/--, so two rules. Not quite as bad as you wrote.
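A small sketch of the two rule families, using `throw` for the goto-like forms and postfix `++` for the other. The helper `parses` is hypothetical and relies on `new Function` to compile a body without running it.

```javascript
// Postfix ++ is restricted: the line break after `a` ends that statement,
// so the ++ attaches to `b` as a prefix operator.
function postfixExample() {
  var a = 1, b = 5;
  a
  ++
  b;
  return [a, b];
}

// Hypothetical helper: reports whether a function body parses at all.
function parses(body) {
  try {
    new Function(body); // compiles the body without executing it
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

console.log(postfixExample());                // a stays 1, b becomes 6
console.log(parses("throw new Error('x')"));  // true
console.log(parses("throw\nnew Error('x')")); // false: line break after throw
```

Unlike `return`, a `throw` with a line break after the keyword has no valid reading, so it is rejected outright rather than silently reinterpreted.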

The bigger problem is not the rule-space but the mixed significance and insignificance of line terminators.

What we observe is that programmers come to expect ASI where there is no ASI. So, e.g.,

    foo
    (bar, baz);

involves no ASI (no error to correct, no restricted production), but it looks like two statements. The line terminator having selective meaning due to ASI as an error correction procedure, and of course in restricted productions, creates an expectation that line terminators matter in general.
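The point about a line break before an opening parenthesis can be checked directly. A minimal sketch, with `foo`, `bar` and `baz` given throwaway definitions just so the snippet runs:

```javascript
var calls = [];
function foo(a, b) {
  calls.push([a, b]);
  return "called";
}
var bar = 1, baz = 2;

// Two lines, one statement: `(` can continue the expression, so there is
// no error to correct and no semicolon is inserted after `foo`.
var result = foo
(bar, baz);

console.log(result);       // called
console.log(calls.length); // 1
```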

However, going down that road leads to CoffeeScript, or (somewhat more conservatively due to the use of : at end of head forms, and a lot older) Python. It's a steep slippery slope.

For comparison, if we take Haskell's semicolon insertion (HSI), and throw out all the special cases not needed for Javascript, the rules are simple, predictable, and fully under programmer control (rather than grammar author control):

1 semicolon insertion happens for syntax involving blocks (always preceded by some keyword): <keyword> { .. ; .. ; .. }

2 if the opening brace following such a <keyword> is omitted, the start-column of the next token establishes a baseline for automatic semicolon/end brace insertion

3 following lines beginning with a non-white token that is - indented more: continue the preceding statement

This is going to enrage CoffeeScripters and Pythonistas, and with good reason. They'll want a new block (no leading keyword required), not continuation of the preceding line. Especially with let, const, and block-local functions.

I'm not saying this is "bad" or "good" on balance. Indeed, it beats having to \-escape a newline to continue an overlong statement in Python.

But it is different yet again from nearby or layered languages, and of course it's not ASI as JS has had since forever.

 - indented equally: start a new statement in the block

This satisfies the expectation created by the selective meaning of line terminators.

 - indented less: end the block

This is great if you buy the rest.

Predictability (both in reading and in writing code) and the reduction of syntax noise soon won me over. Still, it is useful to have the option of no HSI interference, if one generates code with a simple tool, if one wants to make all inserted {;} explicit, or when whitespace is messed with (emails). Also, explicit and implicit style can be combined.

Good points. The keystroke tax with ; and {} in JS is an issue. It is a tax at the margin on all effort creating and maintaining source, though, so I expect over time lower-tax syntaxes to win. The trick is migrating JS code into a new standardized edition without creating new runtime errors by failing to catch migration errors.

Also, some of the pessimism surrounding ASI reform [3] stems from the limitations of current spec tools, such as restricted productions, so looking at other ways to insert semicolons might help.

Reformulating the spec could be done but we would need to keep the "ES5" or "classic" ASI spec around. Spec complexity and the opportunity cost of the work to increase it in this area will hurt -- probably a lot.

Anyway, we'd need a more complete strawman spec to evaluate, to get further.

It is interesting that even the ES5 spec has no convincing ASI examples, only clarifying examples (7.9.2).

That is the same old language from ES1, IIRC.

And blog posts seem to be more about trouble with ASI than about usefulness of ASI [2,4,5].

Beware negativity and confirmation biases here.

ASI is relied on by tons of content, without complaints (or praise). It goes without notice when it works, which is often when a ; was left off where the formal grammar requires it.

So ASI as it stands in Javascript now does not only make life harder for programmers but for spec writers (and readers), too.

The spec didn't change, so we are riding on that sunk cost.

It's easy to exaggerate here, but it seems to me the big deal is not ASI costs already sunk. Rather it is how to lighten the syntax and make ASI more usable, in a new edition.

It would be useful to know examples of ASI working well for someone. Then one could check whether the benefits could be achieved by alternate rules, while reducing the danger that programmer and compiler have different interpretations of the same code.

We know ASI is used, you can log SpiderMonkey code to see where it kicks in. I haven't done this lately and I'm not going to attempt any kind of "meaningful" web JS survey, but we don't need to, in my view. What we could use is some validated alternative that has no runtime migration error gotchas.

HSI would need some tweaking to be suitable for Javascript coding styles (though some tweaks could be copied from Haskell, I just omitted them to bring out the core ideas).

I think your effort developing a draft spec would be great. Just citing Haskell or talking about the issues more generally is not going to move the mountain that needs to move here.

Still, such a variant might work better and be easier to understand than the current ASI. Equally important, a transition might be doable as incremental improvements rather than a radically different system.

While new Harmony proposals will be prototyped and even shipped before the next edition, it takes years to do a new edition. So we do not have the luxury of many standardized incremental improvements.

Further, standardizing increments along an uncertain path creates bad path dependence, as content comes to depend on each increment in turn, until you may well be painted into a corner.

For instance, one could weaken the no-line-break-here token to consider line-break plus indentation.

We tried this back at the July 2008 "Harmony" (Oslo) meeting. The issue was

    function foo(x) {
        if (x) return
            "a very long string here that did not fit on the previous line";
        return "short";
    }

Any code of this form today is probably a bug: the programmer forgot about return's production being restricted, and created a dead (unreachable) and useless string literal expression statement.

However, we did not want to require the kind of analysis that would be needed to distinguish that from this:

    function foo(x) {
        if (x) return
            some_long_and_complex(expressions(), with(effects()));
        return simple();
    }

In this case it is not safe to assume (per HSI) that the line after the return is a continuation of the return statement.

There is mis-indented JS on the web, including of this form, and what such code means (whatever was intended) is now a compatibility constraint.
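To make the migration hazard concrete, here is a sketch contrasting how such code runs today with how an indentation-based rule would read it. `effect` and `simple` are stand-ins invented for the example, and the second function simply spells out the HSI-style reading by hand.

```javascript
const calls = [];
function effect() { calls.push("effect"); return "from effect"; }
function simple() { return "simple"; }

// Current ASI reading: `return` ends at the line break, and the indented
// line is a separate statement that runs only when x is falsy.
function asParsedToday(x) {
  if (x) return
    effect();
  return simple();
}

// HSI-style reading, written explicitly: the indented line continues the return.
function asHsiWouldReadIt(x) {
  if (x) return effect();
  return simple();
}

console.log(asParsedToday(true), calls.length);  // undefined 0
console.log(asParsedToday(false), calls.length); // simple 1
console.log(asHsiWouldReadIt(true));             // from effect
```

Both readings parse without error, so switching rules would change behaviour at run time with no early error to flag it.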

One might drop the error-correction bits if indentation provides alternative control.

The formal grammar currently requires ; as statement terminator, so ASI is an error correction procedure (ignoring restricted productions). That is a pragmatic decision (in my code in 1995, and in the spec's choice of formalisms). It could be revisited, but the devil is in the details, and it is costly work.

In the spirit of refactoring languages in small steps, improving ASI might be more manageable than removing it (and if ASI is worth doing, it is worth doing it well).

ASI is not going to be removed. I don't know why you think it could be.

Again, we don't get standardized small steps. We need something more than user-tested single-source new parser code (a la CoffeeScript, which I admire -- just saying we can't standardize anything like its lexer/disambiguator/parser code). We need at least a "HSI for JS" spec with more details than in your message, and (especially) careful analysis of how migration would work.

I'm still skeptical this is (a) doable with only early errors when migrating; (b) worth the up-front and ongoing costs, since we will need to keep ASI spec'ed forever (for web compatibility of not-opted-into-Harmony JS).

But since you wrote a nice post and seem motivated, I do want to encourage you to work out more details. We don't need more motivation or (dubious IMHO) attitudinizing about ASI :-/.

# Garrett Smith (13 years ago)

On 4/17/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote: [TLDR] ASI is not going to be removed. I don't know why you think it could be.

Why not? If developers would stop doing that then eventually, can't it be removed?

It is not hard at all to write code that does not rely on ASI.

Existing code that relies on ASI would fail, so making ASI an error today would not be feasible because many sites would break.

It might sound naive, but can ASI be phased-out? How can we get developers to stop using ASI?

Can ASI be made to be an error for strict mode? IIRC Asen proposed that a couple of years ago. What if ASI were to trigger a warning to developers?

ES5 does not define warning. By warning, I mean an error condition that is reported by the implementation but does not trigger abrupt completion ("Deprecated production: missing semicolon, line 29"). The warning may be hidden but activated in a debugging environment (as optionally provided by the implementation).

# François REMY (13 years ago)

I’m a developer and I don’t want to phase ASI out. If you consider Visual Basic .NET, they are starting to relax line ending, and it’s really appreciated.

Typing semicolons is pointless. It doesn’t add any value to your code. It only takes time.

From: Garrett Smith Sent: Sunday, April 17, 2011 7:07 PM To: Brendan Eich Cc: es-discuss at mozilla.org Subject: Re: Automatic Semicolon Insertion: value vs cost; predictability and control; alternatives

# Dmitry A. Soshnikov (13 years ago)

On 17.04.2011 21:07, Garrett Smith wrote:

On 4/17/11, Brendan Eich<brendan at mozilla.com> wrote:

On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote: [TLDR] ASI is not going to be removed. I don't know why you think it could be.

Why not? If developers would stop doing that then eventually, can't it be removed?

It is not hard at all to write code that does not rely on ASI.

The question is not whether it's hard or not. IMO an explicit semicolon is really syntactic noise in the language. It goes back to the early era of C/Pascal, etc. Since a programmer usually puts one logical sentence per line, why would you need an additional statement/expression terminator besides the newline itself? The useless actions in this case can be compared with masochism.

So it's noise. But a completely different thing is the implementation of ASI and all the subtle/tricky/buggy cases related to it, which is exactly what causes dislike of ASI. If the ASI had been implemented perfectly

# Mike Ratcliffe (13 years ago)

An HTML attachment was scrubbed... URL: esdiscuss/attachments/20110417/a4e43632/attachment

# Garrett Smith (13 years ago)

On 4/17/11, Mike Ratcliffe <mratcliffe at mozilla.com> wrote:

I remember going over a few hundred thousand lines of JavaScript and adding semicolons because I had decided to minify it. I also remember that for months I was receiving bug reports from sections of code where I had missed the semicolons.

Now I am obsessed with adding semicolons everywhere that they should go.

ASI changes program behavior.

Worthy (IMBO) reads on Jim's fickle server: jibbering.com/faq/notes/code-guidelines/#asi, jibbering.com/faq/notes/code-guidelines/asi.html

There are plenty of beginners who haven't experienced those problems (see SO for details).

Personally I would welcome some kind of option to disable ASI with open arms. Garrett's strict mode warning idea makes sense to me but I am fairly certain that not everybody would welcome it.

IIRC, it was Asen who first proposed that idea. Couldn't find the actual post, but: dmitrysoshnikov.com/ecmascript/the-quiz/#comment-8

It would also take some evangelism to explain to beginners who don't see the problems. Again, see stackoverflow.

# Allen Wirfs-Brock (13 years ago)

On Apr 17, 2011, at 12:33 PM, Mike Ratcliffe wrote:

...

Personally I would welcome some kind of option to disable ASI with open arms. Garrett's strict mode warning idea makes sense to me but I am fairly certain that not everybody would welcome it.

I'd suggest that this isn't really a standards issue. The standard does not prevent an implementation from providing whatever sort of supplemental diagnostic output it deems appropriate. You could even issue warnings recommending another programming language if you wanted. It also doesn't block an implementation from providing a user selected mode that excludes certain standard features such as ASI, it just means that when operating in that mode the implementation isn't conforming to the standard.

To be compatible with the standard and the web, an ECMAScript implementation is still going to have to default to accepting code that depends upon ASI. However, the implementation can gripe about it all it wants on a diagnostic channel.

# Jason Orendorff (13 years ago)

On Sun, Apr 17, 2011 at 12:07 PM, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

On 4/17/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote: [TLDR] ASI is not going to be removed. I don't know why you think it could be.

Why not? If developers would stop doing that then eventually, can't it be removed?

We have a saying at Mozilla: "Don't break the web."

If a browser vendor removed ASI support from their ES engine, many existing sites would stop working. Their users would switch to other browsers.

ES5 does not define warning. By warning, I mean an error condition that is reported by the implementation but does not trigger abrupt completion ("Deprecated production: missing semicolon, line 29")

Most ES runs in browsers. If a browser shows a warning, it shows it to the end user of a web site--the wrong person.

To put this another way, if a browser started showing warnings for ASI, their users would be informed that many existing sites are deprecated productions or something. They would switch to other browsers.

# Garrett Smith (13 years ago)

On 4/17/11, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

On Apr 17, 2011, at 12:33 PM, Mike Ratcliffe wrote:

...

Personally I would welcome some kind of option to disable ASI with open arms. Garrett's strict mode warning idea makes sense to me but I am fairly certain that not everybody would welcome it.

I'd suggest that this isn't really a standards issue. The standard does not prevent an implementation from providing whatever sort of supplemental diagnostic output it deems appropriate. You could even issue warnings recommending another programming language if you wanted. It also doesn't block an implementation from providing a user selected mode that excludes certain standard features such as ASI, it just means that when operating in that mode the implementation isn't conforming to the standard.

To be compatible with the standard and the web, an ECMAScript implementation is still going to have to default to accepting code that depends upon ASI. However, the implementation can gripe about it all it wants on a diagnostic channel.

Which major browser implementations are discouraging developers from using ASI, and how effective is it?

Implementations are motivated to get scripts working and conform to specs. How could Ecma encourage developers to stop using ASI? I initially thought that standard warnings in strict mode would help.

# Mikeal Rogers (13 years ago)

An HTML attachment was scrubbed... URL: esdiscuss/attachments/20110417/5c506b13/attachment-0001

# Garrett Smith (13 years ago)

On 4/17/11, Jason Orendorff <jason.orendorff at gmail.com> wrote:

On Sun, Apr 17, 2011 at 12:07 PM, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

On 4/17/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote: [TLDR] ASI is not going to be removed. I don't know why you think it could be.

Why not? If developers would stop doing that then eventually, can't it be removed?

We have a saying at Mozilla: "Don't break the web."

If a browser vendor removed ASI support from their ES engine, many existing sites would stop working. Their users would switch to other browsers.

I wrote eventually. As naive as it may seem, it is conceivable that newly written scripts could be authored so that they don't rely on ASI ("yeah right," you say). And if that happened then the number of scripts that rely on ASI would diminish in number.

ASI is often unintentional. If I do that (and I have) I definitely want to know right away; not after I deploy my app. IntelliJ warns for this in the JS editor. A standard warning for ASI would be helpful in this way.

I know it sounds fantastical, but before saying "impossible," think about this: What would effectively discourage authors from using ASI (intentionally or otherwise)?

ES5 does not define warning. By warning, I mean an error condition that is reported by the implementation but does not trigger abrupt completion ("Deprecated production: missing semicolon, line 29")

Most ES runs in browsers. If a browser shows a warning, it shows it to the end user of a web site--the wrong person.

Why must the browser show the message to the end user? I didn't suggest that, so why are you? The implementation (script engine) can emit warnings (for Firebug, MSIE debugger, etc).

My idea for "standard warnings" comes from my contempt for things like this:

"It is recommended that ECMAScript implementations either disallow this usage of FunctionDeclaration or issue a warning when such a usage is encountered. " which mentions "warnings" but the spec doesn't say what a warning really is.

# Garrett Smith (13 years ago)

On 4/17/11, Mikeal Rogers <mikeal.rogers at gmail.com> wrote:

do modern javascript implementations actually "insert" semicolons?

Function.prototype.toString says "yes."

# Wes Garland (13 years ago)

On 17 April 2011 20:09, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

Function.prototype.toString says "yes."

That's not a really valid evaluation IMO. At least in Mozilla's case, the semicolon appears in this by virtue of the bytecode decompiler putting a semicolon at the end of every statement. The source-code-as-compiled is not actually stored anywhere.

# Oliver Hunt (13 years ago)

An implementation could add a mode (shudder) along the same lines as strict mode: "die in hell ASI, i hate you with the fiery passion of a thousand burning suns.";

And then make it a syntax error whenever ASI would occur. I have considered this in JSC (albeit with a slightly shorter opt in string).

It wouldn't have the backwards compat problems you get by "disabling ASI" as the points where ASI being removed changes behaviour would be errors :D

# Oliver Hunt (13 years ago)

Implementation specific -- JSC Function.prototype.toString returns the exact input string for the body of the function.
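A quick way to see which behaviour a given engine has. This is a sketch, not a statement about any particular engine's internals; it assumes an engine that returns the original source text, as described for JSC above, in which case no semicolon shows up.

```javascript
// The body deliberately omits the semicolon after `return 1`.
function noSemicolon() { return 1 }

var text = noSemicolon.toString();
console.log(text);
// true on engines that return the source text verbatim;
// false on engines that decompile and print a semicolon after each statement.
console.log(text.indexOf(";") === -1);
```

Either way, the output only shows how the engine prints functions, not whether its parser materializes a semicolon token.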

# Garrett Smith (13 years ago)

On 4/17/11, Wes Garland <wes at page.ca> wrote:

On 17 April 2011 20:09, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

Function.prototype.toString says "yes."

That's not a really valid evaluation IMO. At least in Mozilla's case, the semicolon appears in this by virtue of the bytecode decompiler putting a semicolon at the end of every statement. The source-code-as-compiled is not actually stored anywhere.

OK, thanks for pointing it out.

# Peter van der Zee (13 years ago)

On Mon, Apr 18, 2011 at 3:12 AM, Oliver Hunt <oliver at apple.com> wrote:

An implementation could add a mode (shudder) along the same lines as strict mode: "die in hell ASI, i hate you with the fiery passion of a thousand burning suns.";

And then make it a syntax error whenever ASI would occur. I have considered this in JSC (albeit with a slightly shorter opt in string).

It wouldn't have the backwards compat problems you get by "disabling ASI" as the points where ASI being removed changes behaviour would be errors :D

All things considered, another option for vendors is simply adding a developer setting in their browser that enables warnings (or errors) for ASI in the console. That would help a lot of current generation developers. Of course, this wouldn't fix anything for non-browsers (like node). So for them a directive would be nice, even if it was just to enable warnings while debugging.

# Claus Reinke (13 years ago)

  • there is no rule-of-thumb understanding (programmers have to look up or memorize all restricted productions,

Here is a quibble: there is a rule, or set of rules .. Not quite as bad as you wrote.

Sorry, I didn't notice that my attempted summary could be read as dramatizing the issue!-) The issue is overheated enough, my intention was merely to be concise in giving an impression of where I see the issues.

The bigger problem is not the rule-space but the mixed significance and insignificance of line terminators. .. .. The line terminator having selective meaning due to ASI as an error correction procedure, and of course in restricted productions, creates an expectation that line terminators matter in general.

However, going down that road leads to CoffeeScript, or (somewhat more conservatively due to the use of : at end of head forms, and a lot older) Python. It's a steep slippery slope.

I was trying to point out that CoffeeScript, Python, Ruby, .. all have subtly different systems, which in turn differ from Haskell's system, and others in that space. It is not useful to throw them into a single pigeon hole. The remainder of your reply shows that you are aware of that, so I am surprised that you start with this misleading statement!-)

It may be useful to look at the individual features of these different systems, not to make Python/CoffeeScript/Haskell coders feel at home, but to select a combination that fits Javascript, and Javascript coders' expectations. And finally stops getting in the way of Javascript language designers.

Good points. The keystroke tax with ; and {} in JS is an issue. It is a tax at the margin on all effort creating and maintaining source, though, so I expect over time lower-tax syntaxes to win. The trick is migrating JS code into a new standardized edition without creating new runtime errors by failing to catch migration errors.

Understood.

Reformulating the spec could be done but we would need to keep the "ES5" or "classic" ASI spec around. Spec complexity and the opportunity cost of the work to increase it in this area will hurt -- probably a lot.

Anyway, we'd need a more complete strawman spec to evaluate, to get further.

Since this is likely material for ES/next/next, and the proposal deadline for ES/next seems to be near, I'll focus on the callback nesting issue first.

However, it would be useful to have a Javascript implementation of a Javascript parser and unparser (precisely reproducing the source). Then one could prototype such syntax modifications as source-to-source translators, perhaps even testing whether a rewrite changes parse-followed-by-unparse results.

There are lots of parsers, and some pretty-printers, but I'd like it to be in Javascript (rather than in Haskell;-), so that some of you are likely to use it, and it needs to work for me on Windows. Or would you be happy if I used one of the Haskell libs for JS?-)

And blog posts seem to be more about trouble with ASI than about usefulness of ASI [2,4,5].

Beware negativity and confirmation biases here.

If I did my references correctly, those posts were explanatory, in the direction: "you cannot avoid ASI, so you better understand how it works".

ASI is relied on by tons of content, without complaints (or praise). It goes without notice when it works, which is often when a ; was left off where the formal grammar requires it.

Very likely. Coders may not even be aware when they rely on ASI.

So ASI as it stands in Javascript now does not only make life harder for programmers but for spec writers (and readers), too.

The spec didn't change, so we are riding on that sunk cost.

I wasn't referring to the ASI spec. When browsing the list archives, I noticed that discussions of syntax tend to drift into parser issues, and sometimes, when contributors have almost settled on a suggestion, out of nowhere comes an ASI issue ("if we split this over two lines, it is going to be ambiguous"). Or simple operator proposals such as modulo operators have to consider restricted productions.

My impression was that ASI concerns are hampering language development, because ASI doesn't operate separately - it is entwined with the grammar.

It's easy to exaggerate here, but it seems to me the big deal is not ASI costs already sunk. Rather it is how to lighten the syntax and make ASI more usable, in a new edition.

Agreed. I just wanted to point out that ASI costs are ongoing.

What we could use is some validated alternative that has no runtime migration error gotchas.

But even if the alternative has a nice specification, how to validate that it doesn't ruin people's coding patterns without knowing what those coding patterns are?

HSI would need some tweaking to be suitable for Javascript coding styles (though some tweaks could be copied from Haskell, I just omitted them to bring out the core ideas).

I think your effort developing a draft spec would be great.

Good to know. No promises, though!-)

Still, such a variant might work better and be easier to understand than the current ASI. Equally important, a transition might be doable as incremental improvements rather than a radically different system.

While new Harmony proposals will be prototyped and even shipped before the next edition, it takes years to do a new edition. So we do not have the luxury of many standardized incremental improvements.

I was thinking about non-standard increments, just to see how any theoretically nice changes hold up in practice, before standardizing the successful candidates. Like expression closure and let-expressions, the changes would have to be marked as experimental/call-for-feedback only.

A pragma that lets developers trace when ASI kicks in might raise awareness and understanding of ASI. Then coders might be able to articulate what aspects of ASI they rely on, and what aspects might be open to change.

A standalone tool that transforms Javascript source by applying ASI, leaving everything else unchanged, would also help. Developers could look at diffs, and compare them to their expectations.

For instance, one could weaken the no-line-break-here token to consider line-break plus indentation.

We tried this back at the July 2008 "Harmony" (Oslo) meeting. .. However, we did not want to require the kind of analysis that would be needed to distinguish .. [return <break><space> string] [return <break><space> expression]

In this case it is not safe to assume (per HSI) that the line after the return is a continuation of the return statement.

There is mis-indented JS on the web, including of this form, and what such code means (whatever was intended) is now a compatibility constraint.

Yes. My suggestion was to take the assumptions out of the game - no complicated analysis, just indentation. But one would have to deprecate first, have tools warn developers if they use coding patterns that are going to break, then offer transition help. A longish process..
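To make the "just indentation" idea concrete, here is a minimal sketch of such a rule. It is only an illustration under loud assumptions: the function name groupStatements and the rule itself (a line indented deeper than the first line of the current statement continues it) are hypothetical, it ignores blocks, tabs, strings and comments, and it is not the HSI proposal or anything from a spec.

```javascript
// Hypothetical rule (an assumption, not part of any spec): a line indented
// deeper than the first line of the current statement continues it;
// any other non-blank line starts a new statement.
function groupStatements(source) {
  var statements = []
  var current = null
  var baseIndent = 0
  source.split("\n").forEach(function (line) {
    if (line.trim() === "") return
    var indent = line.length - line.trimStart().length
    if (current !== null && indent > baseIndent) {
      // Deeper indentation: treat the line as a continuation.
      current.push(line.trim())
    } else {
      // Same or shallower indentation: start a new statement.
      current = [line.trim()]
      baseIndent = indent
      statements.push(current)
    }
  })
  return statements.map(function (parts) { return parts.join(" ") })
}

var source = [
  "a = b",
  "  + c",
  "return",
  "  longExpression()",
  "d = e"
].join("\n")

console.log(groupStatements(source))
```

Under this sketch the indented line after return is joined to the return, which is exactly where it departs from today's ASI and why mis-indented code already on the web is a migration concern.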

Again, we don't get standardized small steps. We need something more than user-tested single-source new parser code (a la CoffeeScript, which I admire -- just saying we can't standardize anything like its lexer/disambiguator/parser code). We need at least a "HSI for JS" spec with more details than in your message, and (especially) careful analysis of how migration would work.

I'm still skeptical this is (a) doable with only early errors when migrating; (b) worth the up-front and ongoing costs, since we will need to keep ASI spec'ed forever (for web compatibility of not-opted-into-Harmony JS).

But since you wrote a nice post and seem motivated, I do want to encourage you to work out more details.

Thanks for the useful explanations. It might be worth having them on the wiki, as an ASI-non-proposal with rationale.

Claus

# Jorge (13 years ago)

On 18/04/2011, at 09:52, Peter van der Zee wrote:

On Mon, Apr 18, 2011 at 3:12 AM, Oliver Hunt <oliver at apple.com> wrote:

An implementation could add a mode (shudder) along the same lines as strict mode: "die in hell ASI, i hate you with the fiery passion of a thousand burning suns.";

And then make it a syntax error whenever ASI would occur. I have considered this in JSC (albeit with a slightly shorter opt in string).

It wouldn't have the backwards compat problems you get by "disabling ASI" as the points where ASI being removed changes behaviour would be errors :D

All things considered, another option for vendors is simply adding a developer setting in their browser that enables warnings (or errors) for ASI in the console. That would help a lot of current generation developers. Of course, this wouldn't fix anything for non-browsers (like node). So for them a directive would be nice, even if it was just to enable warnings while debugging.

But there's a lot of code like this:

    a= b
    c= d
    function e () { f() }

that (ISTM) only works thanks to A(utomatic)S(emicolon)I(nsertion):

    a= b;
    c= d;
    function e () { f(); }

so you won't want it to "die in hell" / issue any warnings / throw syntax errors...

What am I missing ?

# Peter van der Zee (13 years ago)

On Mon, Apr 18, 2011 at 12:34 PM, Jorge <jorge at jorgechamorro.com> wrote:

What am I missing ?

As far as the directive goes, they are opt-in. Old code won't be opting in. Other than that they have the same issues as "use strict" might have.
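The opt-in mechanism can be sketched with today's directive prologues. A string-literal statement at the top of a function that the engine does not recognize is simply ignored, which is why old code would be unaffected. Note that the "no asi" string below is hypothetical (no engine implements it, and the name is only borrowed from this discussion); "use strict" is the only real directive shown.

```javascript
// "use strict" is a real directive: assigning to an undeclared name throws.
function strictFn() {
  "use strict";
  try {
    undeclaredName = 1;
    return "no error";
  } catch (e) {
    return e.name;
  }
}

// "no asi" is a hypothetical directive (an assumption for illustration).
// Engines that do not know it treat it as a harmless expression statement,
// so the semicolon-free body below still runs thanks to ASI.
function optedIn() {
  "no asi";
  var a = 1
  var b = 2
  return a + b
}

console.log(strictFn()); // ReferenceError
console.log(optedIn());  // 3
```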

# Jorge (13 years ago)

On 18/04/2011, at 13:10, Peter van der Zee wrote:

On Mon, Apr 18, 2011 at 12:34 PM, Jorge <jorge at jorgechamorro.com> wrote: What am I missing ?

As far as the directive goes, they are opt-in. Old code won't be opting in. Other than that they have the same issues as "use strict" might have.

But why would anyone want to opt in to get warnings, or even worse syntax errors, for code like this that depends on ASI, where ASI is helping out:

    "die in hell ASI, i hate you with the fiery passion of a thousand burning suns.";
    // ^^^^^^^^^ the anti-ASI directive ^^^^^^^^^^

    a= b
    c= d
    function e () { f() }

    *** Warning missing semicolon @ line #4,5
    *** Warning missing semicolon @ line #5,5
    *** Warning missing semicolon @ line #6,19

Or even worse, halt the program with a:

Syntax error missing semicolon @ line #4,5

?

I understand that it would be quite interesting to get a warning/error in this case:

    a= b
    (c= d)();

...only that there's no ASI in this case !

# xcv3000 (13 years ago)

Just to add some out-of-the-browser perspective to the discussion, and since Node.js was mentioned by name in this thread, I believe it is important to note that the package manager for Node.js absolutely depends on the automatic semicolon insertion working just the way it works today.

The only places where semicolons are ever used in the Node.js package manager are in the 'for' loops headers and at the beginning of the lines that would be interpreted incorrectly because of the lack of the semicolon at the end of the previous line - for example see: isaacs/npm/blob/master/lib/utils/read-json.js#L320-L336

There are no semicolons at the end of lines at all.

Also the commas in multi-line arrays and parameter lists are always written at the beginning of lines so Node also depends on the ASI not inserting semicolons if the next line starts with a comma or other operator.
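For readers who have not seen this style, the following is a small sketch of it. The variable names are made up for illustration and the snippet is not taken from the npm source.

```javascript
// Leading-comma style: a line that starts with "," cannot begin a new
// statement, so no semicolon is inserted before it.
var deps = [ "dependencies"
           , "devDependencies"
           ]

var seen = []

// A line starting with "[" would otherwise continue the previous statement,
// so this style puts a defensive semicolon at the start of the line.
;[1, 2, 3].forEach(function (n) {
  seen.push(n * 2)
})

console.log(deps.length, seen)
```

The style relies on both halves of the current rules: ASI ending statements at most newlines, and ASI staying out of the way when the next line begins with a comma or operator.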

I'm sure that I'm not the only developer that tries to follow the best practices taken from the Node community so there will be more and more new JavaScript code being written without semicolons and so in my opinion it cannot be considered as a problem of old and legacy code only.

Just to give some perspective - I am not trying to advocate using, not using, or changing the ASI. In fact I was writing JavaScript since the late nineties always using explicit semicolons until I found Joose, a modern object framework for JavaScript that recommends not using semicolons so I followed the recommended style with my best intentions. The style recommended by the Node.js package manager only reinforced my belief that this was in fact the modern way of writing JavaScript, not the legacy way. Now I see plans to deprecate or remove the ASI just like if it was used in old code only, while in my experience it's the opposite - old code uses semicolons and new code doesn't - Joose and Node.js all being very modern.

With all due respect, please consider any changes to the language from the perspective of modern frameworks like Node, Joose and I'm sure many others. Or is the Node community wrong in recommending the use of the ASI because it is or will be or might ever be deprecated or changed?

And since I am quite confused - should I stop omitting semicolons, despite the style used by Node and Joose? What is the safest way of writing code if I want to make my code work in the strict mode of new versions of ECMAScript in the future? What if I want to contribute back to the Joose and Node.js community?

My best to everyone involved in the evolution of ECMAScript.

Thanks, -XCV.

# Claus Reinke (13 years ago)

The only places where semicolons are ever used in the Node.js package manager are in the 'for' loops headers and at the beginning of the lines that would be interpreted incorrectly because of the lack of the semicolon at the end of the previous line - for example see: isaacs/npm/blob/master/lib/utils/read-json.js#L320-L336

There are no semicolons at the end of lines at all.

Also the commas in multi-line arrays and parameter lists are always written at the beginning of lines so Node also depends on the ASI not inserting semicolons if the next line starts with a comma or other operator.

Thanks. It is useful to have such concrete examples.

And since I am quite confused - should I stop omitting semicolons, despite the style used by Node and Joose? What is the safest way of writing code if I want to make my code work in the strict mode of new versions of ECMAScript in the future? What if I want to contribute back to the Joose and Node.js community?

Just an explanatory note from me, since I started this thread:

es-discuss is part of the public-facing interface to the Ecma TC39 committee. The committee "solicits continuing feedback from the community during the specification of ECMAScript":

http://www.ecmascript.org/community.php

So you're going to see members of the community (like myself) making suggestions or raising questions. Members of the TC39 committee (like Brendan) also do some of their discussion here, and respond to questions and suggestions (which is great, btw!-).

However, the detailed discussions and decisions happen elsewhere. You can see them taking shape on the Ecmascript wiki

http://wiki.ecmascript.org/doku.php

in the form of strawmen and proposals, and when those move into the Harmony namespace

http://wiki.ecmascript.org/doku.php?id=harmony:harmony

they are considered 'tentatively approved for the "ES-Harmony" language', which is the next upcoming revision of ES.

In other words, no matter what options are discussed here, until it appears on the wiki or is taken on by a committee member, it isn't even officially under consideration.

[aside: it would be nice to know who the committee members are, and what the process is for starting an official proposal]

For the specific case of ASI, Brendan has explained the issues

esdiscuss/2011-April/013794

In brief, there is no way simply to take away ASI, and any attempt to introduce a less troublesome variant of ASI will have to offer a way to deal with existing code.

'legacy' code here refers not to ancient, badly written code but simply to code using the current ASI, which is a legacy from the point of view of any revised ASI.

ASI reform would not have the intention to force coders to add semicolons everywhere, it would "merely" try to make the rules of syntax easier to understand for coders and less troublesome for language designers. The goal would be to avoid unnecessary syntax while also avoiding unnecessary worrying about how the mechanism works.

As a coder, you really don't want to add semicolons to avoid ASI traps, as in line 232 of your example

;["dependencies", "devDependencies"].forEach(function (d) {

As a language designer, there are more than enough issues to resolve, without having to watch your back wondering how ASI will interact with new grammar.

Hope that clears up things a little? Claus

(just another JS coder with language design and tool building interests) libraryinstitute.wordpress.com/about

# François REMY (13 years ago)

Good minifiers will not have a problem to perform ASI themselves.

Browsers are not intended to check the code that is sent to them. They should try to compile and run the code as fast as possible, and not try to do code analytics for the developer. If you want to perform code analytics, use an external compiler (or IDE).

From: Mike Ratcliffe
Sent: Monday, April 18, 2011 4:37 PM
To: es-discuss at mozilla.org
Subject: Re: Automatic Semicolon Insertion: value vs cost; predictability and control; alternatives

Jorge, I would opt in for warnings e.g. if I planned on minifying my web app in the future. Most web apps will burn in hell if they are missing semicolons when you minify them.

On 04/18/2011 02:42 PM, Jorge wrote:

On 18/04/2011, at 13:10, Peter van der Zee wrote:

On Mon, Apr 18, 2011 at 12:34 PM, Jorge <jorge at jorgechamorro.com> wrote:

  What am I missing ?

As far as the directive goes, they are opt-in. Old code won't be opting in. Other than that they have the same issues as "use strict" might have.

But why would anyone want to opt in to get warnings, or even worse syntax errors, for code like this that depends on ASI, where ASI is helping out:

    "die in hell ASI, i hate you with the fiery passion of a thousand burning suns.";
    // ^^^^^^^^^ the anti-ASI directive ^^^^^^^^^^

    a= b
    c= d
    function e () { f() }

    *** Warning missing semicolon @ line #4,5
    *** Warning missing semicolon @ line #5,5
    *** Warning missing semicolon @ line #6,19

Or even worse, halt the program with a:

Syntax error missing semicolon @ line #4,5

?

I understand that it would be quite interesting to get a warning/error in this case:

    a= b
    (c= d)();

...only that there's no ASI in this case !

# Lasse Reichstein (13 years ago)

On Mon, 18 Apr 2011 14:42:21 +0200, Jorge <jorge at jorgechamorro.com> wrote:

I understand that it would be quite interesting to get a warning/error in this case:

    a= b
    (c= d)();

...only that there's no ASI in this case !

... and that's actually a very relevant point. Errors with ASI come (in my experience) exclusively in cases where there is a newline that was intended to end the statement, but where it actually didn't - i.e., where ASI does not insert a semicolon.

No amount of warning about automatically inserted semicolons will help these cases, which are indistinguishable from correctly wrapped code.

The only thing that will help is for the writer to not rely on ASI, and therefore treat every line not ending in a semicolon as a wrapped statement. If that wouldn't be correct, insert the semicolon manually.
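A runnable sketch of the case under discussion may help. The definitions of b, c and d are invented here so that the accidental call chain completes instead of throwing; only the two-line a= b / (c= d)() shape comes from the thread.

```javascript
var calls = []

// b returns a function so that the accidental call chain can run to completion.
function b(arg) {
  calls.push("b called with " + arg)
  return function () { return "result of the chained call" }
}

var a, c
var d = 42

// The newline after "b" does not end the statement, because "b (c= d)()"
// is valid syntax. The two lines parse as: a = b(c = d)();
a= b
(c= d)();

console.log(a)
console.log(calls)
```

No semicolon is inserted anywhere in those two lines, so a tool that only reports inserted semicolons has nothing to report.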

# Brendan Eich (13 years ago)

On Apr 17, 2011, at 6:07 PM, Garrett Smith wrote:

On 4/17/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 17, 2011, at 10:52 AM, Claus Reinke wrote: [TLDR] ASI is not going to be removed. I don't know why you think it could be.

Why not? If developers would stop doing that then eventually, can't it be removed?

Define "eventually". We can't remove 'with' from ES5 + reality JS on any definite schedule. "Developers" are not a unitary audience with the same time horizons. Much JS on the web is unmaintained. It may or may not be replaced for years or decades.

This is all kind of beside the point. We cannot remove 'with' or ASI from ES5 + reality, not-opted-into-Harmony JS as specified by ECMA-262 in any edition in sight.

Let's rediscuss when we can. Until then it's a waste of time.

This is not to say it can't or won't happen. It may but it could take longer than you care to wait.

# Brendan Eich (13 years ago)

On Apr 17, 2011, at 6:44 PM, Dmitry A. Soshnikov wrote:

The question is not whether it's hard or not. IMO an explicit semicolon is really syntactic noise in the language. It goes back to the early era of C/Pascal, etc.

No, much earlier. Algol 60 used semicolons as separators (not terminators), IIRC.

IMO, the semicolon should be inserted only when there is ambiguous syntactic construction, or, e.g. when you want to put more than one sentence in one line.

foo() bar()

is not subject to ASI right now. It is plausibly a deletion bug of some kind. In my experience it would be a mistake to error-correct it without the hint of bar() coming on a later line from foo(), which is what ES1-5.1 all specify.
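The difference can be checked without running the statements, by asking the engine only to parse them. The helper name parses is made up for this sketch; new Function is used purely as a convenient way to trigger parsing.

```javascript
// Returns true if the source parses as a function body, false on a SyntaxError.
// new Function only compiles the code; it does not run it.
function parses(source) {
  try {
    new Function(source)
    return true
  } catch (e) {
    if (e instanceof SyntaxError) return false
    throw e
  }
}

console.log(parses("foo() bar()"))   // false: no line terminator, so no ASI
console.log(parses("foo()\nbar()"))  // true: the newline lets ASI insert ";"
console.log(parses("foo(); bar()"))  // true: explicit separator
```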

Changing a future edition to allow this, i.e., to insert ; as an error correction, could be done. It would be a relaxation of the syntax rules. But I do not think it's a good idea, and it does not help to address the "negative space" effect ASI creates, where users expect newline to be significant. It doesn't address the restricted production problem either. The primary use-case there is

    return
    am_i_overlong_and_the_return_value_or_a_later_expr_stmt();

# Brendan Eich (13 years ago)

On Apr 17, 2011, at 8:33 PM, Mike Ratcliffe wrote:

I remember going over a few hundred thousand lines of JavaScript and adding semicolons because I had decided to minify it. I also remember that for months I was receiving bug reports from sections of code where I had missed the semicolons.

Adding semicolons is a job for a reliable tool, no human in the loop required.

Yes, this means minifiers must parse. There are a number of JS parsers in JS out there. Do popular minifiers still not parse and insert semicolons (and remove newlines) as needed?

# Brendan Eich (13 years ago)

See www.mail-archive.com/es-discuss@mozilla.org/msg05609.html and earlier posts in that thread, for where

no asi;

as a Harmony pragma was tossed out as possible syntax.

The agreement we seemed to reach was simply to have a way for programmers to disable ASI, not try a complex new-ASI plus "voting" (new and old must agree for a Harmony program to pass).

# Brendan Eich (13 years ago)

On Apr 18, 2011, at 12:24 AM, Garrett Smith wrote:

Implementations are motivated to get scripts working and conform to specs. How could Ecma encourage developers to stop using ASI? I initially thought that standard warnings in strict mode would help.

No. My earlier reply to your previous post pointed out how web content is unmaintained for stretches, or ages.

Jason also pointed out that warnings nag the user, not the developer -- blaming the wrong party does not help and it can hurt.

We had a "strict option" (SpiderMonkey's warning-enabling mode that predated ES5 strict by almost a decade) to warn about 'with' usage. We got only grief for it. We removed it (this is the only difference from our strict-warnings option and ES5 strict now, IIRC).

On the web, besides not breaking the web, and not going out of business as a minority-share browser vendor by tilting at windmills, it is crucial to avoid blaming or nagging the user.

# P T Withington (13 years ago)

On 2011-04-18, at 13:48, Brendan Eich wrote:

Do popular minifiers still not parse and insert semicolons (and remove newlines) as needed?

Only the broken ones! :)

# Brendan Eich (13 years ago)

On Apr 18, 2011, at 5:09 PM, Lasse Reichstein wrote:

On Mon, 18 Apr 2011 14:42:21 +0200, Jorge <jorge at jorgechamorro.com> wrote:

I understand that it would be quite interesting to get a warning/error in this case:

    a= b
    (c= d)();

...only that there's no ASI in this case !

... and that's actually a very relevant point. Errors with ASI come (in my experience) exclusively in cases where there is a newline that was intended to end the statement, but where it actually didn't - i.e., where ASI does not insert a semicolon.

No amount of warning about automatically inserted semicolons will help these cases, which are indistinguishable from correctly wrapped code.

Excellent point, which I keep making and which keeps being missed.

The problem is lack of ASI where people expect newline significance. It is not where ASI kicks in (in general).

A lesser problem is where a restricted production bites, notably return\n long_value_here(). I once got mail from an outraged jwz about this. I just mailed back "parenthesize with the open paren on the same line as return" but I knew that was not going to diminish jwz's wrath.
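A short sketch of the restricted-production case and of the parenthesization workaround mentioned above. The function names are placeholders chosen for the example.

```javascript
function computeValue() {
  return 42
}

// "return" is a restricted production: a line terminator right after it
// ends the statement, so the next line is a separate (unreachable) statement.
function broken() {
  return
  computeValue()
}

// Opening the parenthesis on the same line as "return" keeps the expression
// part of the return statement.
function fixed() {
  return (
    computeValue()
  )
}

console.log(broken()) // undefined
console.log(fixed())  // 42
```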

Given the primary problem is not ASI but its absence where users expect it due to mistakenly believing a newline is significant, one could argue the fix is not to ban ASI and tax everyone with writing lots of insignificant semicolons (in some opt-in mode hardly anyone would use, which would only crud up implementations' parser state spaces).

One could argue instead that we need more newline significance.

With paren-free I avoided going this direction. I don't believe JS should have too many syntaxes in its standard, or too much variation. Losing head parens per

strawman:paren_free

is a relaxation of the current grammar that (I believe) won't cost much in real parser implementations (it was easy in Narcissus, modulo the fact [see above parenthetical] that the --paren-free option makes a new mode that requires testing both ways, something I failed to do several times!).

Anyway, thanks for reiterating this point. ASI can be blamed for some things, including luring users into expecting newlines to be significant outside of restricted productions and error correction scenarios.

But other users want significant newlines anyway, and the "negative space" problem could be addressed by thinking about newline significance more, not just retrenching to "no asi" and typing ; all the time.

# Brendan Eich (13 years ago)

On Apr 18, 2011, at 1:45 AM, Claus Reinke wrote:

The bigger problem is not the rule-space but the mixed significance and insignificance of line terminators. .. .. The line terminator having selective meaning due to ASI as an error correction procedure, and of course in restricted productions, creates an expectation that line terminators matter in general. However, going down that road leads to CoffeeScript, or (somewhat more conservatively due to the use of : at end of head forms, and a lot older) Python. It's a steep slippery slope.

I was trying to point out that CoffeeScript, Python, Ruby, .. all have subtly different systems, which in turn differ from Haskell's system, and others in that space.

No, your original post mentioned none of CoffeeScript, Python, or Ruby. You mentioned only Haskell.

It is not useful to throw them into a single pigeon hole.

I certainly did not throw Haskell into a single pigeonhole with CoffeeScript or Python. And my reply does not contain the word "Ruby" at all!

I do group CoffeeScript and Python together, while acknowledging their differences (please, no pedantic nit-picking on this point). They both have in common indentation-based block (or similar control or body) structure, with significant newlines.

The remainder of your reply shows that you are aware of that, so I am surprised that you start with this misleading statement!-)

What was misleading?

You injected Ruby and implied I put Haskell in the same category as CoffeeScript, Python and Ruby, cited above. It seems to me the shoe is on the other foot. Grump!

However, it would be useful to have a Javascript implementation of a Javascript parser and unparser (precisely reproducing the source). Then one could prototype such syntax modifications as source-to-source translators, perhaps even testing whether a rewrite changes parse-followed-by-unparse results. There are lots of parsers, and some pretty-printers, but I'd like it to be in Javascript (rather than in Haskell;-), so that some of you are likely to use it, and it needs to work for me on Windows. Or would you be happy if I used one of the Haskell libs for JS?-)

Narcissus is actively developed, Tachyon is another metacircular VM (we are joining forces between these two), there are more than a few others out there.

But they all lack the bottom-up grammar validation (modulo classic-ASI) that we need to be sure we're not just having fun hacking, with insufficient tests.

And blog posts seem to be more about trouble with ASI than about usefulness of ASI [2,4,5]. Beware negativity and confirmation biases here.

If I did my references correctly, those posts were explanatory, in the direction: "you cannot avoid ASI, so you better understand how it works".

That's fine, but this "you better understand [ASI]" conclusion is a change from what you wrote previously. Your entire paragraph was:

"It is interesting that even the ES5 spec has no convincing ASI examples, only clarifying examples (7.9.2). And blog posts seem to be more about trouble with ASI than about usefulness of ASI [2,4,5]. So ASI as it stands in Javascript now does not only make life harder for programmers but for spec writers (and readers), too."

and I replied objecting to the conclusion from negatively biased blog posts that ASI makes life harder. It also makes life easier but there are no blog posts to cite.

ASI is relied on by tons of content, without complaints (or praise). It goes without notice when it works, which is often when a ; was left off where the formal grammar requires it.

Very likely. Coders may not even be aware when they rely on ASI.

My point, which seems to override your "ASI as it stands ... [makes] life harder for programmers [and] spec writers".

ASI makes some things easier (reduces costs of constructing JS programs quickly, or in a preferred ;-free style) and it makes other things harder. We can agree on this at least.

Is ASI a net loss across history and all developers? I doubt it, but it would be great to improve -- I have no doubt that it is not the "last, best" solution of its kind.

Still, ASI will be hard to get rid of, and a new system would need to interoperate within the spec and implementations, and in developers' minds.

I wasn't referring to the ASI spec. When browsing the list archives, I noticed that discussions of syntax tend to drift into parser issues, and sometimes, when contributors have almost settled on a suggestion, out of nowhere comes an ASI issue ("if we split this over two lines, it is going to be ambiguous").

Yes, that's true.

Or simple operator proposals such as modulo operators have to consider restricted productions.

Indeed, although it's questionable how many identifier-named infix operators we want to add.

My impression was that ASI concerns are hampering language development, because ASI doesn't operate separately - it is entwined with the grammar.

Language history always hampers development. ASI will die hard. I'm not sure more generalizing will help us get to "the next thing".

What we could use is some validated alternative that has no runtime migration error gotchas.

But even if the alternative has a nice specification, how to validate that it doesn't ruin people's coding patterns without knowing what those coding patterns are?

Using formal methods on the grammar including restricted productions, and ASI as an error correction algorithm.

In the end, TC39 (I'm thinking of Waldemar) will require that anyway.

I was thinking about non-standard increments

Ok, I agree that could work although convincing implementors to add pragmas, then convincing developers to use them, is hard. And without a formal approach we won't get to standardization, so I'd start with formal methods first.

Yes. My suggestion was to take the assumptions out of the game - no complicated analysis, just indentation. But one would have to deprecate first, have tools warn developers if they use coding patterns that are going to break, then offer transition help. A longish process..

And here, I think standardization would be required to get the bandwagon effects.

But since you wrote a nice post and seem motivated, I do want to encourage you to work out more details.

Thanks for the useful explanations. It might be worth having them on the wiki, as an ASI-non-proposal with rationale.

I'm a bit grumpy about the earlier pigeonhole accusations and Ruby injection, but let us hope we can get past that.

Really, though, the wiki doesn't need more mooting and what-ifs and "how Haskell does it" summaries. Let's try to formalize here, at least a bit, if we can. Otherwise we aren't going to get far.

# Isaac Schlueter (13 years ago)

On Mon, Apr 18, 2011 at 05:42, Jorge <jorge at jorgechamorro.com> wrote:

I understand that it would be quite interesting to get a warning/error in this case:

    a= b
    (c= d)();

...only that there's no ASI in this case !

Jorge touches on the reason why the whole debate about ASI is a bit misguided, in my opinion.

The "} or \n ended a statement" ASI doesn't usually bite, except in the case of restricted productions, where it looks like it wouldn't end.

    return
    { a : 1 };

This is a situation that's extremely easy to avoid with a simple and easily lint-able rule: "return\n<expression> is an error".

However, is it reasonable to strike this ASI with the same hammer?

    b.on("click", function () {
      alert("hi!")
    });

I don't think so.  No reasonable JavaScripter would be surprised that the alert line ended at the }.

Furthermore, it is not the existence of ASI, but rather the lack of it that causes problems even more frequently than the restricted productions issue.

    var a = 1;
    var b = 2;
    var c = { foo : "bar" }
    [a, b].forEach(alert); // cannot call method 'forEach' of undefined

A "disable ASI" pragma will not catch this error.  No ASI occurred!
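The failure can be reproduced in a few lines. In this sketch alert is replaced by a callback that records its argument so the code runs outside a browser, and the function names are invented; the second function adds the leading-semicolon guard for contrast.

```javascript
// Without a guard, the last two lines parse as one statement:
//   var c = { foo : "bar" }[a, b].forEach(...)
// The comma expression [a, b] evaluates to b, so the lookup is obj[2],
// which is undefined, and calling .forEach on it throws a TypeError.
function withoutGuard() {
  var seen = []
  var a = 1
  var b = 2
  var c = { foo : "bar" }
  [a, b].forEach(function (n) { seen.push(n) })
  return seen
}

// A leading semicolon ends the previous statement explicitly.
function withGuard() {
  var seen = []
  var a = 1
  var b = 2
  var c = { foo : "bar" }
  ;[a, b].forEach(function (n) { seen.push(n) })
  return seen
}

var outcome
try {
  outcome = withoutGuard()
} catch (e) {
  outcome = e.name
}

console.log(outcome)      // TypeError
console.log(withGuard())  // [ 1, 2 ]
```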

In my years writing JavaScript, I can count on one finger the number of times that restricted production ASI has bitten me.  I immediately abandoned the Allman style in favor of a BSD-KNF brace-goes-with-the-start-thing style, and all was well with the world.  This no longer looked strange to me:

    return {
      foo : "bar"
    };

I can also count on one finger the number of restricted productions where this is an issue: throw will issue a syntax error, since the expression is not optional, and named continue/break are very rare.

However, the "non-ASI" error above (where there is a \n followed by a [, (, +, etc) bites a lot, and is almost impossible to see when parsing the code with human eyes and brains.

The proliferation of ; is not a keyboard-tax.  It is an eyeball/brain-tax.  "Why complain about semicolons?  You don't even see them."  That's why.  Because I want to see the relevant parts of my code.  If you put them on every line, and worse, at the jagged-right edge of the line,

The options are to either:

  • use a lint step in your development process to catch these issues (and it's gonna have to be a pretty clever linter to know you didn't mean to do that)
  • adopt a style where such things jump out at you because they look wrong (as I have done with npm's leading semicolon/comma style), or
  • Be Very Careful about using array literals and parenthesized constructions.

I find linters to be somewhat unpleasant housepets, and thus do not keep them in my own home, though I of course respect their place when contributing to others' projects, and I have found that Being Very Careful is not sustainable on teams of 1 or more humans.
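For readers unfamiliar with the second option, here is a rough sketch of the leading-semicolon style. The variable names are invented for the example and it is not taken from npm's source.

```javascript
var seen = []
var c = { foo: "bar" }
;[1, 2].forEach(function (n) { seen.push(n) })   // leading ";" stops the "[" from continuing the line above
;(function () { seen.push("iife") })()           // same guard for a line starting with "("
```

Without the leading semicolons, the first guarded line would be parsed as a property access on the object literal and the second as a call.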

Any approach that does not handle the "non-ASI error" case is not a solution.

Every time a warning prints about something that was intended, the value of warnings is reduced.  That's a bad idea.  In the onclick ASI example above, printing a warning about the ASI in that code would be worse than useless.

If we are going to build a less wicked language, and the idea of a warning-generating pragma seems wise, then we ought to make every warning as relevant as possible, and not just support backwards-compatibility, but also do our best to support the intent of those using the language.  Here's a few situations where (in my opinion) warnings would be useful:

  1. Restricted production followed by \n and an expression which is not a method or function call, assignment, or function declaration.

    // these would warn:
    return
    { a: b }
    /////
    return
        10 + 5

  2. Expression followed by \n and a +, (, [, -, *, / at the same indention level.

    // this would warn:
    x = y
    (c + d).print()
    // this would not:
    chain
      ( [some, long, thing]
      , [the, next, chain, link]
      , cb )

# Isaac Schlueter (13 years ago)

On Mon, Apr 18, 2011 at 11:05, Brendan Eich <brendan at mozilla.com> wrote:

Given the primary problem is not ASI but its absence where users expect it due to mistakenly believing a newline is significant, one could argue the fix is not to ban ASI and tax everyone with writing lots of insignificant semicolons (in some opt-in mode hardly anyone would use, which would only crud up implementations' parser state spaces).

One could argue instead that we need more newline significance.

Yes.  This is the sanest thing I've read in this thread.

How about this?

"use superasi"

Result:

  1. /\n\s*[[(+*/-]/ is a syntax error.  (Or should it silently do ASI here?  Not sure.)
  2. /;\s+\n/ is a syntax error. (No extraneous semicolons.)

This would enforce proper use of ASI, and turn off the problems where statements are confusingly not ended by \n.

(Starting a line with a "." should still be allowed, since that is not otherwise a valid construction, and thus not easily confused.)
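To make the two rules concrete, here is a naive sketch of a checker. `checkSuperAsi` is a hypothetical name, no such pragma or tool exists, and a real implementation would work on tokens rather than raw text.

```javascript
// Hypothetical checker for the proposed "use superasi" rules.
// The two patterns are the ones given above, with "[" and "/" escaped
// inside the character class. This is a naive text scan: it does not
// skip string literals, comments, or regular expression literals.
var LEADING_CONTINUATION = /\n\s*[\[(+*\/-]/
var TRAILING_SEMICOLON = /;\s+\n/   // as written, needs whitespace between ";" and the newline

function checkSuperAsi(source) {
  var problems = []
  if (LEADING_CONTINUATION.test(source)) {
    problems.push("line starts with one of [ ( + * / -")
  }
  if (TRAILING_SEMICOLON.test(source)) {
    problems.push("semicolon followed by whitespace at end of line")
  }
  return problems
}

var flagged = checkSuperAsi("x = y\n(c + d).print()\n")    // one problem
var chained = checkSuperAsi("promise\n  .then(done)\n")     // no problems: "." is allowed
var trailing = checkSuperAsi("a = 1; \nb = 2\n")            // one problem
```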

Arguments for:

  1. Mostly backwards compatible, except in the case which everyone seems to agree is a language defect.
  2. Prevents the wtfs that are cited as being due to ASI.
  3. Encourages developers to know the language they're using.

Arguments against:

  1. Ew.  There aren't semicolons there.

# Brendan Eich (13 years ago)

On Apr 18, 2011, at 12:14 PM, Isaac Schlueter wrote:

[snip, and huzzah! ;-)]

Furthermore, it is not the existence of ASI, but rather the lack of it that causes problems even more frequently than the restricted productions issue.

    var a = 1;
    var b = 2;
    var c = { foo : "bar" }
    [a, b].forEach(alert); // cannot call method 'forEach' of undefined

A "disable ASI" pragma will not catch this error. No ASI occurred!

Ding ding ding...

In my years writing JavaScript, I can count on one finger the number of times that restricted production ASI has bitten me. I immediately abandoned the Allman style in favor of a BSD-KNF brace-goes-with-the-start-thing style, and all was well with the world. This no longer looked strange to me:

    return {
      foo : "bar"
    };

Right, and crock has made this point too. Style rule with substance (like it or not) backing it, better than mere (and endless) aesthetic arguments.

I can also count on one finger the number of restricted productions where this is an issue: throw will issue a syntax error, since the expression is not optional, and named continue/break are very rare.

However, the "non-ASI" error above (where there is a \n followed by a [, (, +, etc) bites a lot, and is almost impossible to see when parsing the code with human eyes and brains.

Yes, this is a good point too, I forgot to raise it. It is why people are told to start separate JS files which may end up in a concatenation with a lone ;. Thanks for raising it.

The proliferation of ; is not a keyboard-tax. It is an eyeball/brain-tax. "Why complain about semicolons? You don't even see them." That's why. Because I want to see the relevant parts of my code. If you put them on every line, and worse, at the jagged-right edge of the line,

I agree, but in a friendly spirit suggest typing ; is a tax too, however much lesser.

The options are to either:

  • use a lint step in your development process to catch these issues (and it's gonna have to be a pretty clever linter to know you didn't mean to do that)
  • adopt a style where such things jump out at you because they look wrong (as I have done with npm's leading semicolon/comma style), or
  • Be Very Careful about using array literals and parenthesized constructions.

I find linters to be somewhat unpleasant housepets, and thus do not keep them in my own home, though I of course respect their place when contributing to others' projects, and I have found that Being Very Careful is not sustainable on teams of 1 or more humans.

Agree emphatically. We have had similar experiences!

Any approach that does not handle the "non-ASI error" case is not a solution.

Every time a warning prints about something that was intended, the value of warnings is reduced. That's a bad idea. In the onclick ASI example above, printing a warning about the ASI in that code would be worse than useless.

Agreed. See my 'with' strict warning story.

If we are going to build a less wicked language, and the idea of a warning-generating pragma seems wise, then we ought to make every warning as relevant as possible, and not just support backwards-compatibility, but also do our best to support the intent of those using the language. Here's a few situations where (in my opinion) warnings would be useful:

  1. Restricted production followed by \n and an expression which is not a method or function call, assignment, or function declaration.

    // these would warn:
    return
    { a: b }
    /////
    return
        10 + 5

Some expressions that do not involve calling, assigning, or otherwise potentially having effects (declaring, where the declaration is a misplaced function expression) still may have effects (e.g. getters).

Your suggestion seems to be to avoid analysis and warn based only on the syntax of the expression that follows, on a separate line, a bare, unterminated return. I think I agree, but I wanted to call this choice out for more discussion.

  2. Expression followed by \n and a +, (, [, -, *, / at the same indention level.

    // this would warn:
    x = y
    (c + d).print()
    // this would not:
    chain
      ( [some, long, thing]
      , [the, next, chain, link]
      , cb )

Indentation, yay. Necessary in your view, or could you just ignore everything except the separation by a line terminator?

# Isaac Schlueter (13 years ago)

On Mon, Apr 18, 2011 at 12:22, Brendan Eich <brendan at mozilla.com> wrote:

I agree, but in a friendly spirit suggest typing ; is a tax too, however much lesser.

True, I overstated.  It is a keyboard tax.  But (at least in my experience) I tend to type code in a moment, and then read it for the rest of my life.  I'd gladly pay a keyboard tax that lowered the cognitive burden of maintaining code.

Some expressions that do not involve calling, assigning, or otherwise potentially having effects (declaring, where the declaration is a misplaced function expression) still may have effects (e.g. getters).

Your suggestion seems to be to avoid analysis and warn based only on the syntax of the expression that follows, on a separate line, a bare, unterminated return. I think I agree, but I wanted to call this choice out for more discussion.

Indentation, yay. Necessary in your view, or could you just ignore everything except the separation by a line terminator?

Thinking about this a bit more, I think maybe this whole suggestion is a bad idea.  Forget I said anything.  Significant linebreaks are one thing, but significant indentation is deeply problematic.

Maybe the solution is a way to make return's expression non-optional? If you want to return nothing, then you'd return undefined or return null.

In any event, I think spending a lot of effort trying to figure out the best way to remove ASI from the language is an unprofitable path. That energy would be better spent trying to figure out how best to remove just the problems with the current ASI implementation, and not throw out the baby with the bathwater.

# Brendan Eich (13 years ago)

On Apr 18, 2011, at 12:32 PM, Isaac Schlueter wrote:

Indentation, yay. Necessary in your view, or could you just ignore everything except the separation by a line terminator?

Thinking about this a bit more, I think maybe this whole suggestion is a bad idea. Forget I said anything. Significant linebreaks are one thing, but significant indentation is deeply problematic.

I agree, which is why I drew the line in paren-free about JS always being a curly-brace language.

Maybe the solution is a way to make return's expression non-optional? If you want to return nothing, then you'd return undefined or return null.

Drag of a migration tax but it is an early error. Will ponder further.

In any event, I think spending a lot of effort trying to figure out the best way to remove ASI from the language is an unprofitable path. That energy would be better spent trying to figure out how best to remove just the problems with the current ASI implementation, and not throw out the baby with the bathwater.

+1, or more.

The haters are gonna hate when they really should look deeper. There is not a simple good vs. evil battle between using ; scrupulously or using ASI. If you assume ; is the one true way to terminate statements, then there's no debate at all.

However, given the reality of ASI, in practice there are two ways to terminate statements. Then the question becomes, what is more usable, optionally turning off ASI, or under prior opt-in to Harmony, improving ASI?

I think it's plausible that ASI can be improved, perhaps along lines you first suggested. E.g.,

    return
        useless expression here;

becomes an error in Harmony. Or we force return <expr>; and ban return; (but seems extreme to me at the moment).

# Bob Nystrom (13 years ago)

However, given the reality of ASI, in practice there are two ways to terminate statements. Then the question becomes, what is more usable, optionally turning off ASI, or under prior opt-in to Harmony, improving ASI?

I would love to be able to ditch my ";" in JS. There are other languages that use them optionally without much difficulty (Scala and Go come to mind). Would it be possible, under Harmony, to move to semantics more similar to those?

My understanding is that JS thinks "ignore all newlines unless it turns out I need one" while other languages think "use all newlines unless it turns out I can't". The latter seems like a saner response since most newlines are intended to be statement boundaries.

Changing JS to treat all newlines as significant would address nasty cases like:

    var a = 1
    var b = 2
    var c = { foo : "bar" } // by default would assume the newline here is significant
    [a, b].forEach(alert)

The semicolon elision rules from what I've seen are a good bit simpler than the current insertion ones: If a token that can't end an expression or statement precedes a newline, eat the newline.
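A toy sketch of that elision rule, looking only at how a line of text ends. The function name `newlineEndsStatement` and the list of endings are assumptions made for this illustration. They are incomplete, and a real parser would decide on tokens (a line ending in a regular expression literal, for instance, would be misjudged here).

```javascript
// Hypothetical, incomplete list of line endings that cannot end an
// expression or statement. ":" is included here as an assumption.
var CONTINUATION_ENDINGS = ["+", "-", "*", "/", "=", ",", ".", "(", "[", "{", "?", ":", "&&", "||"]

function newlineEndsStatement(line) {
  var text = line.trim()
  if (text === "") return false              // blank line: nothing to terminate
  if (/(\+\+|--)$/.test(text)) return true   // postfix ++ / -- can end a statement
  return !CONTINUATION_ENDINGS.some(function (ending) {
    return text.slice(-ending.length) === ending
  })
}

var afterObject = newlineEndsStatement('var c = { foo : "bar" }')   // true: newline is significant
var afterPlus = newlineEndsStatement('var x = "foo" +')             // false: newline is eaten
```

Under this reading, the `[a, b].forEach(alert)` line in the example above starts a new statement because the line before it ends in `}`.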

# Mike Samuel (13 years ago)

2011/4/18 Bob Nystrom <rnystrom at google.com>:

The semicolon elision rules from what I've seen are a good bit simpler than the current insertion ones: If a token that can't end an expression or statement precedes a newline, eat the newline.

If I understand semicolon elision, then

    myLabel:
    for (;;) {}

would be interpreted as

    myLabel: ;
    for (;;) {}

That case should fail fast in all cases with undefined labels, as long as eval can't be used to break/continue to labels not defined in the evaled code, and could be addressed with a rule similar to insertion's rule against fabricating no-ops.

But it definitely does change behavior around mixed unary/binary operators in ways that affect current coding practices:

    var x = "foo"
        + "bar"

though lint tools' unused value warnings should catch this particular example.
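To spell out the behavior change, here is a small sketch comparing how the two lines evaluate today with what a newline-terminated reading would amount to. The second half is written out by hand as two statements; it only simulates the hypothetical semantics.

```javascript
// Today: the newline is ignored because the next line continues the expression.
var joined = "foo"
  + "bar"                      // "foobar"

// What a newline-terminated reading would amount to: two statements,
// the second being a unary plus applied to a string.
var first = "foo";
var second = + "bar"           // NaN
```

In the simulated reading the value of the unary plus is simply discarded, which is what an unused-value lint warning would pick up.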

# Garrett Smith (13 years ago)

On 4/18/11, Claus Reinke <claus.reinke at talk21.com> wrote:

The only places where semicolons are ever used in the Node.js package manager are in the 'for' loops headers and at the beginning of the lines that would be interpreted incorrectly because of the lack of the semicolon at the end of the previous line - for example see: isaacs/npm/blob/master/lib/utils/read-json.js#L320-L336

There are no semicolons at the end of lines at all.

Also the commas in multi-line arrays and parameter lists are always written at the beginning of lines so Node also depends on the ASI not inserting semicolons if the next line starts with a comma or other operator.

Thanks. It is useful to have such concrete examples.

If you can think it, there is a porn for it.

[...]

[aside: it would be nice to know who the committee members are, and what the process is for starting an official proposal]

Who are the committee members?

For the specific case of ASI, Brendan has explained the issues

esdiscuss/2011-April/013794

Restricted productions are the most benign cases. How ASI changes program behavior WRT unrestricted productions is a bigger problem.

In brief, there is no way simply to take away ASI, and any attempt to introduce a less troublesome variant of ASI will have to offer a way to deal with existing code.

The number of developers advocating ASI as a "best practice" can't be stopped either.

'legacy' code here refers not to ancient, badly written code but simply to code using the current ASI, which is a legacy from the point of view of any revised ASI.

Right.

ASI reform would not have the intention to force coders to add semicolons everywhere, it would "merely" try to make the rules of syntax easier to understand for coders and less troublesome for language designers. The goal would be to avoid unnecessary syntax while also avoiding unnecessary worrying about how the mechanism works.

As a coder, you really don't want to add semicolons to avoid ASI traps, as in line 232 of your example

;["dependencies", "devDependencies"].forEach(function (d) {

Exactly. Because otherwise, that array might be a property accessor.

Why do I want to have to worry about what might have been omitted? No, I want to worry about what the code says. I consider a statement terminator to be just that; when I read it, I see end of statement. I don't want to read the beginning of a statement preceded by an empty statement.

As a language designer, there are more than enough issues to resolve, without having to watch your back wondering how ASI will interact with new grammar.

Exactly. Multiline comments add extra problems.

Hope that clears up things a little?

Totally agree with you on this one.

# Isaac Schlueter (13 years ago)

On Mon, Apr 18, 2011 at 22:52, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

On 4/18/11, Claus Reinke <claus.reinke at talk21.com> wrote:

The only places where semicolons are ever used in the Node.js package manager are in the 'for' loops headers and at the beginning of the lines that would be interpreted incorrectly because of the lack of the semicolon at the end of the previous line - for example see: isaacs/npm/blob/master/lib/utils/read-json.js#L320-L336

There are no semicolons at the end of lines at all.

npm author here.

If you can think it, there is a porn for it.

It's not syntax porn, I assure you. This style is more easily scanned. The important tokens are along the left edge, which forms a straight line with logical and orderly breaks, and those tokens are not overused, so their presence or absence is very noticeable. The right-edge is jagged, and when every line has a ";", the brain tends to filter out their presence, making it hard to notice their absence.

Maybe your brain doesn't have the same pattern-matching style as mine, of course. But, willy nilly, I have the brain I do, so I made npm look the way it does, so that it's easier for me to avoid and find bugs with a minimum of friction.

This is all to say, relying on ASI is (sometimes) a sane and pragmatic decision, and not difference for difference's sake.

Restricted productions are the most benign cases. How ASI changes program behavior WRT unrestricted productions is a bigger problem.

Can you provide examples of the sort of unrestricted productions you're referring to, where unexpected "semicolon insertion" changes program behavior? In my experience, it is the lack of ASI that usually causes problems with unrestricted productions.

In brief, there is no way simply to take away ASI, and any attempt to introduce a less troublesome variant of ASI will have to offer a way to deal with existing code.

The number of developers advocating ASI as a "best practice" can't be stopped either.

Developers using ASI and advocating its use should be taken as a datapoint for our discussion of the language, don't you think? Regardless of whether or not you agree with them, breaking their model drastically will only result in a lower adoption of Harmony/es-next/opt-in-whatever.

As a coder, you really don't want to add semicolons to avoid ASI traps, as in line 232 of your example

;["dependencies", "devDependencies"].forEach(function (d) {

Yes. I've never started a line with [ with the intent of it being a property access. I do sometimes start a new line with ( for a function call, if the list of arguments is long, but it's a practice I'd gladly change in exchange for saner statement ending rules.

Why do I want to have to worry about what might have been omitted? No, I want to worry about what the code says. I consider a statement terminator to be just that; when I read it, I see end of statement. I don't want to read the beginning of a statement preceded by an empty statement.

In that case, what the code "says" is exactly the same, whether the semicolon is on the next line or not.

There is nothing fundamental or essential about ; being a "terminator" rather than a "separator". In fact, that's what it actually is in EcmaScript - a statement separator. (Likewise in sh/bash/zsh, Perl, Ruby, Erlang, and many others.) In Java, PHP, C, and C++, it is a terminator. This isn't c-devel or java-devel, it's es-devel.

The "use superasi" pragma would effectively make [ and ( treated a bit like restricted productions, in that a line break before them always terminates the statement. I'm not sold on this idea, necessarily, but I think it's important to consider options other than how to remove ASI from the language.

Exactly. Multiline comments add extra problems.

How so?

# Brendan Eich (13 years ago)

On Apr 18, 2011, at 10:52 PM, Garrett Smith wrote:

[aside: it would be nice to know who the committee members are, and what the process is for starting an official proposal]

Who are the committee members?

Ecma TC39 members, a bunch of us self-identify here.

Ecma is a standards body, you pay for a voting (so-called "ordinary") membership. It is not open to official proposals from non-members since non-members are not bound by IPR covenants. (Don't shoot the messenger.)

As I keep saying, Haskell SI is not going to replace ASI since migrating code changes meaning without error (early or runtime). Let's discuss here and formalize something without such a migration hazard. There's no need to be "official" until we have a design better in hand.

# Brendan Eich (13 years ago)

On Apr 19, 2011, at 8:39 AM, Isaac Schlueter wrote:

This style is more easily scanned. The important tokens are along the left edge, which forms a straight line with logical and orderly breaks, and those tokens are not overused, so their presence or absence is very noticeable. The right-edge is jagged, and when every line has a ";", the brain tends to filter out their presence, making it hard to notice their absence.

This is accurate in my experience. Even experienced semicolon users sometimes leave a few out, and the lack is hard to see.

Restricted productions are the most benign cases. How ASI changes program behavior WRT unrestricted productions is a bigger problem.

Can you provide examples of the sort of unrestricted productions you're referring to, where unexpected "semicolon insertion" changes program behavior? In my experience, it is the lack of ASI that usually causes problems with unrestricted productions.

Must be what Garrett means, since ASI is only error correction, plus of course built-into-the-grammar restricted productions.

So ASI does not change program behavior from non-error behavior A to non-error behavior B. It instead suppresses early SyntaxError with successful evaluation, in a deterministic way.

So any statement of the form "... ASI changes program behavior WRT unrestricted productions is a bigger problem" is simply misstated.

Again, it is the expectation of newline significance where none exists, where no error is corrected by ASI, that leads people astray. This is worth working to fix, or mitigate, provided migration works well.

# John Tamplin (13 years ago)

On Tue, Apr 19, 2011 at 11:52 AM, Brendan Eich <brendan at mozilla.com> wrote:

So ASI does not change program behavior from non-error behavior A to non-error behavior B. It instead suppresses early SyntaxError with successful evaluation, in a deterministic way.

Is that true even in the "return \n expression" case? It certainly seems to be not an error before or after ASI, yet the result is quite different.

# Jorge (13 years ago)

On 18/04/2011, at 16:37, Mike Ratcliffe wrote:

Jorge, I would opt in for warnings e.g. if I planned on minifying my web app in the future. Most web apps will burn in hell if they are missing semicolons when you minify them.

Indeed, for some minifiers it's a must.

These minifiers avoid (understandably) the hassle/expensiveness of building a parse tree and rely on (clever!) tricks that in turn require the programmer to put every semicolon in the source text, as if post-ASI, explicitly.

But "a=b\nc=d\n".length is equal to "a=b;c=d;".length and that proves that it can be minified just as much without the explicit semicolons.
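The length claim is easy to check, and a quick sketch can also confirm that the two strings behave the same when evaluated. The helper `run` is a name made up for this illustration.

```javascript
var withNewlines = "a=b\nc=d\n"
var withSemicolons = "a=b;c=d;"
var sameLength = withNewlines.length === withSemicolons.length   // true, both are 8

// Hypothetical helper: evaluates a snippet with b = 1 and d = 2
// and returns the resulting values of a and c.
function run(source) {
  return new Function("b", "d", "var a, c;" + source + "return [a, c]")(1, 2)
}

var fromNewlines = run(withNewlines)       // [1, 2]
var fromSemicolons = run(withSemicolons)   // [1, 2]
```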

# Jorge (13 years ago)

On 17/04/2011, at 19:44, Dmitry A. Soshnikov wrote:

(...) Since a programmer usually puts one logical sentence per line, why would you need additional statement/expression termination besides the new line itself? The useless actions in this case can be compared with masochism. (...)

AFAIK, the parser is mostly 'greedy' and keeps parsing and skipping over \n trying to compose the longest sentence that makes sense.

This is a feature (IMO) that allows us to break complex/longish statements/expressions onto several lines, for better readability, a feature I wouldn't want to ditch by making \n a statement/expression terminator, except, perhaps, in some very few special situations.

# Bob Nystrom (13 years ago)

On Mon, Apr 18, 2011 at 7:46 PM, Mike Samuel <mikesamuel at gmail.com> wrote:

If I understand semicolon elision, then

    myLabel:
    for (;;) {}

would be interpreted as

    myLabel: ;
    for (;;) {}

I'm still learning the details of the ES grammar, but I didn't think there were cases where a ";" is valid after a ":". Object literals need a value after it, labels need a subsequent statement, and switch cases need a statement, right?

If that's the case, then you'd elide a newline following ":" (i.e. not treat the newline like a ";") just like you would following a binary operator. You're right that ":" is probably one of the trickier places to get this right since it's used in so many contexts. Full disclosure: I've never actually used labels in practice, so I may be missing something obvious here.

But it definitely does change behavior around mixed unary/binary operators in ways that affect current coding practices:

    var x = "foo"
        + "bar"

That's true. I believe in languages that default to treating newlines as significant, the style is to put a binary operator at the end of the line and not at the beginning of the next one. Given the above code, I don't think it's even clear what the programmer intended. We can only assume they wanted the binary operator because unary plus has no side-effect.

A change like I'm describing here would definitely affect existing code. Leading commas are another place where this would need attention:

    var a = 1
      , b = 2
      , c = 3

In that case, since there is no "unary prefix comma" operator we could ignore the newline before it, but once we start adding that many special cases, we may be ending up with something as complex as the current ASI rules.

though lint tools' unused value warnings should catch this particular example.

Agreed, tooling can help a lot here. I don't underestimate how much work it would be to get to a new behavior for how newlines are handled. It would be a huge chore. But I do think the place we could end up after doing so would be simpler, less error-prone, and more readable.

# Brendan Eich (13 years ago)

On Apr 19, 2011, at 9:36 AM, John Tamplin wrote:

On Tue, Apr 19, 2011 at 11:52 AM, Brendan Eich <brendan at mozilla.com> wrote:

So ASI does not change program behavior from non-error behavior A to non-error behavior B. It instead suppresses early SyntaxError with successful evaluation, in a deterministic way.

Is that true even in the "return \n expression" case? It certainly seems to be not an error before or after ASI, yet the result is quite different.

I meant to exclude the restricted productions in writing "plus of course built-into-the-grammar restricted productions". If you include the restricted productions, which are part of the grammar, then there is only one way to parse "return \n expression", including a semicolon insertion.

This doesn't make the restricted production case any less of a bitter pill to swallow when writing long returns. But there's only one way to parse such a sentence. ASI as in "inserting a semicolon to recover from an error" did not change "return expression" to "return; expression".

Rather, the restricted production kept the parser from even considering the expression as the return value, and the automatic insertion fixed up the lack of a properly terminated return statement.

# Brendan Eich (13 years ago)

On Apr 19, 2011, at 10:41 AM, Bob Nystrom wrote:

On Mon, Apr 18, 2011 at 7:46 PM, Mike Samuel <mikesamuel at gmail.com> wrote:

If I understand semicolon elision, then

    myLabel:
    for (;;) {}

would be interpreted as

    myLabel: ;
    for (;;) {}

I'm still learning the details of the ES grammar, but I didn't think there were cases where a ";" is valid after a ":". Object literals need a value after it, labels need a subsequent statement, and switch cases need a statement, right?

No, labeling an empty statement is permitted by the grammar:

js> L:;

js>

Same as in C, but useless without goto. Still, back in the old days we did not want to add complexity to limit labels to apply only to non-empty statements. You can break from the then part of a labeled if, in lieu of a downward goto, e.g.
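Both points can be pictured with a short sketch. This is an illustration written for this purpose, with invented names (`sizeNotes`, the label `check`), and is not the example from the original message.

```javascript
// A label on an empty statement parses without error (new Function only parses here).
var parses = true
try { new Function("L:;") } catch (err) { parses = false }

// Hypothetical example: `check` labels an if statement, and `break check`
// jumps past the rest of its then part.
function sizeNotes(n) {
  var notes = []
  check: if (n > 10) {
    notes.push("big")
    if (n > 100) break check
    notes.push("not huge")
  }
  notes.push("done")
  return notes.join(",")
}

var small = sizeNotes(5)     // "done"
var medium = sizeNotes(50)   // "big,not huge,done"
var huge = sizeNotes(500)    // "big,done"
```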

# Mike Samuel (13 years ago)

2011/4/19 Bob Nystrom <rnystrom at google.com>:

On Mon, Apr 18, 2011 at 7:46 PM, Mike Samuel <mikesamuel at gmail.com> wrote:

If I understand semicolon elision, then

    myLabel:
    for (;;) {}

would be interpreted as

    myLabel: ;
    for (;;) {}

I'm still learning the details of the ES grammar, but I didn't think there were cases where a ";" is valid after a ":". Object literals need a value after it, labels need a subsequent statement, and switch cases need a statement, right?

From section 12.3

EmptyStatement : ;

# Isaac Schlueter (13 years ago)

On Tue, Apr 19, 2011 at 09:57, Jorge <jorge at jorgechamorro.com> wrote:

Most web apps will burn in hell if they are missing semicolons when you minify them.

Indeed, for some minifiers it's a must.

Which minifiers?

Closure, yuicompressor, jsmin, packer, and uglify all handle ASI without so much as a complaint. In fact, with YUICompressor and JSMin, at least, when you omit semicolons, you end up with more easily debuggable minified code, since the line numbers in stack traces are actually helpful. (No "line 1, char 82,343" to deal with.)

These minifiers avoid (understandably) the hassle/expensiveness of building a parse tree and rely on (clever!) tricks that in turn require the programmer to put every semicolon in the source text, as if post-ASI, explicitly.

I don't believe that "those minifiers" actually get much use. They're hideously broken, and there is a huge selection of competent minifiers that do actually minify JavaScript properly.

# Garrett Smith (13 years ago)

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 19, 2011, at 8:39 AM, Isaac Schlueter wrote:

This style is more easily scanned. The important tokens are along the left edge, which forms a straight line with logical and orderly breaks, and those tokens are not overused, so their presence or absence is very noticeable. The right-edge is jagged, and when every line has a ";", the brain tends to filter out their presence, making it hard to notice their absence.

This is accurate in my experience. Even experienced semicolon users sometimes leave a few out, and the lack is hard to see.

Restricted productions are the most benign cases. How ASI changes program behavior WRT unrestricted productions is a bigger problem.

Can you provide examples of the sort of unrestricted productions you're referring to, where unexpected "semicolon insertion" changes program behavior? In my experience, it is the lack of ASI that usually causes problems with unrestricted productions.

Must be what Garrett means, since ASI is only error correction, plus of course built-into-the-grammar restricted productions.

So ASI does not change program behavior from non-error behavior A to non-error behavior B. It instead suppresses early SyntaxError with successful evaluation, in a deterministic way.

I don't mean to annoy by repeating the same things, but here goes: Is () Grouping Operator or Arguments? Is [] ArrayLiteral or Property Accessor? Or do these cases depend on the preceding token?

I'll take your prose on with an example.

Non-error B behavior, b.js:

    (function() { /.../ });

Non-error A behavior, a.js:

    var MyWidget = function(){
      this.name = "mike";
    }

Now concatenate a.js and b.js and you have:

    var MyWidget = function(){
      this.name = "mike";
    }(function() {});

Which makes MyWidget undefined and sets window.name to "mike". That's all fine if you notice it right away. But what if MyWidget gets called in a callback of some sort, and that callback never fires? Well, sure, you might say that would be a lazy developer and faulty QA, but IMO it would be much nicer to fail fast.
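The concatenation hazard can be reproduced without a browser. In this sketch the two files are held as strings and evaluated inside a function body; `typeOfMyWidget` is a hypothetical helper, and the lone `;` between files is the usual guard mentioned earlier in the thread.

```javascript
// The two files from the example above, as strings.
var aJs = 'var MyWidget = function(){ this.name = "mike"; }\n'
var bJs = '(function() { /* ... */ });\n'

// Hypothetical helper: evaluates a bundle in a fresh non-strict function
// scope and reports what MyWidget ended up as.
function typeOfMyWidget(bundle) {
  return new Function(bundle + "\nreturn typeof MyWidget")()
}

var naive = typeOfMyWidget(aJs + bJs)          // "undefined": a.js's function was called
var guarded = typeOfMyWidget(aJs + ";" + bJs)  // "function"
```

Note that the naive run really does call the function from a.js, so as a side effect it sets `name` on the global object, just as the text describes for `window.name`.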

That example, BTW, is in jibbering.com/faq/notes/code-guidelines/asi.html as is the example with array literal/square bracket property accessor.

mail.mozilla.org/htdig/es-discuss/2010-July/011600.html
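The concatenation hazard can be reproduced with a short sketch. It assumes sloppy (non-strict) evaluation, which `new Function` provides; the two strings stand in for a.js and b.js, and `typeOfMyWidget` is a hypothetical helper name used only for this illustration.

```javascript
// Contents of the two files from the example, kept as strings.
const aJs = 'var MyWidget = function(){ this.name = "mike"; }\n';
const bJs = '(function() { /*...*/ });\n';

// Hypothetical helper: evaluate `source` as sloppy-mode code and report
// what MyWidget ended up being.
function typeOfMyWidget(source) {
  return new Function(source + "\nreturn typeof MyWidget;")();
}

// a.js on its own: ASI supplies the semicolon that a.js leaves out
// (here before the appended `return`; in a standalone file, at end of input).
const alone = typeOfMyWidget(aJs);

// a.js + b.js: the parenthesized function is parsed as an argument list, so
// the first function is called immediately, `name` is set on the global
// object, and the call's result (undefined) is stored in MyWidget.
const concatenated = typeOfMyWidget(aJs + bJs);

console.log(alone);        // "function"
console.log(concatenated); // "undefined"
```

Neither evaluation raises an error, which is why the difference is easy to miss until MyWidget is actually used.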

So any statement of the form "... ASI changes program behavior WRT unrestricted productions is bigger problem" is simply misstated.

See above.

Again, it is the expectation of newline significance where none exists, where no error is corrected by ASI, that leads people astray. This is worth working to fix, or mitigate, provided migration works well.

I don't understand what you mean.

# Mike Samuel (13 years ago)

2011/4/19 Bob Nystrom <rnystrom at google.com>:

On Mon, Apr 18, 2011 at 7:46 PM, Mike Samuel <mikesamuel at gmail.com> wrote:

    var x = "foo"
        + "bar"

That's true. I believe in languages that default to treating newlines as significant, the style is to put a binary operator at the end of the line and not at the beginning of the next one. Given the above code, I don't think it's even clear what the programmer intended. We can only assume they wanted the binary operator because unary plus has no side-effect. A change like I'm describing here would definitely affect existing code. Leading commas are another place where this would need attention:

This seems a serious problem to me.

There is a large population of EcmaScript developers who develop in Java on the server, and ES on the client.

Widely accepted Java style guidelines mandate line-breaking with the operator on the next line. www.oracle.com/technetwork/java/codeconventions-136091.html#248 "

  • Break after a comma.
  • Break before an operator.

...

    longName1 = longName2 * (longName3 + longName4 - longName5)
                + 4 * longname6; // PREFER

" is not contradictory because comma is not an operator in Java. The second is not as widely followed as the first, but is still fairly widely followed among Java programmers.

I think there are a large number of programmers who, because of those java style guidelines and the way ASI works, write javascript breaking before operators except for comma operators.

www.google.com/codesearch?q=\x20\x20\x20[%2B-][^%2B-]+lang%3Ajavascript shows numerous examples.

If true, this is not just a matter of code backwards compatibility, but of porting programmers.
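How the current rules treat a line that begins with an operator, compared with a restricted production, can be checked directly. This is a minimal sketch; `run` is a hypothetical helper that evaluates a function body given as source text.

```javascript
// Hypothetical helper: evaluate a function body given as source text.
function run(body) {
  return new Function(body)();
}

// Break before the operator: no semicolon is inserted after "foo",
// because `+ "bar"` is a valid continuation of the expression.
const joined = run('var x = "foo"\n    + "bar"\nreturn x;');

// Restricted production: a line break after `return` always ends the
// statement, so the string on the next line is never returned.
const restricted = run('return\n"foo";');

console.log(joined);     // "foobar"
console.log(restricted); // undefined
```

Code written in the Java style therefore works today only because the newline before the operator is not significant; a rule that made such newlines terminate statements would change the first result.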

# Jorge (13 years ago)

On 19/04/2011, at 19:52, Isaac Schlueter wrote:

On Tue, Apr 19, 2011 at 09:57, Jorge <jorge at jorgechamorro.com> wrote:

Most web apps will burn in hell if they are missing semicolons when you minify them.

Indeed, for some minifiers it's a must.

Which minifiers?

I don't know, the ones that make "web apps burn in hell if they are missing semicolons".

(...) you end up with more easily debuggable minified code, since the line numbers in stack traces are actually helpful. (No "line 1, char 82,343" to deal with.)

Great, I like that too, but (in production) most sites serve ~ illegible JS on purpose, I think.

I don't believe that "those minifiers" actually get much use. They're hideously broken, and there is a huge selection of competent minifiers that do actually minify JavaScript properly.

jsmin.c is all I've ever used and all I've ever needed. It's fast and effective.

# Bob Nystrom (13 years ago)

I think there are a large number of programmers who, because of those java style guidelines and the way ASI works, write javascript breaking before operators except for comma operators.

www.google.com/codesearch?q=\x20\x20\x20[%2B-][ ^%2B-%5D+lang%3Ajavascript shows numerous examples.

If true, this is not just a matter of code backwards compatibility, but of porting programmers.

That's unfortunate. There's another option: Python-style. In Python, I believe newlines are ignored within a parenthesized expression. In JS, that would mean:

    var a = 1
      + 2 // a = 1

    var a = (1
      + 2) // a = 3

If a large number of long lines happen to occur within function calls, like...

log(longThing1 + otherLongThing + lastLongThing);

...then that will do the right thing without having to add more () just to handle the newline.
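The Python-style rule can be approximated in a few lines to see what it would do to the two snippets above. This is only a sketch under stated assumptions: `splitStatements` and `valueOfA` are hypothetical names, only parentheses are tracked, and string literals, comments and regular expressions containing parentheses are not handled.

```javascript
// Hypothetical sketch of the proposed rule: a newline ends a statement
// unless it occurs inside parentheses. Simplification: string literals,
// comments and regular expressions containing parentheses are not handled.
function splitStatements(source) {
  const statements = [];
  let depth = 0;
  let current = "";
  for (const ch of source) {
    if (ch === "(") depth++;
    if (ch === ")") depth--;
    if (ch === "\n" && depth === 0) {
      if (current.trim() !== "") statements.push(current.trim());
      current = "";
    } else {
      // Inside parentheses a newline is treated as ordinary whitespace.
      current += ch === "\n" ? " " : ch;
    }
  }
  if (current.trim() !== "") statements.push(current.trim());
  return statements;
}

// Hypothetical helper: run the statements with explicit terminators and
// return the final value of `a`.
function valueOfA(source) {
  return new Function(splitStatements(source).join(";\n") + ";\nreturn a;")();
}

console.log(valueOfA("var a = 1\n  + 2"));   // 1
console.log(valueOfA("var a = (1\n  + 2)")); // 3
```

Under today's rules both snippets assign 3, which is the compatibility concern raised earlier in the thread.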

# Isaac Schlueter (13 years ago)

On Tue, Apr 19, 2011 at 11:02, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote: I don't mean to annoy by repeating the same things, but here goes: Is () Grouping Operator or Arguments? Is [] ArrayLiteral or Property Accessor? Or do these cases depend on the preceding token?

They depend on the preceding token, and they cause a preceding \n to be elided. That is the problem with JavaScript's statement termination rules with respect to lines that start with +, /, *, -, (, or [.

Now concatenate a.js and b.js and you have: var MyWidget = function(){  this.name = "mike"; }(function() {});

That error isn't caused by ASI. Disabling ASI won't prevent that error.

That error is caused by the lack of ASI. It's caused by the \n being elided when the next line starts with a (.

So any statement of the form "... ASI changes program behavior WRT unrestricted productions is bigger problem" is simply misstated.

See above.

ASI didn't change the program behavior. ASI didn't happen in that example.

Newline elision changed the program behavior.
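The list of line-starting tokens above suggests a simple check that a concatenation tool could run on each file. This is a rough sketch, not a feature of any existing minifier; `firstCodeChar` and `startsWithContinuationToken` are hypothetical names, and only leading whitespace and comments are skipped.

```javascript
// Tokens listed in the message above: a line starting with one of these
// can continue the previous statement instead of starting a new one.
const CONTINUATION_STARTS = ["+", "/", "*", "-", "(", "["];

// Hypothetical helper: first character of real code, skipping leading
// whitespace and // or /* */ comments.
function firstCodeChar(source) {
  let rest = source.trimStart();
  while (rest.startsWith("//") || rest.startsWith("/*")) {
    const lineComment = rest.startsWith("//");
    const end = lineComment ? rest.indexOf("\n") : rest.indexOf("*/", 2);
    if (end === -1) return undefined; // comment runs to end of input
    rest = rest.slice(end + (lineComment ? 1 : 2)).trimStart();
  }
  return rest[0];
}

// True when the file could be absorbed into an unterminated statement
// at the end of whatever file precedes it.
function startsWithContinuationToken(source) {
  const first = firstCodeChar(source);
  return first !== undefined && CONTINUATION_STARTS.includes(first);
}

console.log(startsWithContinuationToken("(function() {});"));        // true
console.log(startsWithContinuationToken("// setup\n[1, 2].length")); // true
console.log(startsWithContinuationToken("var x = 1"));               // false
```

A file that begins with `;` is not flagged, which matches the defensive convention mentioned elsewhere in this thread.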

# John Tamplin (13 years ago)

On Tue, Apr 19, 2011 at 2:53 PM, Bob Nystrom <rnystrom at google.com> wrote:

I think there are a large number of programmers who, because of those java style guidelines and the way ASI works, write javascript breaking before operators except for comma operators.

www.google.com/codesearch?q=\x20\x20\x20[%2B-][ ^%2B-%5D+lang%3Ajavascript shows numerous examples.

If true, this is not just a matter of code backwards compatibility, but of porting programmers.

That's unfortunate. There's another option: Python-style. In Python, I believe newlines are ignored within a parenthesized expression. In JS, that would mean:

    var a = 1
      + 2 // a = 1

    var a = (1
      + 2) // a = 3

Ok, so you are advocating that adding extra parens is less typing and less prone to error than adding semicolons?

# Bob Nystrom (13 years ago)

    var a = 1
      + 2 // a = 1

    var a = (1
      + 2) // a = 3

Ok, so you are advocating that adding extra parens is less typing and less prone to error than adding semicolons?

Yes, adding extra parens where needed and omitting ";" is less typing. To verify, I just went through a bunch of JS that strictly follows Google's style guide (which means lots of long names and a maximum of 80 chars wide, so plenty of line continuations). My results:

  • Lines of code: 1,312
  • ";", which could all be omitted: 388
  • Places where "()" would need to be added to preserve current behavior: 0

So 388 fewer characters to type.

I'm not sure what you mean by "less prone to error". Relative to mandatory semicolons or current ASI semantics?

# Isaac Schlueter (13 years ago)

On Tue, Apr 19, 2011 at 11:53, Jorge <jorge at jorgechamorro.com> wrote:

Which minifiers? I don't know, the ones that make "web apps burn in hell if they are missing semicolons".

Until someone can point to an actual minifier that's actually affected by this, I think the whole "minification requires semicolons" argument is baseless fud.

Not trying to impugn you specifically. It takes a lot of work to un-believe something that "everybody knows" :)

# Garrett Smith (13 years ago)

On 4/19/11, Isaac Schlueter <i at izs.me> wrote:

On Tue, Apr 19, 2011 at 11:02, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote: I don't mean to annoy by repeating the same things, but here goes: Is () Grouping Operator or Arguments? Is [] ArrayLiteral or Property Accessor? Or do these cases depend on the preceding token?

They depend on the preceding token, and they cause a preceding \n to be elided. That is the problem with JavaScript's statement termination rules with respect to lines that start with +, /, *, -, (, or [.

Now concatenate a.js and b.js and you have: var MyWidget = function(){ this.name = "mike"; }(function() {});

That error isn't caused by ASI. Disabling ASI won't prevent that error.

That error is caused by the lack of ASI. It's caused by the \n being elided when the next line starts with a (.

So any statement of the form "... ASI changes program behavior WRT unrestricted productions is bigger problem" is simply misstated.

See above.

ASI didn't change the program behavior. ASI didn't happen in that example.

Newline elision changed the program behavior.

No, MyWidget = function(){} was not explicitly terminated by a semicolon. The end of the input stream is reached and a semicolon is inserted.

"Newline elision" did not change behavior.

Concatenation of a.js and b.js results in behavior that is not the same as when a.js and b.js are in separate files.

In a.js, there was a missing semicolon after the FunctionExpression. On its own, a.js does what was wanted of it.

File b.js contains what the author wanted as a grouping operator and it works fine as b.js alone. But concatenating a.js + b.js to one file, the result is different behavior. That's a problem.

The changed-behavior problem is avoidable by beginning files with ; (as already mentioned in this thread). Dojo.js does that, for example, to avoid changed program behavior when running through shrinksafe or yuicompressor (both of which concatenate and minify).
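The effect of the leading semicolon can be seen in a short sketch. It assumes sloppy-mode evaluation via `new Function`; the strings stand in for a.js and a defensively written b.js, and the variable names are illustrative only.

```javascript
// a.js as in the example, still missing its final semicolon.
const widgetSource = 'var MyWidget = function(){ this.name = "mike"; }\n';

// b.js written defensively, with a leading semicolon.
const guardedSource = ';(function() { /*...*/ });\n';

// Evaluate the concatenation as sloppy-mode code and inspect MyWidget.
const guarded = new Function(
  widgetSource + guardedSource + "return typeof MyWidget;"
)();

console.log(guarded); // "function"
```

The explicit `;` terminates the var statement before the parenthesized function begins, so MyWidget keeps the function value after concatenation.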

# Brendan Eich (13 years ago)

On Apr 19, 2011, at 11:02 AM, Garrett Smith wrote:

I don't mean to annoy by repeating the same things, but here goes: Is () Grouping Operator or Arguments? Is [] ArrayLiteral or Property Accessor? Or do these cases depend on the preceding token?

It depends on the context, but ASI does not change the context.

I'll take your prose on with an example. Non-error B behavior, b.js: (function() { /*...*/ });

Non-error A behavior, a.js: var MyWidget = function(){ this.name = "mike"; }

ASI happens here only if a.js is not concatenated.

Now concatenate a.js and b.js and you have: var MyWidget = function(){ this.name = "mike"; }(function() {});

Yes. We've discussed this. It's not a change in semantics due to the error-correction aspect of ASI. There is no ASI on this concatenated input!

Which makes MyWidget undefined and sets window.name to "mike". That's all fine if you notice it right away. But what if MyWidget gets called in a callback of some sort, and that callback never fires? Well, sure, you might say that would be a lazy developer and faulty QA, but IMO it would be much nicer to fail fast.

It's a hazard for sure, but for the umpteenth time, it is not due to ASI!

That example, BTW, is in jibbering.com/faq/notes/code-guidelines/asi.html as is the example with array literal/square bracket property accessor.

mail.mozilla.org/htdig/es-discuss/2010-July/011600.html

You are not demonstrating what you seem to think you're demonstrating.

So any statement of the form "... ASI changes program behavior WRT unrestricted productions is bigger problem" is simply misstated.

See above.

No, you're not demonstrating a change of program behavior due to ASI on the result of the concatenation. You are showing a change of behavior due to concatenation relieving an error that ASI corrects in the case where a.js is not the first part of a concatenation where b.js follows.

Again, it is the expectation of newline significance where none exists, where no error is corrected by ASI, that leads people astray. This is worth working to fix, or mitigate, provided migration works well.

I don't understand what you mean.

See above. People read

    var MyWidget = function(){
      this.name = "mike";
    }
    (function() { /*...*/ });

and think the newline after the } on its own line (line 3 above) terminates the var statement. It's an easy mistake to make given how similar code where the next line does not begin with (, [, +, etc. works. Especially if the next line begins with a keyword.

# John Tamplin (13 years ago)

On Tue, Apr 19, 2011 at 6:22 PM, Brendan Eich <brendan at mozilla.com> wrote:

Yes. We've discussed this. It's not a change in semantics due to the error-correction aspect of ASI. There is no ASI on this concatenated input!

Yes, but given that ASI encourages developers to omit semicolons except when absolutely required, it is nevertheless a consequence of ASI. If ASI did not exist, there would be no missing semicolons.

# Brendan Eich (13 years ago)

On Apr 19, 2011, at 3:08 PM, Garrett Smith wrote:

On 4/19/11, Isaac Schlueter <i at izs.me> wrote:

ASI didn't change the program behavior. ASI didn't happen in that example.

Isaac is correct.

Newline elision changed the program behavior.

Or newlines being insignificant whitespace, let's say.

No, MyWidget = function(){} was not explicitly terminated by a semicolon. The end of the input stream is reached and a semicolon is inserted.

Not in the concatenation of a.js and b.js. Please attend to your own example.

Only if a.js is processed as a Program (the grammar's goal nonterminal), in which case, yes, ASI kicks in, and no, there is no subsequent ( or [ or similar input to cause trouble.

"Newline elision" did not change behavior.

Concatenation of a.js and b.js results in behavior that is not the same as when a.js and b.js are in separate files.

That is true, but now you are changing the terms of the debate. The claim that ASI does not affect non-error semantics applies for any given Program -- not for Program X and Program Y.

What you describe is a hazard, we've covered it. But your argument foundered on ASI applying differently on two different programs. Of course that can happen.

# Brendan Eich (13 years ago)

On Apr 19, 2011, at 3:27 PM, John Tamplin wrote:

On Tue, Apr 19, 2011 at 6:22 PM, Brendan Eich <brendan at mozilla.com> wrote: Yes. We've discussed this. It's not a change in semantics due to the error-correction aspect of ASI. There is no ASI on this concatenated input!

Yes, but given that ASI encourages developers to omit semicolons except when absolutely required, it is nevertheless a consequence of ASI. If ASI did not exist, there would be no missing semicolons.

That is true but not what was claimed. Please stick to the argument we were having.

The claim was two semantics, A and B, for a given program, due to ASI. That is false and I don't see anyone seriously claiming otherwise.

Two semantics for two different programs, one an extension of the other by source concatenation, is a problem.

Now, if you want to use that as a trump card to argue a different argument, that we should get rid of ASI, then lotsa luck!

ASI is not going away in any foreseeable non-opt-in version of the language.

Given this fact, should we bemoan it or make only a "no asi" pragma and hope people use that under opt-in (where the pragma is allowed), and waste too much of their lives on adding semicolons when migrating code, etc.?

Or should we consider that the problems created by ASI might better be solved by evolving ASI in a backward-compatible fashion, so old code migrates with the same meaning (or at most an early error, for a hard case) into Harmony, and new code is free of some or all of the hazards this thread has identified?

I say the latter wins. Just a "no asi" pragma may happen too, but it ain't gonna improve the lives of enough JS hackers to matter, IMHO.

# Garrett Smith (13 years ago)

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 19, 2011, at 3:08 PM, Garrett Smith wrote:

On 4/19/11, Isaac Schlueter <i at izs.me> wrote:

ASI didn't change the program behavior. ASI didn't happen in that example.

Isaac is correct.

Newline elision changed the program behavior.

Or newlines being insignificant whitespace, let's say.

I don't see how whitespace is relevant here.

No, MyWidget = function(){} was not explicitly terminated by a semicolon. The end of the input stream is reached and a semicolon is inserted.

Not in the concatenation of a.js and b.js. Please attend to your own example.

No, when a.js exists on its own, a semicolon is inserted. When concatenated with b.js that doesn't happen.

Only if a.js is processed as a Program (the grammar's goal nonterminal), in which case, yes, ASI kicks in, and no, there is no subsequent ( or [ or similar input to cause trouble.

That's what I'm talking about.

"Newline elision" did not change behavior.

Concatenation of a.js and b.js results in behavior that is not the same as when a.js and b.js are in separate files.

That is true, but now you are changing the terms of the debate. The claim that ASI does not affect non-error semantics applies for any given Program -- not for Program X and Program Y.

Is it "program" or "Program" we're discussing? What'd I write? If I wrote "Program" I take it back. But if I wrote "program" then I got what I meant write (and pedantic misquoting is a waste of time).

What you describe is a hazard, we've covered it. But your argument foundered on ASI applying differently on two different programs. Of course that can happen.

If ASI is warned by the interpreter, then the developer of a.js will know before running through compression tools. If that ASI were to be an error, then it would be fail fast. Call me a hater if it makes you feel better, but I find fail fast to be a lot better.

# Garrett Smith (13 years ago)

On 4/19/11, Garrett Smith <dhtmlkitchen at gmail.com> wrote:

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 19, 2011, at 3:08 PM, Garrett Smith wrote:

On 4/19/11, Isaac Schlueter <i at izs.me> wrote:

ASI didn't change the program behavior. ASI didn't happen in that example.

Isaac is correct.

Newline elision changed the program behavior.

Oh, yep. I wrote "program". Tsk.

[...]

Is it "program" or "Program" we're discussing? What'd I write? If I wrote "Program" I take it back. But if I wrote "program" then I got what I meant write (and pedantic misquoting is a waste of time).

"Right", not "write". Now that's ironic.

# Brendan Eich (13 years ago)

On Apr 19, 2011, at 3:54 PM, Garrett Smith wrote:

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote:

On Apr 19, 2011, at 3:08 PM, Garrett Smith wrote:

On 4/19/11, Isaac Schlueter <i at izs.me> wrote:

Newline elision changed the program behavior.

Or newlines being insignificant whitespace, let's say.

I don't see how whitespace is relevant here.

The newline creates an expectation of statement termination in many readers. That's all.

No, MyWidget = function(){} was not explicitly terminated by a semicolon. The end of the input stream is reached and a semicolon is inserted.

Not in the concatenation of a.js and b.js. Please attend to your own example.

No, when a.js exists on its own, a semicolon is inserted.

I said that!

Look, I just wrote "Not in the concatenation of a.js and b.js." You then tried to rebut by talking about a.js in isolation.

Let's get off this merry-go-round. We seem to agree, but you are not arguing that semantics of a given program can change due to ASI. Concatenation makes a new program. Yes, this is commonly done and therefore ASI going away due to concatenation is a hazard.

If ASI is warned by the interpreter, then the developer of a.js will know before running through compression tools. If that ASI were to be an error, then it would be fail fast. Call me a hater if it makes you feel better, but I find fail fast to be a lot better.

No, Ollie is the hater ;-).

You are asking for a warning when a file ends without a semicolon required by the grammar, and ASI kicks in. Fair point, good idea. It's not going to do enough by itself, since warnings are easy to miss, and often annoy the wrong party (the user of content, where the dev is long gone).

But good point. Indeed, feel free to file a bug at bugzilla.mozilla.org asking for such a warning. I'll support it.

# Garrett Smith (13 years ago)

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote: [...]

But good point. Indeed, feel free to file a bug at bugzilla.mozilla.org asking for such a warning. I'll support it.

I'll do it.

# Garrett Smith (13 years ago)

On 4/19/11, Brendan Eich <brendan at mozilla.com> wrote: [...]

You are asking for a warning when a file ends without a semicolon required by the grammar, and ASI kicks in. Fair point, good idea. It's not going to do enough by itself, since warnings are easy to miss, and often annoy the wrong party (the user of content, where the dev is long gone).

A good start.

But good point. Indeed, feel free to file a bug at bugzilla.mozilla.org asking for such a warning. I'll support it.

bugzilla.mozilla.org/show_bug.cgi?id=651346