Refutable pattern

# Andreas Rossberg (13 years ago)

I wrote up the semantics of refutable destructuring as discussed in yesterday's meeting:

harmony:refutable_matching

In particular, this defines the meaning of the ?-operator in a fairly straightforward manner.

The page also describes how the proposed matching semantics would readily be applicable to a pattern matching switch, and how it would potentially allow us to turn 'undefined' into a keyword.

# Axel Rauschmayer (13 years ago)

Beautiful.

What do question marks in value (as opposed to key) positions mean? Example: { a: x? }

How does this work grammatically (ternary operator?)?

# Andreas Rossberg (13 years ago)

On 1 February 2013 10:56, Axel Rauschmayer <axel at rauschma.de> wrote:

Beautiful.

What do question marks in value (as opposed to key) positions mean? Example: { a: x? }

Not much: a plain identifier 'x' always matches anyway, i.e. is already irrefutable, so wrapping a '?' around it does not have any effect (it's like writing "if (true)" or whatever). I removed the redundant example.

How does this work grammatically (ternary operator…)?

That still has to be worked out. I'd actually prefer a prefixed ?, since it is quite easy to overlook a postfix one trailing a longish pattern when reading code. But that may be more difficult to reconcile with the existing syntax.
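As a rough illustration of the refutable-matching idea in this exchange, here is a minimal sketch in plain JavaScript. It is not taken from the wiki page: the pattern encoding (`bind`, `object`, `optional`), the `MatchError` class, and the `match` and `boundNames` functions are all hypothetical names invented for this example. It assumes that a failed match throws, and that a '?'-marked pattern that fails binds its variables to undefined instead.

```javascript
// Hypothetical pattern encoding (an assumption for illustration only):
//   { bind: "x" }                    identifier pattern, always matches
//   { object: { key: subpattern } }  object pattern, refutable
//   { optional: subpattern }         a pattern marked with '?'
class MatchError extends Error {}

// Collect every variable name a pattern would bind.
function boundNames(pattern) {
  if ("bind" in pattern) return [pattern.bind];
  if ("optional" in pattern) return boundNames(pattern.optional);
  return Object.values(pattern.object).flatMap(boundNames);
}

function match(pattern, value, bindings = {}) {
  if ("bind" in pattern) {
    // A plain identifier is irrefutable: it accepts any value.
    bindings[pattern.bind] = value;
    return bindings;
  }
  if ("optional" in pattern) {
    try {
      return match(pattern.optional, value, bindings);
    } catch (e) {
      if (!(e instanceof MatchError)) throw e;
      // The inner pattern failed: bind all of its names to undefined.
      for (const name of boundNames(pattern.optional)) bindings[name] = undefined;
      return bindings;
    }
  }
  // Object pattern: the value must be object-coercible and have every key.
  if (value === null || value === undefined) throw new MatchError("not an object");
  const obj = Object(value);
  for (const [key, sub] of Object.entries(pattern.object)) {
    if (!(key in obj)) throw new MatchError("missing property " + key);
    match(sub, obj[key], bindings);
  }
  return bindings;
}

// { a: x } against { a: 1 } binds x to 1.
console.log(match({ object: { a: { bind: "x" } } }, { a: 1 }));
// A '?' around the whole object pattern turns a failure into x = undefined.
console.log(match({ optional: { object: { a: { bind: "x" } } } }, {}));
```

In this sketch, wrapping the identifier alone in `optional` changes nothing, which mirrors the point that `{ a: x? }` is redundant.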

# Brandon Benvie (13 years ago)

A postfix '?' would require backtracking when the next '}' is found...I think?

# Andreas Rossberg (13 years ago)

Yeah. I admit that I don't remember much of the earlier discussions on respective parsing difficulties. Naively, it would seem to me that a prefix ? should actually be easier to parse.

# Andreas Rossberg (13 years ago)

I just played with prefix '?' syntax and think it really looks nicer. For the time being, I changed the wiki to use that.

# Brandon Benvie (13 years ago)

Thinking about it further, I'm pretty sure the parsing looks like:

find '?', parse next expression, parse next token. If that token is ':' then the expression is a ternary, otherwise it's a refutable expression

But if this is limited to LHS assignables then either works since the only things allowed are identifiers or patterns.

# Claude Pache (13 years ago)

On 1 Feb 2013, at 20:00, Andreas Rossberg <rossberg at google.com> wrote:

On 1 February 2013 18:24, Brandon Benvie <brandon at brandonbenvie.com> wrote:

A postfix '?' would require backtracking when the next '}' is found...I think?

Yeah. I admit that I don't remember much of the earlier discussions on respective parsing difficulties. Naively, it would seem to me that a prefix '?' should actually be easier to parse.

/Andreas

Hello,

I've done some analysis of when there could be an ambiguity between a postfix "?" and the conditional operator (? :).

--

In the proposed grammar, it seems to me that a postfix "?" can appear in Pattern in the following positions only:

(1) inside a Pattern, before the following tokens: ": , ] }" (colon, comma, closing bracket, closing brace) and apparently before another postfix-"?" (although it would be of no practical use);

(2) at the end of a Pattern.

In case (1), there is no ambiguity with the conditional operator. In case (2), one has to further analyse where a Pattern could appear.

--

More generally, let's suppose that "?" can be either an infix (that is to say, between two sub-expressions) when used as part of the conditional operator, or a suffix in our case. We have experience with a somewhat symmetrical case, namely tokens that can be either infix or prefix in expressions; these are "+ - / ( [", and I think they don't pose much of a problem, except that we can't rely on ASI when such a token appears at the beginning of a line.

It seems to me that the problems we would meet are the following two cases:

(A) When "?" is followed by a token that can be either prefix or infix ("+ - / ( ["), there is ambiguity, because in

expression ? + expression 

one cannot decide if it means "expression (infix-?) (prefix-+) expression" or "expression (suffix-?) (infix-+) expression".

(B) When "?" appears at the end of the line, we cannot rely on ASI (a situation symmetrical to one of "+ - / ( [" appearing at the beginning of a line).

--

For parsing purposes, when we encounter a "?", maybe one could just look ahead at the next token: if it might be the beginning of an expression, we decree that "?" is part of the conditional operator. We have to determine which tokens can appear at the beginning of an expression to see if this is a reasonable solution.
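The one-token lookahead heuristic suggested above might be sketched as follows. This is only an illustration under stated assumptions: tokens are plain strings, the `EXPRESSION_START` set is simplified and not exhaustive, and the names `canStartExpression` and `classifyQuestion` are made up for this example.

```javascript
// Simplified, non-exhaustive set of punctuators and keywords that can begin
// an expression (an assumption for this sketch, not the full ECMAScript set).
const EXPRESSION_START = new Set([
  "+", "-", "/", "(", "[", "{", "!", "~", "++", "--",
  "typeof", "void", "delete", "new", "function", "this",
]);

function canStartExpression(token) {
  if (token === undefined) return false; // end of input
  if (EXPRESSION_START.has(token)) return true;
  // Identifiers, numeric literals, and string literals also start expressions.
  return /^[A-Za-z_$][\w$]*$/.test(token) || /^\d/.test(token) || /^["']/.test(token);
}

// tokens[i] is assumed to be "?". If the next token could begin an
// expression, decree that this "?" belongs to the conditional operator.
function classifyQuestion(tokens, i) {
  return canStartExpression(tokens[i + 1]) ? "conditional" : "postfix";
}

// Case (A): "expression ? + expression" is decreed to be a conditional.
console.log(classifyQuestion(["a", "?", "+", "b"], 1));
// Inside a pattern, "?" before "," is treated as a postfix marker.
console.log(classifyQuestion(["x", "?", ",", "y"], 1));
```

Under this rule the ambiguous case (A) is resolved by fiat in favour of the conditional operator, so existing programs keep their meaning.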

# Brendan Eich (13 years ago)

Brandon Benvie wrote:

A postfix '?' would require backtracking when the next '}' is found...I think?

First, let me kill the idea of prefix-'?'.

Prefix-'?' in an AssignmentPattern in an AssignmentExpression that is an ExpressionStatement is ambiguous with the '?' in a ConditionalExpression, if the programmer mistakenly relies on ASI as if prefix-'?' were not used. Consider refactoring ES6 code with prefix-'?' from

A = B
{D} = C

L: E

where D is the interior of an object destructuring pattern, to

A = B
?{D} = C

L: E

With postfix-'?' there's no such hazard.

However, naively adding postfix-'?' to the JS grammar does make an ambiguity, a shift/reduce conflict on '?' -- shift to keep parsing a ConditionalExpression, or reduce to a '?'-suffixed pattern in a destructuring assignment.

ES6 draft already splits destructuring binding pattern and assignment pattern subgrammars (binding vs. assignment, i.e. var/let/const in front vs. no binding keyword). We could therefore forbid postfix-'?' in destructuring assignment patterns but not in destructuring binding patterns.

But this is a bit lame, and a refactoring speedbump. I wrote on the board the following argument for destructuring assignment to be supported along with binding, which MarkM found persuasive:

    function* Fib() {
      let [a, b] = [0, 1];      // destructuring binding
      while (true) {
        yield a;
        [a, b] = [b, a + b];    // destructuring assignment
      }
    }

The two destructurings might want postfix-'?' equally in a different function with similar structure.
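For readers who want to run the generator, here is a self-contained version with a small driver. The `take` helper is a hypothetical addition for this example and is not part of the original message. It assumes an ES2015-capable engine.

```javascript
function* Fib() {
  let [a, b] = [0, 1];      // destructuring binding
  while (true) {
    yield a;
    [a, b] = [b, a + b];    // destructuring assignment (swap and advance)
  }
}

// Hypothetical helper: collect the first n values from an iterable.
function take(n, iterable) {
  const out = [];
  for (const value of iterable) {
    if (out.length >= n) break;
    out.push(value);
  }
  return out;
}

// Logs the first eight Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13
console.log(take(8, Fib()));
```

Both destructuring forms operate on the same pair of variables, which is why it would be awkward if a pattern feature were legal in one form but not the other.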

One solution already used in ECMA-262 to regain LR(1) parsing of the standard grammar is lookahead restriction. Observe that with postfix-'?' in patterns, the legal lookahead set is {'=', ':' , ',', '}', ']'}. So we could simply write a lookahead restriction.

We could factor the grammar harder, but this would duplicate most of the expression grammar, akin to how the -NoIn productions duplicate part of that sub-grammar. I say let's use a lookahead restriction and get on with our lives.

Does anyone see a flaw?

# Brendan Eich (13 years ago)

Brendan Eich wrote:

Prefix-'?' in an AssignmentPattern in an AssignmentExpression that is an ExpressionStatement is ambiguous with the '?' in a ConditionalExpression, if the programmer mistakenly relies on ASI as if prefix-'?' were not used. Consider refactoring ES6 code with prefix-'?' from

A = B
{D} = C

L: E

where D is the interior of an object destructuring pattern, to

A = B
?{D} = C

L: E

Oops, C \n L can't parse, so this will error. The difficulty could arise only if a ':' could be associated with the prefix-'?' to make a larger ConditionalExpression. I don't see a sentence where that could happen without an error, so this is less of an issue.

But it still seems like a good reason to kill prefix-'?', given the desire for patterns to be the same in destructuring binding and destructuring assignment.
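The ASI interaction can be observed in current engines, which know nothing about prefix-'?' and therefore always read a leading '?' as the conditional operator. The sketch below assumes an ES2015-capable engine such as Node.js. The variable names are illustrative only.

```javascript
// A line that starts with '?' continues the previous statement: no semicolon
// is inserted after "a = b" because "b ? ..." is a valid continuation.
const body = "var b = true, a;\na = b\n? 'then' : 'else'\nreturn a;";
const result = new Function(body)();
console.log(result); // then

// The refactored shape from the message above. The engine reads the '?' as
// the start of a conditional, finds no matching ':', and rejects the source.
let parseError = null;
try {
  new Function("var a, b, c, d;\na = b\n?{d} = c\n");
} catch (e) {
  parseError = e;
}
console.log(parseError instanceof SyntaxError); // true
```

The second case matches the correction above: the refactored code fails to parse instead of silently changing meaning.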

# Brendan Eich (13 years ago)

Claude Pache wrote:

In case (1), there is no ambiguity with the conditional operator. In case (2), one has to further analyse where a Pattern could appear.

Patterns can appear in formal parameter position, on the left of assignment ('=', not '+=' etc.), on the left of '=' in an initialized binding. I forgot about parameters in my suggested lookahead restriction, but the closing ')' is easy to handle and cannot occur legally after '?' in a ConditionalExpression.

So, lookahead restriction FTW! Or at least, let's get on with our lives. It saves having to duplicate most of the expression grammar a la the hated -NoIn productions.

# Brendan Eich (13 years ago)

Brendan Eich wrote:

One solution already used in ECMA-262 to regain LR(1) parsing of the standard grammar is lookahead restriction. Observe that with postfix-'?' in patterns, the legal lookahead set is {'=', ':' , ',', '}', ']'}. So we could simply write a lookahead restriction.

Amended per the formal parameter case that Claude Pache provoked me to consider:

lookahead not in {'=', ':' , ',', '}', ']', ')'}

after the postfix-'?' optionally following an object or array literal in the main (cover) grammar.
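A minimal sketch of how such a restriction could drive a decision in a hand-written parser, assuming tokens are plain strings. The names `PATTERN_FOLLOW` and `classifyQuestions` are hypothetical. The token set is the one listed above.

```javascript
// Tokens that may legally follow a postfix '?' in a pattern, per the message
// above. None of them can follow the '?' of a ConditionalExpression.
const PATTERN_FOLLOW = new Set(["=", ":", ",", "}", "]", ")"]);

// Label every '?' in a token list using one token of lookahead.
function classifyQuestions(tokens) {
  const result = [];
  tokens.forEach((token, i) => {
    if (token !== "?") return;
    const kind = PATTERN_FOLLOW.has(tokens[i + 1]) ? "postfix" : "conditional";
    result.push({ index: i, kind });
  });
  return result;
}

// "[ a ? , b ] = c ? d : e": the first '?' is a pattern marker, the second
// begins a conditional expression.
const tokens = "[ a ? , b ] = c ? d : e".split(" ");
console.log(classifyQuestions(tokens));
```

Because the two sets of possible next tokens do not overlap, a single token of lookahead is enough to choose between the two readings.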

# Brendan Eich (13 years ago)

Sorry if I was a thread-killer, posting four times in a row.

On balance we have:

prefix-? pros:

  • LR(1) grammar without ambiguity or lookahead restriction.

prefix-? cons:

  • ASI hazard if ? starts an intended destructuring assignment expression.

suffix-? pros:

  • Matches CoffeeScript

suffix-? cons:

  • Requires yet another lookahead restriction on top of LR(1).

I'm ok either way. I made my best case for suffix-?. Thoughts and comments welcome.

# Brendan Eich (13 years ago)

Brendan Eich wrote:

Sorry if I was a thread-killer, posting four times in a row.

On balance we have:

prefix-? pros:

  • LR(1) grammar without ambiguity or lookahead restriction.

Forgot to add the one Andreas cited:

  • Easier to see prefix-? in front of long object or array pattern.

# Oliver Hunt (13 years ago)

On Feb 5, 2013, at 11:00 PM, Brendan Eich <brendan at mozilla.com> wrote:

Sorry if I was a thread-killer, posting four times in a row.

On balance we have:

prefix-? pros:

  • LR(1) grammar without ambiguity or lookahead restriction.

What's the production that takes this out of LL(1)?

prefix-? cons:

  • ASI hazard if ? starts an intended destructuring assignment expression.

That's interestingly icky.

suffix-? pros:

  • Matches CoffeeScript

I don't really consider that a pro, any more than matching any other language. This obsession with coffee script comparisons isn't useful.

I think prefix ? is easier from a reading point of view, but I'm not really married to either.

# Brendan Eich (13 years ago)

Oliver Hunt wrote:

On Feb 5, 2013, at 11:00 PM, Brendan Eich <brendan at mozilla.com> wrote:

Sorry if I was a thread-killer, posting four times in a row.

On balance we have:

prefix-? pros:

  • LR(1) grammar without ambiguity or lookahead restriction.

What's the production that takes this out of LL(1)?

Nothing -- I cited LR(1) because that's the formalism the ECMA-262 spec uses, validated way back in ES3, supposed to be validated again for ES6.

prefix-? cons:

  • ASI hazard if ? starts an intended destructuring assignment expression.

That's interestingly icky.

Yep :-P.

suffix-? pros:

  • Matches CoffeeScript

I don't really consider that a pro, any more than matching any other language. This obsession with coffee script comparisons isn't useful.

Now now. An unweighted dump of pros and cons with one CoffeeScript reference does not make an "obsession". I don't actually think this matters, since we have gone against CoffeeScript with for-of. So, obsess yourself :-P.

I think prefix ? is easier from a reading point of view, but I'm not really married to either.

Agreed, and I posted mainly to try to get to consensus. Prefix-? looks like it is in the lead.

# Russell Leggett (13 years ago)

I think prefix ? is easier from a reading point of view, but I'm not really married to either.

Agreed, and I posted mainly to try to get to consensus. Prefix-? looks like it is in the lead.

I think for the case of a long pattern with the ? outside the {}s, a prefix ? is easier to read. However, I think the reason why CoffeeScript and TypeScript have gone with a suffix is that it is more common and natural given the way ? is already used. To me it screams optional from regular expression languages, but it is also the obvious placement in English and many other written languages for the uninitiated.

The regular expression notation is probably the most compelling reason to me for suffix-?. It is widely used across different regular expression implementations, including ECMAScript's. The regular expression roots have also led to its use in other related ways. For example, many different schema notations use it, like DTDs:

<!ELEMENT ARTICLE (TITLE?, P*)>

Or RELAX NG compact syntax:

element note { text }?

The precedent is more than just CoffeeScript.

# Brendan Eich (13 years ago)

You're right (and take that, olliej :-P).

The other synergy with CoffeeScript is the existential operator in the expression grammar (not just patterns):

if (foo?) ...

And the o?.p and o?m() variants, but these won't make it as-is due to lack of compositionality for ?. and no way to add ?( to ECMA-262 given ?: expressions.
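For readers unfamiliar with CoffeeScript, the behaviour of its existential forms can be approximated in plain JavaScript as below. This is a sketch: the helper names `exists`, `soakGet`, and `soakCall` are hypothetical and only mimic what the CoffeeScript operators do for already-declared values.

```javascript
// foo?  : true when foo is neither null nor undefined.
const exists = (v) => v !== null && v !== undefined;

// o?.p  : property access that yields undefined when o is null or undefined.
const soakGet = (o, key) => (exists(o) ? o[key] : undefined);

// f?()  : call f only when it is actually a function.
const soakCall = (f, ...args) => (typeof f === "function" ? f(...args) : undefined);

console.log(exists(0));                         // true
console.log(soakGet(null, "p"));                // undefined
console.log(soakCall((x) => x + 1, 41));        // 42
```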