Nested Quasis

# Erik Arvidsson (14 years ago)

Under the open issues for Quasi Literals, http://wiki.ecmascript.org/doku.php?id=harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

-- 
erik

# Brendan Eich (14 years ago)

This is really relaxing the (over-restrictive, IMHO) design that says that for ${expr} in a quasi, expr can be only a very few forms such as Identifier, Identifier '.' IdentifierName, and the like -- right?

I.e. if we allow expr to be the grammar's Expression non-terminal, then it follows that quasi-literals nest inside ${...} in outer quasis. +1 on this.

/be

# Erik Arvidsson (14 years ago)

On Sat, Jan 28, 2012 at 16:07, Brendan Eich <brendan at mozilla.org> wrote:

This is really relaxing the (over-restrictive, IMHO) design that says that for ${expr} in a quasi, expr can be only a very few forms such as Identifier, Identifier '.' IdentifierName, and the like -- right?

Yes. Only allowing IdentifierExpression and MemberLookup is too restrictive. A realistic use case is to allow binary operators, and once you allow that it makes sense to allow any expression, which includes other quasi literals, which leads to the conclusion that nested quasi literals should be allowed.

-- 
erik
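
For illustration only, here is what the unrestricted form looks like in the template literal syntax that was later standardized in ECMAScript 2015, which postdates this thread. The variable names and data are hypothetical. The first literal has a full expression in its hole, and that expression contains a nested literal with its own hole; the second shows a binary operator in a hole.

```javascript
// Hypothetical data, purely for illustration.
const items = ["alpha", "beta"];

// The outer literal's hole holds an arbitrary expression, which itself
// contains a nested literal with its own hole.
const list = `<ul>${items.map((item) => `<li>${item}</li>`).join("")}</ul>`;

// A binary operator inside a hole.
const price = 4;
const qty = 3;
const summary = `total: ${price * qty}`;

console.log(list);
console.log(summary);
```

Under the restricted design, both holes would have to be rewritten with temporary variables holding the precomputed values.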

# Mark S. Miller (14 years ago)

On Sat, Jan 28, 2012 at 5:54 PM, Erik Arvidsson <erik.arvidsson at gmail.com> wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them.

+1000. Quasis as originally proposed had no such restriction. The Unicorns example at http://wiki.ecmascript.org/doku.php?id=harmony:quasis#nesting is I think fairly representative of what will become a common kind of use case -- unless we cripple quasis. I would be interested in seeing what this code looks like when refactored to live within this restriction.

In E we have quasis that are somewhat similar and somewhat different. But we make much use of the ability to place arbitrary expressions within the dollar-hole, including nested quasis. I think our quasis as well should allow any expression. The issue is not just nested quasis.

-- 
    Cheers,
    --MarkM

# Waldemar Horwat (14 years ago)

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

 Waldemar
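
A minimal sketch of the objection, assuming holes written as `${...}` so that the delimiter being counted is a curly brace (the same problem arises when counting parentheses). `naiveHoleEnd` is a hypothetical helper, not part of any proposal: it works on raw characters with no knowledge of tokens, so a `}` inside a regexp literal is mistaken for the end of the hole.

```javascript
// Naive character-level scan: find the "}" that closes a hole by counting
// braces, with no knowledge of JavaScript tokens. Hypothetical helper.
function naiveHoleEnd(src, openIndex) {
  let depth = 0;
  for (let i = openIndex; i < src.length; i++) {
    if (src[i] === "{") {
      depth++;
    } else if (src[i] === "}") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1; // unbalanced input
}

// The hole body is the expression /[}]/.test(s). The "}" inside the regexp
// literal is not a closing brace, but the counter cannot know that.
const source = "`a${/[}]/.test(s)}b`";
const open = source.indexOf("{");
const naiveEnd = naiveHoleEnd(source, open);
const realEnd = source.lastIndexOf("}");

console.log("naive hole:", source.slice(open + 1, naiveEnd));
console.log("real hole: ", source.slice(open + 1, realEnd));
```

Telling the two cases apart requires knowing whether a `/` starts a regexp or is a division operator, which in ECMAScript depends on syntactic context.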

# Allen Wirfs-Brock (14 years ago)

On Jan 31, 2012, at 2:36 PM, Waldemar Horwat wrote:

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part. A few [no whitespace here] tokens will probably be needed.

Allen

# Mike Samuel (14 years ago)

2012/1/31 Allen Wirfs-Brock <allen at wirfs-brock.com>:

On Jan 31, 2012, at 2:36 PM, Waldemar Horwat wrote:

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part. A few [no whitespace here] tokens will probably be needed.

I addressed this at js-quasis-libraries-and-repl.googlecode.com/svn/trunk/tokenize.html

# Allen Wirfs-Brock (14 years ago)

On Jan 31, 2012, at 4:06 PM, Mike Samuel wrote:

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part. A few [no whitespace here] tokens will probably be needed.

I addressed this at js-quasis-libraries-and-repl.googlecode.com/svn/trunk/tokenize.html

A more direct link to this from the Quasis ecmascript.org wiki page would be helpful. The only current link goes directly to the demo shell.

Allen

# Waldemar Horwat (14 years ago)

On 01/31/2012 03:04 PM, Allen Wirfs-Brock wrote:

On Jan 31, 2012, at 2:36 PM, Waldemar Horwat wrote:

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part.

I'd love to see this little cleanup. I thought about it for a while and couldn't come up with it myself; I'm not sure it can even be done.

 Waldemar

# Allen Wirfs-Brock (14 years ago)

On Feb 1, 2012, at 11:28 AM, Waldemar Horwat wrote:

On 01/31/2012 03:04 PM, Allen Wirfs-Brock wrote:

On Jan 31, 2012, at 2:36 PM, Waldemar Horwat wrote:

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part.

I'd love to see this little cleanup. I thought about it for a while and couldn't come up with it myself; I'm not sure it can even be done.

Was there some particular issue you were running into?

# Mike Samuel (14 years ago)

2012/2/1 Waldemar Horwat <waldemar at google.com>:

On 01/31/2012 03:04 PM, Allen Wirfs-Brock wrote:

On Jan 31, 2012, at 2:36 PM, Waldemar Horwat wrote:

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part.

I'd love to see this little cleanup. I thought about it for a while and couldn't come up with it myself; I'm not sure it can even be done.

What should I put in the proposal? A delta to the lexical grammar?

# Allen Wirfs-Brock (14 years ago)

On Feb 1, 2012, at 12:12 PM, Mike Samuel wrote:

2012/2/1 Waldemar Horwat <waldemar at google.com>:

On 01/31/2012 03:04 PM, Allen Wirfs-Brock wrote:

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part.

I'd love to see this little cleanup. I thought about it for a while and couldn't come up with it myself; I'm not sure it can even be done.

What should I put in the proposal? A delta to the lexical grammar?

I expect that what we will ultimately end up with is some token additions to the lexical grammar and some new syntactic grammar productions that put those tokens together into complete Quasis. If you want to work on a first cut at those it would be great. Otherwise, I'll need to do the work when I start editing Quasis into the actual specification.

Allen

# Waldemar Horwat (14 years ago)

On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:

On Feb 1, 2012, at 11:28 AM, Waldemar Horwat wrote:

On 01/31/2012 03:04 PM, Allen Wirfs-Brock wrote:

On Jan 31, 2012, at 2:36 PM, Waldemar Horwat wrote:

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part.

I'd love to see this little cleanup. I thought about it for a while and couldn't come up with it myself; I'm not sure it can even be done.

Was there some particular issue you were running into?

Here's one which I couldn't express in a lexer grammar: How to restart the quasi after an included expression is over.

 Waldemar

# Allen Wirfs-Brock (14 years ago)

On Feb 1, 2012, at 5:33 PM, Waldemar Horwat wrote:

On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:

On Feb 1, 2012, at 11:28 AM, Waldemar Horwat wrote:

On 01/31/2012 03:04 PM, Allen Wirfs-Brock wrote:

On Jan 31, 2012, at 2:36 PM, Waldemar Horwat wrote:

On 01/28/2012 02:54 PM, Erik Arvidsson wrote:

Under the open issues for Quasi Literals, harmony:quasis#nesting , the topic of nesting is brought up.

After implementing Quasi Literals in Traceur it is clear that supporting nested quasi literals is easier than not supporting them. What is the argument for not supporting nesting? Can we resolve this?

This has been hashed out in committee before. Do you have a solution to the grammar problems, such as having a full ECMAScript parser inside the lexer? You can't just count parentheses because that breaks regexps.

I would think the solution to this is pretty straightforward. Basically, a Quasi is not a single token. The grammar in the proposal can almost be read that way right now. It should only take a little cleanup to factor it into a pure lexical part and a syntactic part.

I'd love to see this little cleanup. I thought about it for a while and couldn't come up with it myself; I'm not sure it can even be done.

Was there some particular issue you were running into?

Here's one which I couldn't express in a lexer grammar: How to restart the quasi after an included expression is over.

I wouldn't, because I wouldn't produce the complete quasi as a single token. I would leave it up to the syntactic grammar to assemble the quasi pieces and inclusion expressions into a complete unit.

Allen

# Douglas Crockford (14 years ago)

On 11:59 AM, Waldemar Horwat wrote:

Here's one which I couldn't express in a lexer grammar: How to restart the quasi after an included expression is over.

If quasis are not nested, then the lexical rule is really simple: Just match the `s, and within the literal, match the {}s.

I would prefer to keep it simple, unless there is a compelling requirement to provide nesting. If we do the simple version now, we could allow the nested case in the future.
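
A rough sketch of the simple rule for non-nested quasis, assuming `${...}` holes and ignoring escape sequences entirely. `lexFlatQuasi` is a hypothetical name. It first matches the backticks, then matches the braces inside the literal; the same function rejects nested input, because the first inner backtick is taken as the end of the literal.

```javascript
// Hypothetical helper: lex one non-nested quasi starting at src[start].
// Assumption: no escape sequences, and no "`" or "}" inside a hole.
function lexFlatQuasi(src, start) {
  if (src[start] !== "`") throw new Error("expected backtick");
  const end = src.indexOf("`", start + 1); // step 1: match the backticks
  if (end === -1) throw new Error("unterminated quasi");
  const body = src.slice(start + 1, end);
  const chunks = []; // literal text between holes
  const holes = [];  // raw source text of each hole
  let pos = 0;
  for (;;) {                               // step 2: match the {}s inside
    const open = body.indexOf("${", pos);
    if (open === -1) break;
    const close = body.indexOf("}", open);
    if (close === -1) throw new Error("unterminated hole");
    chunks.push(body.slice(pos, open));
    holes.push(body.slice(open + 2, close));
    pos = close + 1;
  }
  chunks.push(body.slice(pos));
  return { chunks, holes, end };
}

const flat = "`Hello ${name}, you are ${user.age}!`";
const result = lexFlatQuasi(flat, 0);
console.log(result.chunks, result.holes);

// Nested input: the inner backtick ends the scan early and the hole is
// left unterminated, so this simple rule cannot accept it.
const nested = "`<ul>${items.map(i => `<li>${i}</li>`)}</ul>`";
let nestedError = null;
try {
  lexFlatQuasi(nested, 0);
} catch (e) {
  nestedError = e.message;
}
console.log(nestedError);
```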

# Mark S. Miller (14 years ago)

On Thu, Feb 2, 2012 at 5:09 AM, Douglas Crockford <douglas at crockford.com> wrote:

On 11:59 AM, Waldemar Horwat wrote:

Here's one which I couldn't express in a lexer grammar: How to restart the quasi after an included expression is over.

If quasis are not nested, then the lexical rule is really simple: Just match the `s, and within the literal, match the {}s.

I would prefer to keep it simple, unless there is a compelling requirement to provide nesting. If we do the simple version now, we could allow the nested case in the future.

When we came up with this "simplification", I thought I could live with it. Now, having tried to write some examples within these restrictions, I find it unusable.

I think we're overestimating the parsing difficulty. I'll let Mike speak for the real plan. But I'd like to explain what I do in E, so that we can see that none of this need be complicated. It does involve an interaction between the parsing and lexing levels, but one much less complex than you may expect, and comparable to (IMO less than) the existing unclean interaction that JS already has:

Lexing grammar has four new token types.

QuasiOnly ::

    ` QuasiChar* `

QuasiOpen ::

    ` QuasiChar* $

QuasiMiddle ::

    QuasiChar* $

QuasiEnd ::

    QuasiChar* `

Parsing grammar:

quasiExpr :

    Identifier? quasiExprLiteral

quasiExprLiteral :

    QuasiOnly

    QuasiOpen quasiHole (QuasiMiddle quasiHole)* QuasiEnd

quasiHole :

    Identifier

    curlyBalancedTokenSequence

curlyBalancedTokenSequence :

    { expr }

The key thing is that the curlyBalancedTokenSequence starts a normal lexical expression context and counts curlies. When it sees a "}" token that matches its opening "{", the curlyBalancedTokenSequence is done, and we proceed to continue lexing QuasiChar* until we've lexed a QuasiMiddle or QuasiEnd.

Of course, if you don't need to keep your parser and lexer so strongly separated, you can just use the above grammar directly as a one-level grammar, where you use the full expression parser after the "{". This is what I did the first time in E. Either way works. The reason I changed to the looser coupling is so that I could fully lex a program that didn't parse, so I could give more informative error messages.
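
The looser coupling can be sketched as a toy lexer. This is not the E implementation, only an illustration under heavy assumptions: ordinary tokens are reduced to words, digits and single-character punctuators; strings, comments, regexps and escapes are omitted; only `${...}` holes are handled, not bare `$Identifier` holes. The function name `lex`, the token shapes and the `holeDepths` stack are all hypothetical. The point is that curlies are counted on tokens inside a hole, and the matching "}" token switches the lexer back to reading QuasiChar*.

```javascript
// Toy lexer sketch for QuasiOnly / QuasiOpen / QuasiMiddle / QuasiEnd.
function lex(src) {
  const tokens = [];
  const holeDepths = []; // one open-curly counter per hole being lexed
  let i = 0;

  // Lex QuasiChar* from position i. `opening` is true right after a "`".
  function lexQuasiText(opening) {
    let text = "";
    while (i < src.length) {
      const c = src[i];
      if (c === "`") {
        i++;
        tokens.push({ type: opening ? "QuasiOnly" : "QuasiEnd", text });
        return;
      }
      if (c === "$" && src[i + 1] === "{") {
        i++; // consume "$"; the "{" is left for the ordinary lexer
        tokens.push({ type: opening ? "QuasiOpen" : "QuasiMiddle", text });
        holeDepths.push(0);
        return;
      }
      text += c;
      i++;
    }
    throw new Error("unterminated quasi");
  }

  while (i < src.length) {
    const c = src[i];
    if (/\s/.test(c)) { i++; continue; }
    if (c === "`") { i++; lexQuasiText(true); continue; }
    const word = /^[A-Za-z_$][\w$]*|^\d+/.exec(src.slice(i));
    if (word) {
      tokens.push({ type: "Word", text: word[0] });
      i += word[0].length;
      continue;
    }
    tokens.push({ type: "Punct", text: c });
    i++;
    if (holeDepths.length > 0) {
      const top = holeDepths.length - 1;
      if (c === "{") {
        holeDepths[top]++;
      } else if (c === "}" && --holeDepths[top] === 0) {
        holeDepths.pop();    // this "}" token matched the hole's opening "{"
        lexQuasiText(false); // resume lexing QuasiChar*
      }
    }
  }
  return tokens;
}

const types = (src) => lex(src).map((t) => t.type).join(",");
console.log(types("`hi`"));
console.log(types("`x${a}y${b}z`"));
console.log(types("`a${ f({x: `b${y}`}) }c`"));
```

Because the counter lives on a stack, a quasi nested inside a hole gets its own counter, and the object literal's braces in the last example do not end the outer hole early.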

On Thu, Feb 2, 2012 at 5:09 AM, Douglas Crockford <douglas at crockford.com> wrote:

> On 11:59 AM, Waldemar Horwat wrote:
>
>> On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:
>> Here's one which I couldn't express in a lexer grammar: How to restart
>> the quasi after an included expression is over.
>>
>
> If quasis are not nested, then the lexical rule is really simple: Just
> match the `s, and within the literal, match the {}s.
>
> I would prefer to keep it simple, unless there is a compelling requirement
> to provide nesting. If we do the simple version now, we could allow the
> nested case in the future.
>

When we came up with this "simplification", I thought I could live with it.
Now, having tried to write some examples within these restrictions, I find
it unusable.

I think we're overestimating the parsing difficulty. I'll let Mike speak
for the real plan. But I'd like to explain what I do in E, so that we can
see that none of this need be complicated. It does involve an interaction
between the parsing and lexing levels, but much less complex than you may
expect, and comparable to (IMO less than) the existing unclean interaction
that JS already has:

Lexing grammar has four new token types.

    QuasiOnly ::

        ` QuasiChar* `

    QuasiOpen ::

        ` QuasiChar* $

    QuasiMiddle ::

        QuasiChar*

    QuasiEnd ::

        QuasiChar `

Parsing grammar:

    quasiExpr :

        Identifier? quasiExprLiteral

    quasiExprLiteral :

        QuasiOnly

        QuasiOpen quasiHole (QuasiMiddle quasiHole)* QuasiClose

    quasiHole :

        Identifier

        curlyBalancedTokenSequence

    curlyBalancedTokenSequence :

        { expr }

The key thing is that the curlyBalancedTokenSequence starts a normal
lexical expression context and counts curlies. When it sees a "}" token
that matches its opening "{", the curlyBalancedTokenSequence is done, and
we proceed to continue lexing QuasiChar* until we've lexed a QuasiMiddle or
QuasiEnd.

Of course, if you don't need to keep your parser and lexer so strongly
separated, you can just use the above grammar directly as a one-level
grammar, where you use the full expression parser after the "{". This is
what I did the first time in E. Either way works. The reason I changed to
the looser coupling is so that I could fully lex a program that didn't
parse, so I could give more informative error messages.

-- 
    Cheers,
    --MarkM

# Mark S. Miller (14 years ago)

On Thu, Feb 2, 2012 at 11:03 AM, Mark S. Miller <erights at google.com> wrote:

Of course, if you don't need to keep your parser and lexer so strongly separated, you can just use the above grammar directly as a one-level grammar, where you use the full expression parser after the "{". This is what I did the first time in E. Either way works. The reason I changed to the looser coupling is so that I could fully lex a program that didn't parse, so I could give more informative error messages.

This loose coupling is also exactly what we want for syntax highlighting. Syntax highlighting mainly (always?) distinguishes lexical categories, so we want it to be accurate for a program with only parse errors.

On Thu, Feb 2, 2012 at 11:03 AM, Mark S. Miller <erights at google.com> wrote:

>
> Of course, if you don't need to keep your parser and lexer so strongly
> separated, you can just use the above grammar directly as a one-level
> grammar, where you use the full expression parser after the "{". This is
> what I did the first time in E. Either way works. The reason I changed to
> the looser coupling is so that I could fully lex a program that didn't
> parse, so I could give more informative error messages.
>

This loose coupling is also exactly what we want for syntax highlighting.
Syntax highlighting mainly (always?) distinguishes lexical categories, so
we want it to be accurate for a program with only parse errors.

-- 
    Cheers,
    --MarkM

# Waldemar Horwat (14 years ago)

On 02/02/2012 11:03 AM, Mark S. Miller wrote:

On Thu, Feb 2, 2012 at 5:09 AM, Douglas Crockford <douglas at crockford.com> wrote:
On 11:59 AM, Waldemar Horwat wrote:

    On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:
    Here's one which I couldn't express in a lexer grammar: How to restart
    the quasi after an included expression is over.


If quasis are not nested, then the lexical rule is really simple: Just match the `s, and within the literal, match the {}s.

I would prefer to keep it simple, unless there is a compelling requirement to provide nesting. If we do the simple version now, we could allow the nested case in the future.
When we came up with this "simplification", I thought I could live with it. Now, having tried to write some examples within these restrictions, I find it unusable.

I think we're overestimating the parsing difficulty. I'll let Mike speak for the real plan. But I'd like to explain what I do in E, so that we can see that none of this need be complicated. It does involve an interaction between the parsing and lexing levels, but much less complex than you may expect, and comparable (IMO less) than the existing unclean interaction that JS already has:

Lexing grammar has four new token types.
 QuasiOnly ::

     ` QuasiChar* `

 QuasiOpen ::

     ` QuasiChar* $

 QuasiMiddle ::

     QuasiChar*

 QuasiEnd ::

     QuasiChar `

(presumably you forgot a * in QuasiEnd?)

That's not a valid lexer grammar. The input

if

is now ambiguous -- it can lex as either a keyword or a QuasiMiddle. The input

3+`

will now lex as QuasiEnd, which may or may not be what you want.

 Waldemar

On 02/02/2012 11:03 AM, Mark S. Miller wrote:
> On Thu, Feb 2, 2012 at 5:09 AM, Douglas Crockford <douglas at crockford.com <mailto:douglas at crockford.com>> wrote:
>
>     On 11:59 AM, Waldemar Horwat wrote:
>
>         On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:
>         Here's one which I couldn't express in a lexer grammar: How to restart
>         the quasi after an included expression is over.
>
>
>     If quasis are not nested, then the lexical rule is really simple: Just match the `s, and within the literal, match the {}s.
>
>     I would prefer to keep it simple, unless there is a compelling requirement to provide nesting. If we do the simple version now, we could allow the nested case in the future.
>
>
> When we came up with this "simplification", I thought I could live with it. Now, having tried to write some examples within these restrictions, I find it unusable.
>
> I think we're overestimating the parsing difficulty. I'll let Mike speak for the real plan. But I'd like to explain what I do in E, so that we can see that none of this need be complicated. It does involve an interaction between the parsing and lexing levels, but much less complex than you may expect, and comparable (IMO less) than the existing unclean interaction that JS already has:
>
> Lexing grammar has four new token types.
>
>      QuasiOnly ::
>
>          ` QuasiChar* `
>
>      QuasiOpen ::
>
>          ` QuasiChar* $
>
>      QuasiMiddle ::
>
>          QuasiChar*
>
>      QuasiEnd ::
>
>          QuasiChar `
(presumably you forgot a * in QuasiEnd?)

That's not a valid lexer grammar.  The input

   if

is now ambiguous -- it can lex as either a keyword or a QuasiMiddle.  The input

   3+`

will now lex as QuasiEnd, which may or may not be what you want.

     Waldemar

# Mark S. Miller (14 years ago)

On Thu, Feb 2, 2012 at 11:27 AM, Waldemar Horwat <waldemar at google.com>wrote:

On 02/02/2012 11:03 AM, Mark S. Miller wrote:

On Thu, Feb 2, 2012 at 5:09 AM, Douglas Crockford <douglas at crockford.com> wrote:

On 11:59 AM, Waldemar Horwat wrote:
   On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:
   Here's one which I couldn't express in a lexer grammar: How to
restart the quasi after an included expression is over.

If quasis are not nested, then the lexical rule is really simple: Just match the `s, and within the literal, match the {}s.

I would prefer to keep it simple, unless there is a compelling requirement to provide nesting. If we do the simple version now, we could allow the nested case in the future.

When we came up with this "simplification", I thought I could live with it. Now, having tried to write some examples within these restrictions, I find it unusable.

I think we're overestimating the parsing difficulty. I'll let Mike speak for the real plan. But I'd like to explain what I do in E, so that we can see that none of this need be complicated. It does involve an interaction between the parsing and lexing levels, but much less complex than you may expect, and comparable (IMO less) than the existing unclean interaction that JS already has:

Lexing grammar has four new token types.
QuasiOnly ::

    ` QuasiChar* `

QuasiOpen ::

    ` QuasiChar* $

QuasiMiddle ::

    QuasiChar*

QuasiEnd ::

    QuasiChar `
(presumably you forgot a * in QuasiEnd?)

y. I also messed up one more thing:

 QuasiMiddle ::

     QuasiChar* $

Sorry for the confusion.

That's not a valid lexer grammar.

I didn't explain well enough. QuasiMiddle and QuasiEnd apply only after a quasiHole, and they apply immediately after a quasiHole. That's the complexity I was referring to: it introduces yet another lexing context, and the determination about whether we're in that lexing context demands counting curlies -- which a regular expression can't do.

The input

if

is now ambiguous -- it can lex as either a keyword or a QuasiMiddle.

If it occurs immediately after a quasiHole, then it is a QuasiMiddle or QuasiEnd, depending on whether it is terminated by a $ or `. (See correction above).

The input

3+`

will now lex as QuasiEnd, which may or may not be what you want.

Only if after a quasiHole.

On Thu, Feb 2, 2012 at 11:27 AM, Waldemar Horwat <waldemar at google.com>wrote:

> On 02/02/2012 11:03 AM, Mark S. Miller wrote:
>
>> On Thu, Feb 2, 2012 at 5:09 AM, Douglas Crockford <douglas at crockford.com> wrote:
>>
>>    On 11:59 AM, Waldemar Horwat wrote:
>>
>>        On 02/01/2012 11:35 AM, Allen Wirfs-Brock wrote:
>>        Here's one which I couldn't express in a lexer grammar: How to
>> restart
>>        the quasi after an included expression is over.
>>
>>
>>    If quasis are not nested, then the lexical rule is really simple: Just
>> match the `s, and within the literal, match the {}s.
>>
>>    I would prefer to keep it simple, unless there is a compelling
>> requirement to provide nesting. If we do the simple version now, we could
>> allow the nested case in the future.
>>
>>
>> When we came up with this "simplification", I thought I could live with
>> it. Now, having tried to write some examples within these restrictions, I
>> find it unusable.
>>
>> I think we're overestimating the parsing difficulty. I'll let Mike speak
>> for the real plan. But I'd like to explain what I do in E, so that we can
>> see that none of this need be complicated. It does involve an interaction
>> between the parsing and lexing levels, but much less complex than you may
>> expect, and comparable (IMO less) than the existing unclean interaction
>> that JS already has:
>>
>> Lexing grammar has four new token types.
>>
>>     QuasiOnly ::
>>
>>         ` QuasiChar* `
>>
>>     QuasiOpen ::
>>
>>         ` QuasiChar* $
>>
>>     QuasiMiddle ::
>>
>>         QuasiChar*
>>
>>     QuasiEnd ::
>>
>>         QuasiChar `
>>
> (presumably you forgot a * in QuasiEnd?)
>

y. I also messed up one more thing:


     QuasiMiddle ::

         QuasiChar* $

Sorry for the confusion.



> That's not a valid lexer grammar.


I didn't explain well enough. QuasiMiddle and QuasiEnd apply only after a
quasiHole, and they apply immediately after a quasiHole. That's the
complexity I was referring to: it introduces yet another lexing context,
and the determination about whether we're in that lexing context demands
counting curlies -- which a regular expression can't do.



>  The input
>
>  if
>
> is now ambiguous -- it can lex as either a keyword or a QuasiMiddle.


If it occurs immediately after a quasiHole, then it is a QuasiMiddle or
QuasiEnd, depending on whether it is terminated by a $ or `. (See
correction above).



>  The input
>
>  3+`
>
> will now lex as QuasiEnd, which may or may not be what you want.


Only if after a quasiHole.


>
>
>    Waldemar
>



-- 
    Cheers,
    --MarkM

# John Tamplin (14 years ago)

I think this could take the same approach as Dart in dealing with embedded expressions - code.google.com/p/dart/source/browse/branches/bleeding_edge/dart/compiler/java/com/google/dart/compiler/parser/DartScanner.java?r=1805#898

Basically, the scanner returns token sequences like:

    "foo" => STRING(foo)
    "foo $bar baz" => STRING_SEGMENT(foo ) STRING_EMBED_EXPR_START IDENTIFIER(bar) STRING_EMBED_EXPR_END

which the parser then handles normally, with rules like:

    string-expression : STRING | string-interpolation ;
    string-interpolation : ( STRING_SEGMENT? embedded-exp? )* STRING_LAST_SEGMENT ;   // a simplification
    embedded-exp : STRING_EMBED_EXP_START expression STRING_EMBED_EXP_END ;

I don't know if this would cause ambiguities in the JS grammar, or if it would have other issues applying it to quasis in JS, but it keeps a clean separation between the scanner and parser (it does require some additional state in the scanner, since these can be nested).
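As a rough illustration of the token sequences described above, here is a minimal JavaScript sketch of a scanner for the `$identifier` form only. It is not Dart's scanner: the function name `scanInterpolatedString`, the token object shape, and the ASCII-only identifier rule are assumptions made for this example, and the `${...}` form (which needs the extra nesting state mentioned in the message) is left out. The sketch also emits a final STRING_LAST_SEGMENT token, as the grammar rule requires.

```javascript
// Hypothetical sketch: scan a double-quoted literal with $identifier holes
// into the token kinds named in the message above.
function scanInterpolatedString(literal) {
  if (literal.length < 2 || literal[0] !== '"' || literal[literal.length - 1] !== '"') {
    throw new SyntaxError("expected a double-quoted literal");
  }
  const body = literal.slice(1, -1);
  const tokens = [];
  let segment = ""; // literal text collected since the last hole
  let i = 0;
  while (i < body.length) {
    // A hole starts only when "$" is immediately followed by an identifier.
    const idMatch =
      body[i] === "$" ? /^[A-Za-z_][A-Za-z0-9_]*/.exec(body.slice(i + 1)) : null;
    if (idMatch) {
      tokens.push({ type: "STRING_SEGMENT", text: segment });
      tokens.push({ type: "STRING_EMBED_EXPR_START" });
      tokens.push({ type: "IDENTIFIER", text: idMatch[0] });
      tokens.push({ type: "STRING_EMBED_EXPR_END" });
      segment = "";
      i += 1 + idMatch[0].length;
    } else {
      segment += body[i];
      i += 1;
    }
  }
  if (tokens.length === 0) {
    // No holes at all: a plain STRING token.
    return [{ type: "STRING", text: segment }];
  }
  tokens.push({ type: "STRING_LAST_SEGMENT", text: segment });
  return tokens;
}
```

With this sketch, `scanInterpolatedString('"foo $bar baz"')` yields STRING_SEGMENT("foo "), STRING_EMBED_EXPR_START, IDENTIFIER("bar"), STRING_EMBED_EXPR_END, STRING_LAST_SEGMENT(" baz"), which a parser can consume without knowing anything about the scanner's internal state.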

I think this could take the same approach as Dart in dealing with embedded
expressions  -
http://code.google.com/p/dart/source/browse/branches/bleeding_edge/dart/compiler/java/com/google/dart/compiler/parser/DartScanner.java?r=1805#898

Basically, the scanner returns token sequences like:

"foo" => STRING(foo)
"foo $bar baz" => STRING_SEGMENT(foo ) STRING_EMBED_EXPR_START
IDENTIFIER(bar) STRING_EMBED_EXPR_END

which the parser then handles normally, with rules like:

string-expression : STRING  | string-interpolation ;
string-interpolation : ( STRING_SEGMENT? embedded-exp? )*
STRING_LAST_SEGMENT ;   // a simplification
embedded-exp : STRING_EMBED_EXP_START expression STRING_EMBED_EXP_END ;

I don't know if this would cause ambiguities in the JS grammar, or if it
would have other issues applying it to quasis in JS, but it keeps a clean
separation between the scanner and parser (it does require some additional
state in the scanner, since these can be nested).

-- 
John A. Tamplin
Software Engineer (GWT), Google

# Waldemar Horwat (14 years ago)

OK. This introduces yet another lexing context, in which all productions except QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny. That works if the expressions must be one of the two forms:

    $id
    ${expr}

Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?

 Waldemar

OK.  This introduces yet another lexing context, in which all productions *except* QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny.  That works if the expressions must be one of the two forms:

$id
${expr}

Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?

     Waldemar

# Mark S. Miller (14 years ago)

On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com> wrote:

OK. This introduces yet another lexing context, in which all productions except QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny. That works if the expressions must be one of the two forms:

    $id
    ${expr}

Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?

I'll let Mike speak for the details of what he really wants to propose. But here are the answers from E:

escapes with the quasi literal text are taken care of by the QuasiChar production, much like the existing definition of DoubleStringCharacter:

QuasiChar ::
    SourceCharacter but not one of $ or `
    $ $
    $ `
    $ \ EscapeSequence

So that `$$` === "$", `$`` === "`", and `$\n` === "\n", respectively.

Regarding `...$id.id...` and `...$id[expr]...`, only the first id in each case is in the quasiHole. All the text afterwards is part of the QuasiClose.
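The QuasiChar escapes above can be illustrated with a small JavaScript sketch that decodes the raw text of one quasi segment. The helper name `cookQuasiText` and the short table of supported escapes are assumptions of this example, not part of E or of any proposal. A `$` followed by anything other than `$`, a backtick, or a backslash is rejected here, because the production as quoted does not list it; that strictness is a choice of the sketch.

```javascript
// Hypothetical helper: decode raw quasi text using the QuasiChar rules quoted
// above. Only a few single-character escapes are supported (an assumption).
const SIMPLE_ESCAPES = new Map([
  ["n", "\n"],
  ["t", "\t"],
  ["r", "\r"],
  ["\\", "\\"],
]);

function cookQuasiText(raw) {
  let out = "";
  for (let i = 0; i < raw.length; i++) {
    const ch = raw[i];
    if (ch !== "$") {
      out += ch; // ordinary SourceCharacter
      continue;
    }
    const next = raw[i + 1];
    if (next === "$" || next === "`") {
      out += next; // "$$" stands for "$", "$`" stands for "`"
      i += 1;
    } else if (next === "\\" && i + 2 < raw.length) {
      const esc = raw[i + 2]; // "$\n" style escape
      out += SIMPLE_ESCAPES.has(esc) ? SIMPLE_ESCAPES.get(esc) : esc;
      i += 2;
    } else {
      throw new SyntaxError("bare $ at index " + i + " is not a QuasiChar");
    }
  }
  return out;
}
```

For example, `cookQuasiText("a$$b")` returns `"a$b"`, and `cookQuasiText("$\\n")` (the three characters `$`, `\`, `n`) returns a single newline character.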

On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com> wrote:

> OK.  This introduces yet another lexing context, in which all productions
> *except* QuasiMiddle and QuasiEnd are disallowed, and white space and
> comment handling is funny.  That works if the expressions must be one of
> the two forms:
>
> $id
> ${expr}
>
> Is that the exhaustive list, or are we looking at other forms such as $$,
> $id.id, $id[expr], etc.?
>


I'll let Mike speak for the details of what he really wants to propose. But
here are the answers from E:

escapes with the quasi literal text are taken care of by the QuasiChar
production, much like the existing definition of DoubleStringCharacter:

    QuasiChar ::
        SourceCharacter but not one of $ or `
        $ $
        $ `
        $ \ EscapeSequence

So that `$$` === "$", `$`` === "`", and `$\n` === "\n", respectively.

Regarding `...$id.id...` and `...$id[expr]...`, only the first id in each
case is in the quasiHole. All the text afterwards is part of the QuasiClose.

-- 
    Cheers,
    --MarkM

# Mark S. Miller (14 years ago)

On Thu, Feb 2, 2012 at 4:15 PM, Mark S. Miller <erights at google.com> wrote:

On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com>wrote:

OK. This introduces yet another lexing context, in which all productions except QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny. That works if the expressions must be one of the two forms:

    $id
    ${expr}

Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?

I'll let Mike speak for the details of what he really wants to propose. But here are the answers from E:

escapes with the quasi literal text are taken care of by the QuasiChar production, much like the existing definition of

escapes within ...

On Thu, Feb 2, 2012 at 4:15 PM, Mark S. Miller <erights at google.com> wrote:

>
>
> On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com>wrote:
>
>> OK.  This introduces yet another lexing context, in which all productions
>> *except* QuasiMiddle and QuasiEnd are disallowed, and white space and
>> comment handling is funny.  That works if the expressions must be one of
>> the two forms:
>>
>> $id
>> ${expr}
>>
>> Is that the exhaustive list, or are we looking at other forms such as $$,
>> $id.id, $id[expr], etc.?
>>
>
>
> I'll let Mike speak for the details of what he really wants to propose.
> But here are the answers from E:
>
> escapes with the quasi literal text are taken care of by the QuasiChar
> production, much like the existing definition of
>

escapes *within* ...


> DoubleStringCharacter:
>
>     QuasiChar ::
>         SourceCharacter but not one of $ or `
>         $ $
>         $ `
>         $ \ EscapeSequence
>
> So that `$$` === "$", `$`` === "`", and `$\n` === "\n", respectively.
>
> Regarding `...$id.id...` and `...$id[expr]...`, only the first id in each
> case is in the quasiHole. All the text afterwards is part of the QuasiClose.
>
> --
>     Cheers,
>     --MarkM
>



-- 
    Cheers,
    --MarkM

# Waldemar Horwat (14 years ago)

On 02/02/2012 04:15 PM, Mark S. Miller wrote:

On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com> wrote:
OK.  This introduces yet another lexing context, in which all productions *except* QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny.  That works if the expressions must be one of the two forms:

$id
${expr}

Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?
I'll let Mike speak for the details of what he really wants to propose. But here are the answers from E:

escapes with the quasi literal text are taken care of by the QuasiChar production, much like the existing definition of DoubleStringCharacter:
 QuasiChar ::
     SourceCharacter but not one of $ or `
     $ $
     $ `
     $ \ EscapeSequence
So that `$$` === "$", `$`` === "`", and `$\n` === "\n", respectively.

Regarding `...$id.id...` and `...$id[expr]...`, only the first id in each case is in the quasiHole. All the text afterwards is part of the QuasiClose.

Good. I'll have to think about this a bit more, but there's a chance you converted me.

 Waldemar

On 02/02/2012 04:15 PM, Mark S. Miller wrote:
>
>
> On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com>> wrote:
>
>     OK.  This introduces yet another lexing context, in which all productions *except* QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny.  That works if the expressions must be one of the two forms:
>
>     $id
>     ${expr}
>
>     Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?
>
>
>
> I'll let Mike speak for the details of what he really wants to propose. But here are the answers from E:
>
> escapes with the quasi literal text are taken care of by the QuasiChar production, much like the existing definition of DoubleStringCharacter:
>
>      QuasiChar ::
>          SourceCharacter but not one of $ or `
>          $ $
>          $ `
>          $ \ EscapeSequence
>
> So that `$$` === "$", `$`` === "`", and `$\n` === "\n", respectively.
>
> Regarding `...$id.id...` and `...$id[expr]...`, only the first id in each case is in the quasiHole. All the text afterwards is part of the QuasiClose.

Good.  I'll have to think about this a bit more, but there's a chance you converted me.

     Waldemar

# Waldemar Horwat (14 years ago)

On 02/02/2012 06:27 PM, Waldemar Horwat wrote:

On 02/02/2012 04:15 PM, Mark S. Miller wrote:

On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com> wrote:

OK. This introduces yet another lexing context, in which all productions except QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny. That works if the expressions must be one of the two forms:

    $id
    ${expr}

Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?

I'll let Mike speak for the details of what he really wants to propose. But here are the answers from E:

escapes with the quasi literal text are taken care of by the QuasiChar production, much like the existing definition of DoubleStringCharacter:

QuasiChar ::
    SourceCharacter but not one of $ or `
    $ $
    $ `
    $ \ EscapeSequence

So that `$$` === "$", `$`` === "`", and `$\n` === "\n", respectively.

Regarding `...$id.id...` and `...$id[expr]...`, only the first id in each case is in the quasiHole. All the text afterwards is part of the QuasiClose.

Good. I'll have to think about this a bit more, but there's a chance you converted me.

Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp. Here comments and white space are also affected, which can turn the structure of the lexer upside down. The kinds of cases I'm thinking of are:

`abc$/*comment*/identifier//
`
(here we have a /**/ comment and a // comment)

`abc$/**/{/**//re//**/}/**/def`
vs:
`abc$/**/{/**//re//**/}/*def`
(in the former all four "/**/"'s are comments. Not sure what the latter would do.)

`abc$id def`
`abc$ id def`
(the lexer removes spaces before all tokens, so the quasi would not contain a space before the "def")

 Waldemar

On 02/02/2012 06:27 PM, Waldemar Horwat wrote:
> On 02/02/2012 04:15 PM, Mark S. Miller wrote:
>>
>>
>> On Thu, Feb 2, 2012 at 2:00 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com>> wrote:
>>
>> OK. This introduces yet another lexing context, in which all productions *except* QuasiMiddle and QuasiEnd are disallowed, and white space and comment handling is funny. That works if the expressions must be one of the two forms:
>>
>> $id
>> ${expr}
>>
>> Is that the exhaustive list, or are we looking at other forms such as $$, $id.id, $id[expr], etc.?
>>
>>
>>
>> I'll let Mike speak for the details of what he really wants to propose. But here are the answers from E:
>>
>> escapes with the quasi literal text are taken care of by the QuasiChar production, much like the existing definition of DoubleStringCharacter:
>>
>> QuasiChar ::
>> SourceCharacter but not one of $ or `
>> $ $
>> $ `
>> $ \ EscapeSequence
>>
>> So that `$$` === "$", `$`` === "`", and `$\n` === "\n", respectively.
>>
>> Regarding `...$id.id...` and `...$id[expr]...`, only the first id in each case is in the quasiHole. All the text afterwards is part of the QuasiClose.
>
> Good. I'll have to think about this a bit more, but there's a chance you converted me.

Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp.  Here comments and white space are also affected, which can turn the structure of the lexer upside down.  The kinds of cases I'm thinking of are:

`abc$/*comment*/identifier//
`
(here we have a /**/ comment and a // comment)

`abc$/**/{/**//re//**/}/**/def`
vs:
`abc$/**/{/**//re//**/}/*def`
(in the former all four "/**/"'s are comments.  Not sure what the latter would do.)

`abc$id def`
`abc$ id def`
(the lexer removes spaces before all tokens, so the quasi would not contain a space before the "def")

     Waldemar

# Mark S. Miller (14 years ago)

On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com>wrote:

On 02/02/2012 06:27 PM, Waldemar Horwat wrote:

[...]

Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp. Here comments and white space are also affected, which can turn the structure of the lexer upside down. The kinds of cases I'm thinking of are:

`abc$/*comment*/identifier//
`
(here we have a /**/ comment and a // comment)

There is no valid quasiHole above, so the whole thing matches a QuasiOnly. The QuasiOnly includes all characters between the backticks. Nothing is taken to be a comment, just like it wouldn't be if it appeared within a string.

`abc$/**/{/**//re//**/}/**/def`
vs:
`abc$/**/{/**//re//**/}/*def`
(in the former all four "/**/"'s are comments. Not sure what the latter would do.)

Same thing. There is no valid quasiHole here.

`abc$id def`
`abc$ id def`
(the lexer removes spaces before all tokens, so the quasi would not contain a space before the "def")

The first has a valid quasiHole, and so would parse as QuasiOpen("abc"), Identifier("id"), QuasiClose(" def"). (Note space captured in the QuasiClose text.)

The second has no valid quasiHole, and so the whole thing would again parse as a QuasiOnly.

I think of, for example, QuasiMiddle as being much like DoubleStringChars. Once you're lexing that, all spaces are significant. As a lexing context, I don't really see how quasis are weirder than strings.

However, from your example, I think I see what you're getting at. I forgot to state that a quasiExpr is only started if a QuasiOpen or QuasiMiddle ends with an (unescaped by previous $\ ) $ followed immediately, with no intervening characters, by either an Identifier or a "{". I don't see this as weirder than having a string terminate by (\ ") but not by (\"). The " is only processed specially if it comes immediately after an (unescaped by previous \ ) \ .

Similarly, you go back into quasi context immediately following the identifier or matching } respectively, i.e., exactly when the quasiHole production is over.
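The rules stated in this message can be sketched as a small JavaScript lexer. This is an illustrative sketch only, not E's lexer and not the proposal's: the names `lexQuasi` and `findMatchingCurly`, the `Hole` token with its raw `source` field, and the ASCII-only identifier rule are all assumptions of the example. It uses the name QuasiEnd from the lexing grammar (the parsing grammar in this thread calls it QuasiClose). A hole starts only when an unescaped `$` is immediately followed by an identifier or `{`, the quasi context resumes immediately after the identifier or the matching `}`, and nested quasis inside a `{...}` hole are handled by recursion. Comments and regular expression literals inside a hole are not handled, so the harder cases raised earlier in the thread are out of scope.

```javascript
const ID_START = /[A-Za-z_]/;
const ID_PART = /[A-Za-z0-9_]/;

// Lex one quasi beginning at src[start] === "`".
// Returns { tokens, end }, where end is the index just past the closing backtick.
function lexQuasi(src, start = 0) {
  if (src[start] !== "`") throw new SyntaxError("quasi must start with a backtick");
  const tokens = [];
  let i = start + 1;
  let text = ""; // raw (undecoded) literal text of the current segment
  let first = true; // no hole seen yet
  while (i < src.length) {
    const ch = src[i];
    if (ch === "`") {
      tokens.push({ type: first ? "QuasiOnly" : "QuasiEnd", text });
      return { tokens, end: i + 1 };
    }
    if (ch !== "$") { text += ch; i += 1; continue; }
    const next = src[i + 1];
    if (next === "$" || next === "`") { text += ch + next; i += 2; continue; } // $$ and $`
    if (next === "\\") { text += src.slice(i, i + 3); i += 3; continue; } // $\x, kept raw
    if (next === "{" || (next !== undefined && ID_START.test(next))) {
      tokens.push({ type: first ? "QuasiOpen" : "QuasiMiddle", text });
      first = false;
      text = "";
      if (next === "{") {
        const close = findMatchingCurly(src, i + 1);
        tokens.push({ type: "Hole", source: src.slice(i + 2, close) });
        i = close + 1; // back in quasi context right after the matching "}"
      } else {
        let j = i + 1;
        while (j < src.length && ID_PART.test(src[j])) j += 1;
        tokens.push({ type: "Hole", source: src.slice(i + 1, j) });
        i = j; // back in quasi context right after the identifier
      }
      continue;
    }
    text += ch; // "$" not followed by an identifier or "{": plain text
    i += 1;
  }
  throw new SyntaxError("unterminated quasi");
}

// Index of the "}" matching the "{" at src[open]. Nested quasis and quoted
// strings are skipped so that curlies inside them are not counted.
function findMatchingCurly(src, open) {
  let depth = 0;
  let i = open;
  while (i < src.length) {
    const ch = src[i];
    if (ch === "`") { i = lexQuasi(src, i).end; continue; }
    if (ch === '"' || ch === "'") {
      i += 1;
      while (i < src.length && src[i] !== ch) i += src[i] === "\\" ? 2 : 1;
      i += 1;
      continue;
    }
    if (ch === "{") depth += 1;
    if (ch === "}") { depth -= 1; if (depth === 0) return i; }
    i += 1;
  }
  throw new SyntaxError("unbalanced { in quasi hole");
}
```

On the two examples discussed above, this sketch agrees with the message: the source `` `abc$id def` `` lexes as QuasiOpen("abc"), a hole for `id`, and QuasiEnd(" def") with the space preserved, while `` `abc$ id def` `` lexes as a single QuasiOnly token.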

On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com>wrote:

> On 02/02/2012 06:27 PM, Waldemar Horwat wrote:
>
[...]

> Note that this is more complex than just having the parser switch modes
> for the treatment of / as division vs. regexp.  Here comments and white
> space are also affected, which can turn the structure of the lexer
> upside down.  The kinds of cases I'm thinking of are:
>
> `abc$/*comment*/identifier//
> `
> (here we have a /**/ comment and a // comment)
>

There is no valid quasiHole above, so the whole thing matches a QuasiOnly.
The QuasiOnly includes all characters between the backticks. Nothing is
taken to be a comment, just like it wouldn't be if it appeared within a
string.

> `abc$/**/{/**//re//**/}/**/def`
> vs:
> `abc$/**/{/**//re//**/}/*def`
> (in the former all four "/**/"'s are comments.  Not sure what the latter
> would do.)
>

Same thing. There is no valid quasiHole here.

>
> `abc$id def`
> `abc$ id def`
> (the lexer removes spaces before all tokens, so the quasi would not
> contain a space before the "def")

The first has a valid quasiHole, and so would parse as QuasiOpen("abc"),
Identifier("id"), QuasiClose(" def"). (Note space captured in the
QuasiClose text.)

The second has no valid quasiHole, and so the whole thing would again parse
as a QuasiOnly.

I think of, for example, QuasiMiddle as being much like DoubleStringChars.
Once you're lexing that, all spaces are significant. As a lexing context, I
don't really see how quasis are weirder than strings.

However, from your example, I think I see what you're getting at. I forgot
to state that a quasiExpr is only started if a QuasiOpen or QuasiMiddle
ends with an (unescaped by previous $\ ) $ followed *immediately*, with no
intervening characters, by either an Identifier or a "{". I don't see this
as weirder than having a string terminate by (\ ") but not by (\"). The "
is only processed specially if it comes immediately after an (unescaped by
previous \ ) \ .

Similarly, you go back into quasi context *immediately* following the
identifier or matching } respectively, i.e., exactly when the quasiHole
production is over.

-- 
    Cheers,
    --MarkM

# Waldemar Horwat (14 years ago)

On 02/03/2012 08:07 PM, Mark S. Miller wrote:

On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com> wrote:
On 02/02/2012 06:27 PM, Waldemar Horwat wrote:
[...]
Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp.  Here comments and white space are also affected, which can turn the structure of the lexer upside down.  The kinds of cases I'm thinking of are:

`abc$/*comment*/identifier//
`
(here we have a /**/ comment and a // comment)
There is no valid quasiHole above, so the whole thing matches a QuasiOnly. The QuasiOnly includes all characters between the backticks. Nothing is taken to be a comment, just like it wouldn't be if it appeared within a string.

According to which lexical grammar? According to the one you provided earlier in this thread, `abc$ is a QuasiOpen token:

QuasiOpen ::
    ` QuasiChar* $

Parsing further, /*comment*/identifier is a single identifier token as far as the syntactic grammar is concerned.

 Waldemar

On 02/03/2012 08:07 PM, Mark S. Miller wrote:
> On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com>> wrote:
>
>     On 02/02/2012 06:27 PM, Waldemar Horwat wrote:
>
> [...]
>
>     Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp.  Here comments and white space are also affected, which can turn the structure of the lexer upside down.  The kinds of cases I'm thinking of are:
>
>     `abc$/*comment*/identifier//
>     `
>     (here we have a /**/ comment and a // comment)
>
>
> There is no valid quasiHole above, so the whole thing matches a QuasiOnly. The QuasiOnly includes all characters between the backticks. Nothing is taken to be a comment, just like it wouldn't be if it appeared within a string.

According to which lexical grammar?  According to the one you provided earlier in this thread, `abc$ is a QuasiOpen token:

   QuasiOpen ::
         ` QuasiChar* $


Parsing further, /*comment*/identifier is a single identifier token as far as the syntactic grammar is concerned.

     Waldemar

# Mark S. Miller (14 years ago)

On Mon, Feb 6, 2012 at 3:26 PM, Waldemar Horwat <waldemar at google.com> wrote:

On 02/03/2012 08:07 PM, Mark S. Miller wrote:

On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com<mailto:

waldemar at google.com>> wrote:

On 02/02/2012 06:27 PM, Waldemar Horwat wrote:

[...]

Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp. Here comments and white space are also affected, which can turn the structure of the lexer upside down. The kinds of cases I'm thinking of are:

`abc$/*comment*/identifier//
`
(here we have a /**/ comment and a // comment)

There is no valid quasiHole above, so the whole thing matches a QuasiOnly. The QuasiOnly includes all characters between the backticks. Nothing is taken to be a comment, just like it wouldn't be if it appeared within a string.

According to which lexical grammar? According to the one you provided earlier in this thread, `abc$ is a QuasiOpen token:

QuasiOpen :: ` QuasiChar* $

Parsing further, /*comment*/identifier is a single identifier token as far as the syntactic grammar is concerned.

I was imprecise. I'll try again, using only lexical grammar concepts and making explicit where whitespace, comments, etc may appear.

Token ::
    IdentifierName
    Punctuator
    NumericLiteral
    StringLiteral
    Quasi

Quasi ::
    QuasiOnly
    QuasiOpen QuasiHole (QuasiMiddle QuasiHole)* QuasiClose

QuasiOnly ::
    ` QuasiChar* `

QuasiOpen ::
    ` QuasiChar* $

QuasiMiddle ::
    QuasiChar* $

QuasiEnd ::
    QuasiChar* `

QuasiChar ::
    SourceCharacter *but not one of $ or `*
    $ $
    $ `
    $ \ EscapeSequence

QuasiHole ::
    Identifier
    { Spacing* (BalancedCurlySequence Spacing*)* }

BalancedCurlySequence ::
    Token *but not one of { or }*
    { Spacing* (BalancedCurlySequence Spacing*)* }

Spacing ::
    WhiteSpace
    LineTerminator
    Comment

Within a Quasi, no character sequences are interpreted as whitespace or comments except where indicated by Spacing above.
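
The rule stated above (comments and whitespace are recognized only inside a hole) matches how ES2015 template literals behave. A minimal illustration, assuming a modern engine; the variable name `x` and the numbered comments are only for the example:

```javascript
// Assumption: ES2015 template literal semantics.
const x = 42;

// Comments 1 and 4 sit in the literal portions and are kept as text.
// Comments 2 and 3 sit inside the hole and are ordinary comments.
const s = `a/*1*/${ /*2*/ x /*3*/ }/*4*/b`;

console.log(s); // a/*1*/42/*4*/b
```

The whitespace around `x` inside the hole is likewise discarded, while any whitespace in the literal portions would be preserved.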

On Mon, Feb 6, 2012 at 3:26 PM, Waldemar Horwat <waldemar at google.com> wrote:

> On 02/03/2012 08:07 PM, Mark S. Miller wrote:
>
>  On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com<mailto:
>> waldemar at google.com>> wrote:
>>
>>    On 02/02/2012 06:27 PM, Waldemar Horwat wrote:
>>
>> [...]
>>
>>    Note that this is more complex than just having the parser switch
>> modes for the treatment of / as division vs. regexp.  Here comments and
>> white space are also affected, which can turn the structure of the lexer
>> upside down.  The kinds of cases I'm thinking of are:
>>
>>    `abc$/*comment*/identifier//
>>    `
>>    (here we have a /**/ comment and a // comment)
>>
>>
>> There is no valid quasiHole above, so the whole thing matches a
>> QuasiOnly. The QuasiOnly includes all characters between the backticks.
>> Nothing is taken to be a comment, just like it wouldn't be if it appeared
>> within a string.
>>
>
> According to which lexical grammar?  According to the one you provided
> earlier in this thread, `abc$ is a QuasiOpen token:
>
>  QuasiOpen ::
>        ` QuasiChar* $
>
>
> Parsing further, /*comment*/identifier is a single identifier token as far
> as the syntactic grammar is concerned.


I was imprecise. I'll try again, using only lexical grammar concepts and
making explicit where whitespace, comments, etc may appear.

    Token ::
        IdentifierName
        Punctuator
        NumericLiteral
        StringLiteral
        Quasi

    Quasi ::
        QuasiOnly
        QuasiOpen QuasiHole (QuasiMiddle QuasiHole)* QuasiClose

    QuasiOnly ::
        ` QuasiChar* `

    QuasiOpen ::
        ` QuasiChar* $

    QuasiMiddle ::
        QuasiChar* $

    QuasiEnd ::
        QuasiChar* `

    QuasiChar ::
        SourceCharacter *but not one of $ or `*
        $ $
        $ `
        $ \ EscapeSequence

    QuasiHole ::
        Identifier
        { Spacing* (BalancedCurlySequence Spacing*)* }

    BalancedCurlySequence ::
        Token *but not one of { or }*
        { Spacing* (BalancedCurlySequence Spacing*)* }

    Spacing ::
        WhiteSpace
        LineTerminator
        Comment

Within a Quasi, no character sequences are interpreted as whitespace or
comments except where indicated by Spacing above.

-- 
    Cheers,
    --MarkM

# Erik Arvidsson (14 years ago)

On Mon, Feb 6, 2012 at 18:49, Mark S. Miller <erights at google.com> wrote:

QuasiHole ::
    Identifier
    { Spacing* (BalancedCurlySequence Spacing*)* }

If you replace that with:

QuasiHole ::
    Identifier
    { Spacing* Expression Spacing* }

You can now support nested quasis.

Your grammar allows `abc{ }def`. Was that intentional?
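
Once the hole accepts a full Expression, nesting follows with no extra rule, because a quasi is itself an expression. The sketch below shows this with ES2015 template literals, assuming a modern engine; the `names` array and the HTML-like output are illustrative only.

```javascript
// Assumption: ES2015 template literals, where a hole holds any Expression.
const names = ["Ada", "Grace"];

// The inner literal appears inside the hole of the outer one.
const list = `<ul>${names.map((n) => `<li>${n}</li>`).join("")}</ul>`;

console.log(list); // <ul><li>Ada</li><li>Grace</li></ul>
```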

On Mon, Feb 6, 2012 at 18:49, Mark S. Miller <erights at google.com> wrote:
>     QuasiHole ::
>         Identifier
>         { Spacing* (BalancedCurlySequence Spacing*)* }
>

If you replace that with:

    QuasiHole ::
        Identifier
        { Spacing* Expression Spacing* }

You can now support nested quasis

Your grammar allows `abc{ }def`. Was that intentional?

-- 
erik

# Erik Arvidsson (14 years ago)

On Mon, Feb 6, 2012 at 18:49, Mark S. Miller <erights at google.com> wrote:

QuasiChar ::
    SourceCharacter but not one of $ or `
    $ $
    $ `
    $ \ EscapeSequence

This part was never in the Quasi proposal. `abc$$def` is the same as `abc\${def}` according to the current proposal.

On Mon, Feb 6, 2012 at 18:49, Mark S. Miller <erights at google.com> wrote:
>     QuasiChar ::
>         SourceCharacter but not one of $ or `
>         $ $
>         $ `
>         $ \ EscapeSequence

This part was never in the Quasi proposal. `abc$$def` is the same as
`abc\${def}` according to the current proposal.

-- 
erik

# Waldemar Horwat (14 years ago)

On 02/06/2012 06:49 PM, Mark S. Miller wrote:

On Mon, Feb 6, 2012 at 3:26 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com>> wrote:

On 02/03/2012 08:07 PM, Mark S. Miller wrote:

    On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com> <mailto:waldemar at google.com <mailto:waldemar at google.com>>> wrote:

        On 02/02/2012 06:27 PM, Waldemar Horwat wrote:

    [...]

        Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp.  Here comments and white space are also affected, which can turn the structure of the lexer upside down.  The kinds of cases I'm thinking of are:

        `abc$/*comment*/identifier//
        `
        (here we have a /**/ comment and a // comment)


    There is no valid quasiHole above, so the whole thing matches a QuasiOnly. The QuasiOnly includes all characters between the backticks. Nothing is taken to be a comment, just like it wouldn't be if it appeared within a string.


According to which lexical grammar?  According to the one you provided earlier in this thread, `abc$ is a QuasiOpen token:

  QuasiOpen ::
        ` QuasiChar* $


Parsing further, /*comment*/identifier is a single identifier token as far as the syntactic grammar is concerned.

I was imprecise. I'll try again, using only lexical grammar concepts and making explicit where whitespace, comments, etc may appear.

 Token ::
     IdentifierName
     Punctuator
     NumericLiteral
     StringLiteral
     Quasi

 Quasi ::
     QuasiOnly
     QuasiOpen QuasiHole (QuasiMiddle QuasiHole)* QuasiClose

 QuasiOnly ::
     ` QuasiChar* `

 QuasiOpen ::
     ` QuasiChar* $

 QuasiMiddle ::
     QuasiChar* $

 QuasiEnd ::
     QuasiChar* `

 QuasiChar ::
     SourceCharacter *but not one of $ or `*
     $ $
     $ `
     $ \ EscapeSequence

 QuasiHole ::
     Identifier
     { Spacing* (BalancedCurlySequence Spacing*)* }

 BalancedCurlySequence ::
     Token *but not one of { or }*
     { Spacing* (BalancedCurlySequence Spacing*)* }

 Spacing ::
     WhiteSpace
     LineTerminator
     Comment

Within a Quasi, no character sequences are interpreted as whitespace or comments except where indicated by Spacing above.

That's going back to the previous approach of treating the whole quasi as a single token. This doesn't work because it's not possible to specify the BalancedCurlySequence production as a lexical grammar. You're confusing the lexical with the syntactic grammars here.

Examples of why BalancedCurlySequence doesn't work:

{/[{]/} (interior parses as five single-character tokens but no matching closing bracket)

{ainb} (interior parses as three tokens: a in b)

{3.toString()} (interior parses as 3 . toString ( ))

 Waldemar
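
The first example can be made concrete with a scanner that finds the end of a hole by counting braces without knowing which characters belong to a regular expression. This is a hypothetical helper written for illustration (the name `naiveHoleEnd` is not from the thread), followed by what an ES2015 engine does with the same text when the parser drives the lexer.

```javascript
// Hypothetical helper: find the "}" matching the "{" at index `open`
// by counting braces character by character, ignoring token structure.
function naiveHoleEnd(src, open) {
  let depth = 0;
  for (let i = open; i < src.length; i++) {
    if (src[i] === "{") {
      depth++;
    } else if (src[i] === "}") {
      depth--;
      if (depth === 0) return i; // matching close found
    }
  }
  return -1; // ran out of input with braces still open
}

// Balanced object-literal braces are handled fine.
console.log(naiveHoleEnd("{a + {b: 1}.b}", 0)); // 13

// The "{" inside the regexp character class is miscounted as an opener.
console.log(naiveHoleEnd("{/[{]/}", 0)); // -1

// Assumption: ES2015 template literals. With the parser in control,
// the hole contains one regexp literal, which stringifies to its source.
const viaParser = `${/[{]/}`;
console.log(viaParser); // /[{]/
```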

On 02/06/2012 06:49 PM, Mark S. Miller wrote:
> On Mon, Feb 6, 2012 at 3:26 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com>> wrote:
>
>     On 02/03/2012 08:07 PM, Mark S. Miller wrote:
>
>         On Fri, Feb 3, 2012 at 12:58 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com> <mailto:waldemar at google.com <mailto:waldemar at google.com>>> wrote:
>
>             On 02/02/2012 06:27 PM, Waldemar Horwat wrote:
>
>         [...]
>
>             Note that this is more complex than just having the parser switch modes for the treatment of / as division vs. regexp.  Here comments and white space are also affected, which can turn the structure of the lexer upside down.  The kinds of cases I'm thinking of are:
>
>             `abc$/*comment*/identifier//
>             `
>             (here we have a /**/ comment and a // comment)
>
>
>         There is no valid quasiHole above, so the whole thing matches a QuasiOnly. The QuasiOnly includes all characters between the backticks. Nothing is taken to be a comment, just like it wouldn't be if it appeared within a string.
>
>
>     According to which lexical grammar?  According to the one you provided earlier in this thread, `abc$ is a QuasiOpen token:
>
>       QuasiOpen ::
>             ` QuasiChar* $
>
>
>     Parsing further, /*comment*/identifier is a single identifier token as far as the syntactic grammar is concerned.
>
>
> I was imprecise. I'll try again, using only lexical grammar concepts and making explicit where whitespace, comments, etc may appear.
>
>      Token ::
>          IdentifierName
>          Punctuator
>          NumericLiteral
>          StringLiteral
>          Quasi
>
>      Quasi ::
>          QuasiOnly
>          QuasiOpen QuasiHole (QuasiMiddle QuasiHole)* QuasiClose
>
>      QuasiOnly ::
>          ` QuasiChar* `
>
>      QuasiOpen ::
>          ` QuasiChar* $
>
>      QuasiMiddle ::
>          QuasiChar* $
>
>      QuasiEnd ::
>          QuasiChar* `
>
>      QuasiChar ::
>          SourceCharacter *but not one of $ or `*
>          $ $
>          $ `
>          $ \ EscapeSequence
>
>      QuasiHole ::
>          Identifier
>          { Spacing* (BalancedCurlySequence Spacing*)* }
>
>      BalancedCurlySequence ::
>          Token *but not one of { or }*
>          { Spacing* (BalancedCurlySequence Spacing*)* }
>
>      Spacing ::
>          WhiteSpace
>          LineTerminator
>          Comment
>
> Within a Quasi, no character sequences are interpreted as whitespace or comments except where indicated by Spacing above.

That's going back to the previous approach of treating the whole quasi as a single token.  This doesn't work because it's not possible to specify the BalancedCurlySequence production as a lexical grammar.  You're confusing the lexical with the syntactic grammars here.

Examples of why BalancedCurlySequence doesn't work:

{/[{]/}
(interior parses as five single-character tokens but no matching closing bracket)

{ainb}
(interior parses as three tokens: a in b)

{3.toString()}
(interior parses as 3 . toString ( ))

     Waldemar

# Erik Arvidsson (14 years ago)

Correction...

This part was never in the Quasi proposal. `abc$$def` is the same as `abc\${def}` according to the current proposal.

`abc\$${def}`

The point is that only $ident and ${...} are special. In all other contexts, $ is a normal character.
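
For comparison, ES2015 template literals kept the `${...}` form and the `\$` escape but dropped the bare `$ident` form, so the strings in this correction evaluate as follows. A minimal sketch assuming a modern engine; `def` is just a sample variable.

```javascript
// Assumption: ES2015 template literals, where only "${" opens a hole.
const def = "X";

const escaped = `abc\$${def}`;  // "\$" is a literal dollar, then the hole
const unescaped = `abc$${def}`; // a "$" not followed by "{" is ordinary text
const noHole = `abc$$def`;      // no "${" anywhere, so no substitution

console.log(escaped);   // abc$X
console.log(unescaped); // abc$X
console.log(noHole);    // abc$$def
```

Under the proposal as described in this thread, the last literal would instead contain a `$def` hole.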

Correction...

> This part was never in the Quasi proposal. `abc$$def` is the same as
> `abc\${def}` according to the current proposal.

`abc\$${def}`

The point is that only $ident and ${...} are special. In all other
contexts, $ is a normal character.

-- 
erik

# Mark S. Miller (14 years ago)

On Tue, Feb 7, 2012 at 9:48 AM, Erik Arvidsson <erik.arvidsson at gmail.com> wrote:

On Mon, Feb 6, 2012 at 18:49, Mark S. Miller <erights at google.com> wrote:
QuasiHole ::
    Identifier
    { Spacing* (BalancedCurlySequence Spacing*)* }
If you replace that with:
QuasiHole ::
    Identifier
    { Spacing* Expression Spacing* }
You can now support nested quasis

Your grammar allows `abc{ }def`. Was that intentional?

Hi Erik, it was not my intention. Your grammar does better capture my intention, and is approximately what I specified on Feb 2 before Waldemar's question about spaces and comments. If there's no objection to your way of mixing the lexical and parsing issues, and if it succeeds at avoiding the spacing and comment placement issues Waldemar raises (I think it does), I think that's superior to the approach I was taking. Thanks.

On Tue, Feb 7, 2012 at 9:48 AM, Erik Arvidsson <erik.arvidsson at gmail.com> wrote:

> On Mon, Feb 6, 2012 at 18:49, Mark S. Miller <erights at google.com> wrote:
> >     QuasiHole ::
> >         Identifier
> >         { Spacing* (BalancedCurlySequence Spacing*)* }
> >
>
> If you replace that with:
>
>     QuasiHole ::
>         Identifier
>         { Spacing* Expression Spacing* }
>
> You can now support nested quasis
>
> Your grammar allows `abc{ }def`. Was that intentional?
>

Hi Erik, it was not my intention. Your grammar does better capture my
intention, and is approximately what I specified on Feb 2 before Waldemar's
question about spaces and comments. If there's no objection to your way of
mixing the lexical and parsing issues, and if it succeeds at avoiding the
spacing and comment placement issues Waldemar raises (I think it does), I
think that's superior to the approach I was taking. Thanks.

-- 
    Cheers,
    --MarkM

# Mark S. Miller (14 years ago)

To reiterate, my posts here are expository, to explain how I approached these same matters in E, to see if that helps resolve any remaining controversy. Regarding what we're actually proposing, I'll let Mike speak for that.

To reiterate, my posts here are expository, to explain how I approached
these same matters in E, to see if that helps resolve any remaining
controversy. Regarding what we're actually proposing, I'll let Mike speak
for that.

On Tue, Feb 7, 2012 at 9:56 AM, Erik Arvidsson <erik.arvidsson at gmail.com> wrote:

> On Mon, Feb 6, 2012 at 18:49, Mark S. Miller <erights at google.com> wrote:
> >     QuasiChar ::
> >         SourceCharacter but not one of $ or `
> >         $ $
> >         $ `
> >         $ \ EscapeSequence
>
> This part was never in the Quasi proposal. `abc$$def` is the same as
> `abc\${def}` according to the current proposal.
>
> --
> erik
>

-- 
    Cheers,
    --MarkM

# Mark S. Miller (14 years ago)

On Tue, Feb 7, 2012 at 1:52 PM, Waldemar Horwat <waldemar at google.com> wrote: [...]

That's going back to the previous approach of treating the whole quasi as a single token. This doesn't work because it's not possible to specify the BalancedCurlySequence production as a lexical grammar. You're confusing the lexical with the syntactic grammars here.

Hi Waldemar, I am first of all trying to make clear what we're actually proposing, and to resolve any genuine ambiguity. As for how we phrase this proposal so that it fits with the rest of our spec language, what do you suggest?

Examples of why BalancedCurlySequence doesn't work:

{/[{]/} (interior parses as five single-character tokens but no matching closing bracket)

Yes, and therefore a program consisting of

`{/[{]/}`

fails to lex and fails to parse. That seems like the correct outcome.

{ainb} (interior parses as three tokens: a in b)

Why doesn't it parse as one token: ainb ?

{3.toString()} (interior parses as 3 . toString ( ))

Why? That's not what the JS lexer does anywhere else?

I don't at all see how you arrived at your conclusions. Is it actually unclear what I am trying to say, or are you simply taking issue with how I'm saying it? If you find Erik's way of specifying ok, let's just use that. As I just said in reply to him, it does capture my actual intent more directly.

On Tue, Feb 7, 2012 at 1:52 PM, Waldemar Horwat <waldemar at google.com> wrote:
[...]

> That's going back to the previous approach of treating the whole quasi as
> a single token.  This doesn't work because it's not possible to specify the
> BalancedCurlySequence production as a lexical grammar.  You're confusing
> the lexical with the syntactic grammars here.
>

Hi Waldemar, I am first of all trying to make clear what we're actually
proposing, and to resolve any genuine ambiguity. As for how we phrase this
proposal so that it fits with the rest of our spec language, what do you
suggest?

>
> Examples of why BalancedCurlySequence doesn't work:
>
> {/[{]/}
> (interior parses as five single-character tokens but no matching closing
> bracket)
>

Yes, and therefore a program consisting of

    `{/[{]/}`

fails to lex and fails to parse. That seems like the correct outcome.

> {ainb}
> (interior parses as three tokens: a in b)
>

Why doesn't it parse as one token: ainb ?

>
> {3.toString()}
> (interior parses as 3 . toString ( ))

Why? That's not what the JS lexer does anywhere else?

I don't at all see how you arrived at your conclusions. Is it actually
unclear what I am trying to say, or are you simply taking issue with how
I'm saying it? If you find Erik's way of specifying ok, let's just use
that. As I just said in reply to him, it does capture my actual intent more
directly.

-- 
    Cheers,
    --MarkM

# Waldemar Horwat (14 years ago)

On 02/07/2012 02:51 PM, Mark S. Miller wrote:

On Tue, Feb 7, 2012 at 1:52 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com>> wrote: [...]
That's going back to the previous approach of treating the whole quasi as a single token.  This doesn't work because it's not possible to specify the BalancedCurlySequence production as a lexical grammar.  You're confusing the lexical with the syntactic grammars here.
Hi Waldemar, I am first of all trying to make clear what we're actually proposing, and to resolve any genuine ambiguity. As for how we phrase this proposal so that it fits with the rest of our spec language, what do you suggest?
Examples of why BalancedCurlySequence doesn't work:

{/[{]/}
(interior parses as five single-character tokens but no matching closing bracket)
Yes, and therefore a program consisting of
 `{/[{]/}`
fails to lex and fails to parse. That seems like the correct outcome.

Why? It's just a regexp.

{ainb}
(interior parses as three tokens: a in b)
Why doesn't it parse as one token: ainb ?

The point is that a in b is one valid parse. I don't need to show that there are no other valid parses. In fact, there are lots of other valid parses because the grammar is very ambiguous.

{3.toString()}
(interior parses as 3 . toString ( ))
Why? That's not what the JS lexer does anywhere else?

That's the problem with the rule you gave.

I don't at all see how you arrived at your conclusions. Is it actually unclear what I am trying to say, or are you simply taking issue with how I'm saying it? If you find Erik's way of specifying ok, let's just use that. As I just said in reply to him, it does capture my actual intent more directly.

The bug is in what you're trying to say, not in how you're saying it. You're confusing the lexical and syntactic grammars. Due to this confusion you're trying lexical productions such as

BalancedCurlySequence ::
    Token *but not one of { or }*
    { Spacing* (BalancedCurlySequence Spacing*)* }

To illustrate the problem, consider a simpler lexer rule:

TokenSequence ::
    Token*

This will lex ainb as many things, including for example a in b. The existing lexer resolves it by always chomping the largest sequence of characters to bite off as the next lexical token. Once it accepts a token, it doesn't backtrack if it later finds an alternative parse for that token that would have made future tokens work better. On the other hand, if you allow productions such as a TokenSequence inside a lexical token, then you get full backtracking and ambiguity across the Tokens that make up the TokenSequence because they are all part of one lexical token.

I was favorable to splitting up a quasi into multiple tokens, where this problem for the most part doesn't arise. If you want to make the whole quasi into one token, then you'll need to solve this problem.

 Waldemar
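
The difference between the two readings of `ainb` can be sketched in a few lines. The helpers below are hypothetical and use a deliberately tiny token set (runs of lowercase letters standing in for IdentifierName); they are not the ECMAScript lexer, only an illustration of longest-match versus a `Token*` production that admits every split.

```javascript
// Enumerate every way to cut `text` into non-empty pieces accepted by isToken.
// This models a lexical production like TokenSequence :: Token*.
function allTokenizations(text, isToken) {
  if (text === "") return [[]];
  const results = [];
  for (let end = 1; end <= text.length; end++) {
    const head = text.slice(0, end);
    if (!isToken(head)) continue;
    for (const rest of allTokenizations(text.slice(end), isToken)) {
      results.push([head, ...rest]);
    }
  }
  return results;
}

// Longest-match lexing: take the longest token at each position, never backtrack.
function maximalMunch(text, isToken) {
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    let end = text.length;
    while (end > pos && !isToken(text.slice(pos, end))) end--;
    if (end === pos) throw new SyntaxError(`no token at offset ${pos}`);
    tokens.push(text.slice(pos, end));
    pos = end;
  }
  return tokens;
}

// Assumption: a toy token class, lowercase names only.
const isName = (s) => /^[a-z]+$/.test(s);

const splits = allTokenizations("ainb", isName);
console.log(splits.length);                              // 8
console.log(splits.some((t) => t.join(" ") === "a in b")); // true
console.log(maximalMunch("ainb", isName));               // [ 'ainb' ]
```

With longest-match there is one answer; with `Token*` nested inside a single lexical token, all eight splits are valid derivations, including `a in b`.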

On 02/07/2012 02:51 PM, Mark S. Miller wrote:
> On Tue, Feb 7, 2012 at 1:52 PM, Waldemar Horwat <waldemar at google.com <mailto:waldemar at google.com>> wrote:
> [...]
>
>     That's going back to the previous approach of treating the whole quasi as a single token.  This doesn't work because it's not possible to specify the BalancedCurlySequence production as a lexical grammar.  You're confusing the lexical with the syntactic grammars here.
>
>
> Hi Waldemar, I am first of all trying to make clear what we're actually proposing, and to resolve any genuine ambiguity. As for how we phrase this proposal so that it fits with the rest of our spec language, what do you suggest?
>
>
>     Examples of why BalancedCurlySequence doesn't work:
>
>     {/[{]/}
>     (interior parses as five single-character tokens but no matching closing bracket)
>
>
> Yes, and therefore a program consisting of
>
>      `{/[{]/}`
>
> fails to lex and fails to parse. That seems like the correct outcome.

Why?  It's just a regexp.

>     {ainb}
>     (interior parses as three tokens: a in b)
>
> Why doesn't it parse as one token: ainb ?

The point is that a in b is one valid parse.  I don't need to show that there are no other valid parses.  In fact, there are lots of other valid parses because the grammar is very ambiguous.

>     {3.toString()}
>     (interior parses as 3 . toString ( ))
>
> Why? That's not what the JS lexer does anywhere else?

That's the problem with the rule you gave.

> I don't at all see how you arrived at your conclusions. Is it actually unclear what I am trying to say, or are you simply taking issue with how I'm saying it? If you find Erik's way of specifying ok, let's just use that. As I just said in reply to him, it does capture my actual intent more directly.

The bug is in what you're trying to say, not in how you're saying it.  You're confusing the lexical and syntactic grammars.  Due to this confusion you're trying lexical productions such as

BalancedCurlySequence ::
     Token *but not one of { or }*
     { Spacing* (BalancedCurlySequence Spacing*)* }

To illustrate the problem, consider a simpler lexer rule:

TokenSequence ::
   Token*

This will lex ainb as many things, including for example a in b.  The existing lexer resolves it by always chomping the largest sequence of characters to bite off as the next lexical token.  Once it accepts a token, it doesn't backtrack if it later finds an alternative parse for that token that would have made future tokens work better.  On the other hand, if you allow productions such as a TokenSequence inside a lexical token, then you get full backtracking and ambiguity across the Tokens that make up the TokenSequence because they are all part of one lexical token.

I was favorable to splitting up a quasi into multiple tokens, where this problem for the most part doesn't arise.  If you want to make the whole quasi into one token, then you'll need to solve this problem.

     Waldemar

# Mark S. Miller (14 years ago)

On Tue, Feb 7, 2012 at 3:47 PM, Waldemar Horwat <waldemar at google.com> wrote: [...]

To illustrate the problem, consider a simpler lexer rule:

TokenSequence ::
    Token*

This will lex ainb as many things, including for example a in b.

I now understand your objection. Rather than trying to repair my way of saying this, do you find Erik's approach clear? If so, let's just start there. It does correspond exactly to what I've been trying to explain.

On Tue, Feb 7, 2012 at 3:47 PM, Waldemar Horwat <waldemar at google.com> wrote:
[...]

> To illustrate the problem, consider a simpler lexer rule:
>
> TokenSequence ::
>  Token*
>
> This will lex ainb as many things, including for example a in b.

I now understand your objection. Rather than trying to repair my way of
saying this, do you find Erik's approach clear? If so, let's just start
there. It does correspond exactly to what I've been trying to explain.

-- 
    Cheers,
    --MarkM

# Brendan Eich (14 years ago)

I like Erik's way, but it makes a strange loop from lexical to syntactic grammar. It all works, I believe.

The loop is here:

 QuasiHole ::
     Identifier
     { Spacing* Expression Spacing* }

Expression is a syntactic grammar non-terminal, yet here we are in a lexical production.

Waldemar, is this sound?

I like Erik's way, but it makes a strange loop from lexical to syntactic 
grammar. It all works, I believe.

The loop is here:

     QuasiHole ::
         Identifier
         { Spacing* Expression Spacing* }


Expression is a syntactic grammar non-terminal, yet here we are in a 
lexical production.

Waldemar, is this sound?

/be

Mark S. Miller wrote:
> On Tue, Feb 7, 2012 at 3:47 PM, Waldemar Horwat <waldemar at google.com 
> <mailto:waldemar at google.com>> wrote:
> [...]
>
>     To illustrate the problem, consider a simpler lexer rule:
>
>     TokenSequence ::
>      Token*
>
>     This will lex ainb as many things, including for example a in b.
>
>
> I now understand your objection. Rather than trying to repair my way 
> of saying this, do you find Erik's approach clear? If so, let's just 
> start there. It does correspond exactly to what I've been trying to 
> explain.
>
> -- 
>     Cheers,
>     --MarkM

# Waldemar Horwat (14 years ago)

On 02/07/2012 04:40 PM, Brendan Eich wrote:

I like Erik's way, but it makes a strange loop from lexical to syntactic grammar. It all works, I believe.

The loop is here:

QuasiHole ::
    Identifier
    { Spacing* Expression Spacing* }

Expression is a syntactic grammar non-terminal, yet here we are in a lexical production.

Waldemar, is this sound?

QuasiHole is a syntactic production, not a lexical one. See Mark's grammar in his 02/02/2012 11:03 AM message.

I believe that it works, except for the treatment of comments and whitespace along the boundary of a QuasiHole. I recently gave some examples of the mischief those can create unless we can figure out what to do about them.

 Waldemar

On 02/07/2012 04:40 PM, Brendan Eich wrote:
> I like Erik's way, but it makes a strange loop from lexical to syntactic grammar. It all works, I believe.
>
> The loop is here:
>
> QuasiHole ::
> Identifier
> { Spacing* Expression Spacing* }
>
>
> Expression is a syntactic grammar non-terminal, yet here we are in a lexical production.
>
> Waldemar, is this sound?

QuasiHole is a syntactic production, not a lexical one.  See Mark's grammar in his 02/02/2012 11:03 AM message.

I believe that it works, except for the treatment of comments and whitespace along the boundary of a QuasiHole.  I recently gave some examples of the mischief those can create unless we can figure out what to do about them.

     Waldemar

# Mark S. Miller (14 years ago)

I believe we have all figured out what to do about them and agree on the same answer. We're only struggling to find a way to state the answer.

If you understand what we're trying to say, please suggest a way to say it you would find acceptable. If you don't understand, can we proceed by example until you understand our intent, so that we can then proceed to discuss how to say it?

I believe we have all figured out what to do about them and agree on the
same answer. We're only struggling to find a way to state the answer.

If you understand what we're trying to say, please suggest a way to say it
you would find acceptable. If you don't understand, can we proceed by
example until you understand our intent, so that we can then proceed to
discuss how to say it?

On Wed, Feb 8, 2012 at 1:48 PM, Waldemar Horwat <waldemar at google.com> wrote:

> On 02/07/2012 04:40 PM, Brendan Eich wrote:
>
>> I like Erik's way, but it makes a strange loop from lexical to syntactic
>> grammar. It all works, I believe.
>>
>> The loop is here:
>>
>> QuasiHole ::
>> Identifier
>> { Spacing* Expression Spacing* }
>>
>>
>> Expression is a syntactic grammar non-terminal, yet here we are in a
>> lexical production.
>>
>> Waldemar, is this sound?
>>
>
> QuasiHole is a syntactic production, not a lexical one.  See Mark's
> grammar in his 02/02/2012 11:03 AM message.
>
> I believe that it works, except for the treatment of comments and
> whitespace along the boundary of a QuasiHole.  I recently gave some
> examples of the mischief those can create unless we can figure out what to
> do about them.
>
>    Waldemar
>

-- 
    Cheers,
    --MarkM