Backward running version look-behinds
I can speak only for myself. I like the .Net-style lookbehinds, and I hope they will be part of the standard. For something to be in the standard we need both implementations and someone to describe the desired behaviour in the standards document. It looks like implementations are being found. Hopefully someone can write the document so this can move forward.
On 20 Nov 2015, at 15:41, Nozomu Katō <noz.ka at akenotsuki.com> wrote:
I was expecting that ES6 would come with look-behinds, because a proposal had been put at: web.archive.org/web/20121114071428/http://wiki.ecmascript.org/doku.php?id=harmony:proposals
However, ES6 does not support them. I noticed that the link to the proposal had been struck-through: web.archive.org/web/20150812143714/http://wiki.ecmascript.org/doku.php?id=harmony:proposals
I wondered what the problem was. I looked into the state of look-behinds and found this post: esdiscuss/2013-October/033911
I realised that a spec needed to be written by someone, but "someone" had not appeared yet. Thus, I wrote a spec, subscribed to es-discuss, and posted the spec. What made me decide to post that spec was this post and thread.
But now, it turns out that look-behinds similar to the proposal that has been struck-through have been implemented experimentally in Chromium and Gecko. I am confused about the ongoing situation.
I am NOT an objector against .NET-compatible look-behinds. But I wonder if there is someone who writes a spec for them. I have no idea how the behaviours of look-behinds based on the .NET implementation are described in the language used by the ECMAScript spec. Introducing an internal direction switch might be a relatively simple way, but I have no concrete idea even about it.
Nozomu
I've amended the spec in order to add .NET-style lookbehinds. It proved to be indeed relatively simple, once you get how it works. Here is the result (with diffs):
claudepache.github.io/ecma262/#sec-pattern
The most difficult part was managing to output the token <! in ecmarkdown.
Lookbehind assertions were discussed at the TC39 meeting last week and the committee is in favor of the .NET-style version.
Gorkem
From: es-discuss [mailto:es-discuss-bounces at mozilla.org] On Behalf Of Erik Corry
Sent: Tuesday, November 24, 2015 11:22 AM
To: Nozomu Katō <noz.ka at akenotsuki.com>
Cc: es-discuss <es-discuss at mozilla.org>
Subject: Re: Backward running version look-behinds
I can speak only for myself. I like the .Net-style lookbehinds, and I hope they will be part of the standard. For something to be in the standard we need both implementations and someone to describe the desired behaviour in the standards document. It looks like implementations are being found. Hopefully someone can write the document so this can move forward.
On Fri, Nov 20, 2015 at 3:41 PM, Nozomu Katō <noz.ka at akenotsuki.com> wrote:
I was expecting that ES6 would come with look-behinds, because a proposal had been put at: web.archive.org/web/20121114071428/http://wiki.ecmascript.org/doku.php?id=harmony:proposals
However, ES6 does not support them. I noticed that the link to the proposal had been struck-through: web.archive.org/web/20150812143714/http://wiki.ecmascript.org/doku.php?id=harmony:proposals
I wondered what the problem was. I looked into the state of look-behinds and found this post: esdiscuss/2013-October/033911
I realised that a spec needed to be written by someone, but "someone" had not appeared yet. Thus, I wrote a spec, subscribed to es-discuss, and posted the spec. What made me decide to post that spec was this post and thread.
But now, it turns out that look-behinds similar to the proposal that has been struck-through have been implemented experimentally in Chromium and Gecko. I am confused about the ongoing situation.
I am NOT an objector against .NET-compatible look-behinds. But I wonder if there is someone who writes a spec for them. I have no idea how the behaviours of look-behinds based on the .NET implementation are described in the language used by the ECMAScript spec. Introducing an internal direction switch might be a relatively simple way, but I have no concrete idea even about it.
Nozomu
This is great stuff, thanks for doing this.
I couldn't see any bugs in it, though I must admit that 21.2.2.4 part 4 made my head hurt, so I skipped it.
Just to prove I actually read it, I'll point out that independant is spelled independent,
Thank you for telling us that news. Until any proposal (Claude's, mine, or anyone else's) for look-behind assertions reaches Stage 4, I will leave my proposal at that URL.
Nozomu
I am glad to see a spec for .NET-style look-behinds. I am hoping it is considered by those who are familiar with the ECMAScript specification.
As a third option, introducing \K, which is available in recent versions of Perl, might be worth considering. This expression excludes what the preceding expressions matched from $&. So it can be used as a substitute for a variable-length positive look-behind. perldoc.perl.org/perlre.html#(%3F<%3Dpattern)-\K
The only unresolved problem with this is that no corresponding negative version of it is known yet. If we can find one, these may become the simplest and most efficient option.
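To make the \K idea concrete in JavaScript terms, here is a minimal sketch. It assumes an engine that already supports (?<=...) with variable-length bodies; the sample string and the variable names are made up for illustration. Perl's s/fo+\Kbar/X/g keeps whatever fo+ matched and replaces only bar; the two forms below emulate that.

```javascript
// Hypothetical sample input, chosen so the prefix "fo+" has varying length.
const input = "foobar fooobar foobaz";

// Emulation 1: capture the prefix and put it back in the replacement.
// This works in any ECMAScript engine, with or without look-behinds.
const viaCapture = input.replace(/(fo+)bar/g, "$1X");

// Emulation 2: a variable-length positive look-behind.
// Assumption: the engine supports (?<=...) as discussed in this thread.
const viaLookbehind = input.replace(/(?<=fo+)bar/g, "X");

console.log(viaCapture);    // fooX foooX foobaz
console.log(viaLookbehind); // fooX foooX foobaz
```

Both forms agree on this input. They are not guaranteed to agree in every case, for example when matches could overlap, so this is an illustration rather than a proof of equivalence.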
Nozomu
Can we move this to stage 2?
- We have a spec draft: claudepache.github.io/ecma262/#sec-pattern
- We have champions: Gorkem Yakin, Nozomu Katō and Brian Terlson
- I'm pretty sure this is going to be part of the standard according to the last TC39 meeting.
- There is a fully functional implementation in the newest V8, though behind a flag at the moment.
- There is another polyfill implementation: dartpad.dartlang.org/630790616c1a568d63b0 (dart, but can be transpiled into JS)
Yang
From: es-discuss [mailto:es-discuss-bounces at mozilla.org] On Behalf Of Yang Guo
- We have a spec draft: claudepache.github.io/ecma262/#sec-pattern
- We have champions: Gorkem Yakin, Nozomu Katō and Brian Terlson
Hmm, but none of the champions are writing the spec? Probably just want to add Claude to the champion list.
Can we move this to stage 2?
I agree this looks stage 2 worthy. However, the committee has so far exhibited the unfortunate dysfunction of being unable to move things between stages except during face to face meetings.
At this point I would personally suggest the following. Unofficially, consider this to be stage 2; your obligations seem met, even if the committee hasn't had time to convene in meatspace and bless your work. Now, start looking at what's required for stage 3: being sure the spec text is complete; finding designated reviewers and getting them to sign off; and getting Brian, the editor, to sign off on the spec text. If you can get those additional things together in time for the January meeting, you should be able to use that meeting to jump from stage 1 to stage 3.
At the November TC39 meeting, it was agreed that the new RegExp proposals would advance together, hence it's at stage 0.
As for the spec text, Nozomu has something written up [1], but I haven't had the time to review it yet.
Gorkem
[1] www.akenotsuki.com/misc/srell/en/lookbehinds-spec-l.html
The newly proposed regexp features are mostly self-contained. Does it make sense to bundle them? I'm curious what the benefit of that might be.
Yang
That is one of my ongoing projects. I have been working towards an unambiguous spec for behaviours such as:
- Backreference numbers are always assigned to capturing parentheses left to right, even in lookbehinds. For example, /(?<=((.)(.))(.))./.exec("abcdef") returns ["d", "ab", "a", "b", "c"].
- A backreference can refer to the corresponding capturing parenthesis if the former occurs to the left of the latter, when both exist in the same lookbehind.
These are based on the behaviours of .NET's lookbehinds. However, I am unsure whether we already have a spec that works completely, in particular when lookarounds, capturing parentheses, and backreferences are nested within each other.
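The two behaviours listed above can be tried directly. This is a small sketch that assumes an engine with the right-to-left lookbehind semantics discussed in this thread, such as a V8 build with lookbehinds enabled. The first pattern is the one quoted above; the "hodor" patterns and all variable names are added here purely for illustration.

```javascript
// Capture groups are numbered left to right, even inside a lookbehind.
const numbered = /(?<=((.)(.))(.))./.exec("abcdef");
console.log(Array.from(numbered)); // [ 'd', 'ab', 'a', 'b', 'c' ]

// A backreference to the LEFT of its group works: the lookbehind body is
// matched right to left, so (o) has already captured when \1 is reached.
const backrefLeft = /(?<=\1d(o))r/.exec("hodor");
console.log(Array.from(backrefLeft)); // [ 'r', 'o' ]

// A backreference to the RIGHT of its group is reached before the group has
// captured anything, so it matches the empty string and "d" then fails
// against the "o" just before "r".
const backrefRight = /(?<=(o)d\1)r/.exec("hodor");
console.log(backrefRight); // null
```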
There is also a compact version, which is similar to Claude's spec: www.akenotsuki.com/misc/srell/en/lookbehinds-spec-c.html
Nozomu
I wonder if the person who wrote the spec for RegExp is on this list. I would like to ask one question: Was there any reason why the following steps were defined in the present order:
21.2.2.4 Alternative The production Alternative :: Alternative Term evaluates as follows:
- Evaluate Alternative to obtain a Matcher m1.
- Evaluate Term to obtain a Matcher m2.
instead of:
The production Alternative :: Term Alternative evaluates as follows:
- Evaluate Term to obtain a Matcher m1.
- Evaluate Alternative to obtain a Matcher m2.
or, was it a matter of preference? If any side effect I am missing exists in the latter order, I need to reconsider or abandon my compact version.
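To illustrate the question, here is a minimal sketch of continuation-passing matchers in the spirit of the spec's Matcher and Continuation. It is not the spec text: char, seq, run, and the state shape { input, pos } are hypothetical simplifications. It builds the same three-term sequence with left nesting (Alternative :: Alternative Term) and with right nesting (Alternative :: Term Alternative) and runs both.

```javascript
// A matcher takes a state and a continuation; it returns whatever the
// continuation returns on success, or null on failure.
const char = (ch) => (x, c) =>
  x.input[x.pos] === ch ? c({ input: x.input, pos: x.pos + 1 }) : null;

// Sequencing: run m1, then continue with m2, then with the outer continuation.
const seq = (m1, m2) => (x, c) => m1(x, (y) => m2(y, c));

const a = char("a");
const b = char("b");
const c3 = char("c");

// Alternative :: Alternative Term  gives the grouping ((a b) c).
const leftNested = seq(seq(a, b), c3);
// Alternative :: Term Alternative  gives the grouping (a (b c)).
const rightNested = seq(a, seq(b, c3));

// Start at position 0; the final continuation reports the end position.
const run = (m, input) => m({ input, pos: 0 }, (y) => y.pos);

console.log(run(leftNested, "abc"), run(rightNested, "abc")); // 3 3
console.log(run(leftNested, "abd"), run(rightNested, "abd")); // null null
```

In this simplified model the two groupings visit the terms in the same order and give the same result, because sequencing through continuations is associative.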
Nozomu
On 12/11/2015 13:16, Nozomu Katō wrote:
I wonder if the person who wrote the spec for RegExp is on this list. I would like to ask one question: Was there any reason why the following steps were defined in the present order:
21.2.2.4 Alternative The production Alternative :: Alternative Term evaluates as follows:
- Evaluate Alternative to obtain a Matcher m1.
- Evaluate Term to obtain a Matcher m2.
instead of:
The production Alternative :: Term Alternative evaluates as follows:
- Evaluate Term to obtain a Matcher m1.
- Evaluate Alternative to obtain a Matcher m2.
or, was it a matter of preference? If any side effect I am missing exists in the latter order, I need to reconsider or abandon my compact version.
Those appear to have equivalent behavior. I just picked one when writing the RegExp spec.
Waldemar
On Wed, 30 Dec 2015 14:56:56 -0800, Waldemar Horwat wrote:
Those appear to have equivalent behavior. I just picked one when writing the RegExp spec.
Thank you very much indeed for your comment. In that case, my compact version is likely to work as expected.
But it was first demonstrated by Claude that modifying "21.2.2.4 Alternative" is the simplest way to support lookbehind assertions. So, I think the spec for lookbehinds has basically been prepared by Claude.
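The idea of changing only the sequencing step can be sketched as follows. This is a rough model under stated assumptions, not Claude's spec text: the direction parameter, the helpers char and seq, the function matchesBehind, and the state shape { input, pos } are all hypothetical. Going backward, a character matcher reads the character just before the current position, and a sequence runs its right-hand term first.

```javascript
// Match one character, moving right (forward) or left (backward).
const char = (ch, direction) => (x, c) => {
  const forward = direction === "forward";
  const at = forward ? x.pos : x.pos - 1; // index of the character to test
  if (x.input[at] !== ch) return null;    // also fails when off either end
  return c({ input: x.input, pos: forward ? x.pos + 1 : x.pos - 1 });
};

// Sequencing with a direction switch: backward runs m2 before m1.
const seq = (m1, m2, direction) => (x, c) =>
  direction === "forward"
    ? m1(x, (y) => m2(y, c))
    : m2(x, (y) => m1(y, c));

// Body of a lookbehind such as (?<=ab), built with direction "backward".
const body = seq(char("a", "backward"), char("b", "backward"), "backward");

// True if "ab" ends exactly at position pos of input.
const matchesBehind = (input, pos) =>
  body({ input, pos }, (y) => y.pos) !== null;

console.log(matchesBehind("abc", 2)); // true
console.log(matchesBehind("abc", 1)); // false
console.log(matchesBehind("bac", 2)); // false
```

Only char and seq know about the direction here; the rest of the matcher stays unchanged, which is what makes this approach compact.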
Nozomu