Look-behind proposal in trouble

# Nozomu Katō (10 years ago)

My proposal for adding look-behind assertions to RegExp appears to be in
trouble. I would like to ask for help from anyone who can offer it.

Here is what I know has happened to the proposal since my previous post:

I created a pull request for the proposal in July and sent an email to
Brendan Eich asking whether I could list him as its champion:
https://github.com/tc39/ecma262/pull/48

I have not received a reply to my email, but in September I received a
notification email, sent in reply to the pull request, saying that the
proposal had been moved to stage 0. Today, however, I noticed that the
proposal has been dropped from stage 0, with the message
"RegExp lookbehind has no champion".
https://github.com/tc39/ecma262/commits/master/stage0.md (Oct 4, 2015)

I am uncertain about what happened. Does this mean that Brendan Eich is
no longer the champion, or that he never took on the role in the first
place, or ...?


Regards,
  Nozomu

# Brian Terlson (10 years ago)

Hi Nozomu,

Brendan has indeed discovered he doesn't have time to champion the proposal through TC39, so I removed it while I searched for a new champion. Good news on that front - I have found one! Gorkem Yakin works on the Chakra team and is available to help move this proposal forward. I will also help out where I can. I've added the proposal back to the stage 0 list!

Thanks,
Brian


# Nozomu Katō (10 years ago)

Hello Brian,

Thank you very much for your email and for bringing such good news! I
thought that my proposal might not be able to move forward anymore.

I am also grateful that you searched for a new champion and that Gorkem
is taking on this proposal!

Regards,
  Nozomu


# Sebastian Zartner (10 years ago)

Hi all,

Brian, where can people find information about the reasons for such
decisions (besides asking), and more generally about the processes behind
ES development?

I was following Nozomu's proposal[1] closely, though to me it looked like
progress on it had just died out.

Nonetheless, great to hear that new champions could be found!

Sebastian

[1] https://mail.mozilla.org/pipermail/es-discuss/2015-May/042910.html

# Erik Corry (10 years ago)

Your proposal for look-behind relies on being able to count the match length of the look-behind in order to step back that far. This presupposes that atoms like . and character classes have a fixed length.

However, with the /u flag, the . and some character classes can match either one or two code units. This means you don't know how far to step back. This needs to be fixed in a way that is not incompatible with the "correct" .NET way of doing things.

E.g. matching /a.(?<!x..)/u against "xa😹" (x, a, cat-face-with-tears-of-joy, which is a surrogate pair). The look-behind has an apparent width of 3, so we step back 3 code units, but that lands on the 'a', not the 'x', and so the look-behind fails to spot the 'x'.
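To make the arithmetic in that example concrete, here is a small JavaScript sketch. The variable names are illustrative only and do not come from the proposal; the sketch just counts the example string in code units and in code points, and shows where a naive three-code-unit step back would land.

```javascript
// "xa😹": x, a, then U+1F639, which UTF-16 encodes as a surrogate pair.
const input = "xa\u{1F639}";

const codeUnitLength = input.length;        // counts UTF-16 code units
const codePointLength = [...input].length;  // string iteration yields code points

// After /a./u has matched "a😹", the cursor sits at the end of the string.
const cursor = input.length;

// Naive approach: treat each of the three atoms in x.. as one code unit wide.
const naiveStart = cursor - 3;

console.log(codeUnitLength, codePointLength); // 4 3
console.log(naiveStart, input[naiveStart]);   // 1 a
```

The naive start index is 1, which is the 'a', whereas the 'x' that the look-behind should inspect is at index 0.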

# Erik Corry (10 years ago)

(edit made to previous post)

Oops, forgot the /u flag on the regexp in the example.

# Claude Pache (10 years ago)

This should not be a problem: with the /u flag, you work with code points, not code units. In particular, the `.` always matches a sequence of length 1 (one code point with /u, one code unit otherwise).
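A short illustration of that difference, as a sketch that uses only forward matching so it should run in any ES2015 engine (the constant names are made up for the example):

```javascript
const cat = "\u{1F639}"; // one code point, two UTF-16 code units

// Without /u the pattern works on code units, so one `.` sees half the pair.
const oneDotNoU = /^.$/.test(cat);   // false
const twoDotsNoU = /^..$/.test(cat); // true

// With /u the pattern works on code points, so one `.` consumes the whole pair.
const oneDotU = /^.$/u.test(cat);    // true
const twoDotsU = /^..$/u.test(cat);  // false

console.log(oneDotNoU, twoDotsNoU, oneDotU, twoDotsU); // false true true false
```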

—Claude


# Erik Corry (10 years ago)

The proposal needs to be clarified to explain that you are stepping back a number of code points, not code units. This implies that you are inspecting the input string as you step backwards. It should also explain what to do if there are unpaired surrogates in the input string or inside the lookbehind expression source.
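As a rough sketch of what "inspecting the input string as you step backwards" could look like, the hypothetical helper below moves one code point to the left. It is not taken from the proposal or from any engine, and it assumes that an unpaired surrogate counts as a single character, which is only one possible answer to the question raised above.

```javascript
const isHighSurrogate = (u) => u >= 0xd800 && u <= 0xdbff;
const isLowSurrogate = (u) => u >= 0xdc00 && u <= 0xdfff;

// Hypothetical helper: returns the index one code point to the left of `pos`,
// or -1 if `pos` is already at the start of the string.
function stepBackCodePoint(input, pos) {
  if (pos <= 0) return -1;
  const last = input.charCodeAt(pos - 1);
  if (
    isLowSurrogate(last) &&
    pos >= 2 &&
    isHighSurrogate(input.charCodeAt(pos - 2))
  ) {
    return pos - 2; // well-formed surrogate pair: one code point, two code units
  }
  return pos - 1; // BMP character, or (by assumption) an unpaired surrogate
}

// Walking back over "xa😹" from the end visits indices 2, 1, 0.
const text = "xa\u{1F639}";
const visited = [];
for (let p = stepBackCodePoint(text, text.length); p >= 0; p = stepBackCodePoint(text, p)) {
  visited.push(p);
}
console.log(visited.join(",")); // 2,1,0
```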

I think the proposal would benefit from a pointer to an implementation or two. Of course the implementations should also fully support /u.

# Claude Pache (10 years ago)

> On 7 Oct 2015, at 11:16, Erik Corry <erik.corry at gmail.com> wrote:
> 
> The proposal needs to be clarified to explain that you are stepping back a number of code points, not units.  This implies that you are inspecting the input string as you step backwards.  Also it should be explained what to do if there are unpaired surrogates in the input string and inside the lookbehind expression source.

Looking at the proposal [1], there is a Note section (recently added) clarifying that point if needed. 

The way of counting, and the meaning of the words "character", "code point" and "code unit", are the same as in ES2015; there is really nothing new here. See [2] for details. If anything needs to be clarified e.g. regarding unpaired surrogates, it is not specific to lookbehind, but applies to the whole regexp semantics.

—Claude

[1] http://www.akenotsuki.com/misc/srell/lookbehind_proposal.html
[2] http://www.ecma-international.org/ecma-262/6.0/#sec-pattern-semantics


# Nozomu Katō (10 years ago)

What Claude mentioned is already part of the specification: "Input is a
List consisting of all of the characters" and "Each character is either
a code unit or a code point, depending upon the kind of pattern
involved" (21.2.2.1).

But I added the Note section to the page of my proposal for
clarification two days ago because I was asked a similar question.

Incidentally, in the initial version of the proposal I used the term
"code point" but later changed it to "character" after Allen pointed
this out:
https://mail.mozilla.org/pipermail/es-discuss/2015-May/042922.html

Regards,
  Nozomu


# Brian Terlson (10 years ago)

Sebastian,

You can follow the tc39/ecma262 GitHub repository for updates on proposals. It also contains information about our process.

# Erik Corry (10 years ago)

I made an implementation of .NET-style variable length lookbehinds.  It's
not in a JS engine, but it's in a very simple (and very slow)
ES5-compatible regexp engine that is used in the tiny Dart implementation
named Fletch.

No Unicode issues arise since this engine does not support /u, but I don't
expect any issues since it's not trying to second-guess the length of the
string matched by an expression.

Needs a lot more tests, but it seems to work OK and was surprisingly simple
to do.  Basically:

* All steps in the input string are reversed, so if you would step forwards
you step backwards.
* Check for start of string instead of end of string.
* Test against the character to the left of the cursor instead of to the
right.
* The parts of the Alternative (see the regexp grammar in the standard) are
code-generated in reverse order.

Code is here: https://codereview.chromium.org/1398033002/
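
The four points in the list can be sketched in a few lines of JavaScript. This is not the Fletch code linked above; `lit`, `dot` and `matchBackward` are hypothetical names, and the sketch handles only a flat sequence of single-character atoms with no quantifiers, groups or backtracking. It is meant only to show the direction reversal.

```javascript
// Atoms are predicates on a single code unit (no /u, as in the engine described).
const lit = (c) => (ch) => ch === c;
const dot = (ch) => !/[\n\r\u2028\u2029]/.test(ch);

// Hypothetical helper: tries to match `atoms` so that the match ends at `pos`.
// Returns the index where the match starts, or -1 on failure.
function matchBackward(atoms, input, pos) {
  // The parts of the Alternative are visited in reverse order.
  for (let i = atoms.length - 1; i >= 0; i--) {
    if (pos === 0) return -1;                  // start of string instead of end
    if (!atoms[i](input[pos - 1])) return -1;  // test the character left of the cursor
    pos -= 1;                                  // step backwards instead of forwards
  }
  return pos;
}

// x.. as the body of a look-behind, evaluated at the end of each input.
const body = [lit("x"), dot, dot];
console.log(matchBackward(body, "xab", 3)); // 0
console.log(matchBackward(body, "yab", 3)); // -1
console.log(matchBackward(body, "ab", 2));  // -1
```

Because the matcher walks the input itself instead of computing a width up front, nothing in it depends on the look-behind body having a fixed length.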



# Nozomu Katō (10 years ago)

Me too; I once implemented lookbehind assertions this way in SRELL, my C++ template library whose engine is compatible with RegExp of ECMAScript but whose class design is compatible with std::regex of C++ [1].

However, I later removed the code for such lookbehinds and adopted Perl5 style lookbehinds instead. The core reasons are:

1. Right-to-left matchers are used only in lookbehind assertions;
2. Nevertheless, they cannot share code with normal (left-to-right) matchers and need their own optimization processes.

Thus, I came to feel that what I get and what I have to do for it are out of balance.

In my understanding, the features that are available in .NET style lookbehinds but are unavailable in Perl5 style lookbehinds, and cannot even be emulated there, are 1) backreferences and 2) quantifiers other than {n}. The others can be emulated in some way.

For example, the positive multiple-length lookbehind (?<=ab|cde) can be replaced by (?:(?<=ab)|(?<=cde)). Replacing a negative multiple-length lookbehind is simpler: just write the assertions in succession; for example, (?<!ab|cde) can be written as (?<!ab)(?<!cde).
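
The two substitutions can be checked side by side in any JavaScript engine that already accepts lookbehind syntax (an assumption; older engines reject these literals with a syntax error). The helper `matchIndices` and the sample input are made up for this sketch.

```js
// Collect the start index of every match of a global regexp.
function matchIndices(re, input) {
  return [...input.matchAll(re)].map((m) => m.index);
}

const input = "abx cdex bx dex abcdex";

// Positive case: one alternation inside the lookbehind
// vs. one lookbehind per alternative.
const positiveCombined = matchIndices(/(?<=ab|cde)x/g, input);
const positiveSplit = matchIndices(/(?:(?<=ab)|(?<=cde))x/g, input);

// Negative case: one alternation vs. assertions written in succession.
const negativeCombined = matchIndices(/(?<!ab|cde)x/g, input);
const negativeSplit = matchIndices(/(?<!ab)(?<!cde)x/g, input);

console.log(positiveCombined, positiveSplit); // both [2, 7, 21]
console.log(negativeCombined, negativeSplit); // both [10, 14]
```

Each split form uses only fixed-length lookbehinds, which is what makes it expressible in the Perl5 style.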

I guess that oniguruma supports expressions like (?<=ab|cde) by doing such substitutions inside the library, but that is just my guess.

So, I came to feel that Perl5 style lookbehinds are balanced. They may not be the best, though. In fact, the current implementation of lookbehinds in my library is far simpler; it shares code with lookaheads. If the count to rewind is 0, it is a lookahead; otherwise (1 or more) it is a lookbehind.
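
A minimal sketch of that rewind-count idea, written in JavaScript rather than C++. It is not SRELL's code: `lookaround` and its parameters are hypothetical, and it assumes the body of a lookbehind always matches exactly `rewind` characters.

```js
// Hypothetical helper: evaluate a positive lookaround at cursor position pos.
// rewind === 0 behaves as a lookahead; rewind >= 1 behaves as a fixed-length
// lookbehind whose body is assumed to consume exactly `rewind` characters.
function lookaround(bodySource, rewind, input, pos) {
  const start = pos - rewind;
  if (start < 0) return false;               // not enough text to the left
  const body = new RegExp(bodySource, "y");  // sticky: must match at lastIndex
  body.lastIndex = start;
  return body.test(input);                   // the body is matched left to right
}

console.log(lookaround("ab", 2, "xabz", 3)); // lookbehind (?<=ab) before "z": true
console.log(lookaround("z", 0, "xabz", 3));  // lookahead (?=z) at "z": true
console.log(lookaround("ab", 2, "ab", 1));   // cannot rewind past the start: false
```

Because the body is always matched left to right, a capture inside a {n} loop keeps its last (rightmost) iteration, just as it does outside a lookbehind.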

If we were to introduce .NET style lookbehinds into RegExp of ECMAScript, someone would need to write right-to-left versions of most of the definitions under 21.2 of the specification.

Nozomu

[1] http://www.akenotsuki.com/misc/srell/en/


# Claude Pache (10 years ago)

Note that full-featured lookbehind assertions (à la .NET) are not the only case where backward matching is useful. Consider, for instance, the following simple method:

```js
String.prototype.trimRight = function () {
    return this.replace(/\s+$/u, '') 
}
```

That implementation would be more efficient if we could instruct the regexp to be applied backwards.
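
To make the point concrete, here is a hand-written backward scan that stands in for what a backward-applied regexp might do. It is only an illustration: `trimRightBackward` is a hypothetical name, and no existing RegExp option is implied.

```js
// Walk from the end of the string toward the start and stop at the first
// non-whitespace character, so only the trailing run is ever inspected.
function trimRightBackward(str) {
  let end = str.length;
  while (end > 0 && /\s/.test(str[end - 1])) {
    end--;
  }
  return str.slice(0, end);
}

const sample = "a  b   c \t\n";
console.log(JSON.stringify(trimRightBackward(sample)));
console.log(trimRightBackward(sample) === sample.replace(/\s+$/u, ""));
```

A forward application of /\s+$/ may start a match attempt at each run of whitespace inside the string before reaching the one that actually ends it, whereas the backward scan touches only the characters it removes plus one.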

—Claude



# Erik Corry (10 years ago)

I'm not convinced that the current proposal is easier to implement than the real thing. Take a look at the patch; it's trivial.

The lack of variable length lookbehind is a big annoyance in most languages. Search for the term and you'll find lots of frustrated Perl users.

On the other hand I don't think adding variable length lookbehind to the spec makes it any easier to optimize /.+$/.


# Nozomu Katō (10 years ago)

Since there was a comment about Perl5 style vs .NET style when I first posted my proposal to es-discuss, too, I just wanted to explain the background of my proposal. I proposed Perl5 compatible lookbehinds because I thought they were relatively simple to implement. Moreover, I am not confident that I could write a lookbehind proposal based on the .NET implementation in the manner used in the ECMAScript specification.

As Jason Orendorff wrote before, the lookbehind supported by .NET is a strict superset of what I have proposed. So, if you or someone else submits another lookbehind proposal based on .NET and it supersedes my proposal in a later version of ECMAScript, from the point of view of users, that would look like just an enhancement.

My hope is that lookbehind assertions will be supported in RegExp of ECMAScript in the near future.


# Nozomu Katō (10 years ago)

(edit incorporated into above post)

Sorry,

I wrote:
> Since there was a comment about Perl5 style vs .NET style when I first
> posted my proposal to es-discuss,

I meant "when I first posted my proposal to es-discuss, too, ".

# Waldemar Horwat (10 years ago)

On 10/09/2015 15:07, Nozomu Katō wrote:

> As Jason Orendorff wrote before, the lookbehind supported by .NET is a strict superset of what I have proposed. So, if you or someone else submits another lookbehind proposal based on .NET and it supersedes my proposal in a later version of ECMAScript, from the point of view of users, that would look like just an enhancement.

It's not a superset. Captures would match differently.


# Nozomu Katō (10 years ago)

Hmm... I am getting confused. For now, I will wait for TC39's decision on my proposal.


# Erik Corry (10 years ago)

On Sat, Oct 10, 2015 at 12:47 AM, Waldemar Horwat <waldemar at google.com> wrote:

> It's not a superset. Captures would match differently.

Can you elaborate? How would they be different?


# Erik Corry (10 years ago)

Just for the lulz I ran the tests I could find from Perl 5 (which I think is very similar to the proposal here) and the captures were identical when using .NET-style reverse capturing. It's not a huge number of tests, though.


# Waldemar Horwat (10 years ago)

On 10/10/2015 03:48, Erik Corry wrote:

> Can you elaborate? How would they be different?

If you have a capture inside a loop (controlled, say, by {n}), one of the proposals would capture the first instance, while the other proposal would capture the last instance.


# Erik Corry (10 years ago)

Yes, that makes sense.

This could be fixed by removing {n} loops from positive lookbehinds. Or by doing the .NET-style back-references immediately.


# Waldemar Horwat (10 years ago)

On 10/13/2015 02:18, Erik Corry wrote:

> Yes, that makes sense.
>
> This could be fixed by removing {n} loops from positive lookbehinds. Or by doing the .NET-style back-references immediately.

I think it would be cleanest to do the full reverse-order matching (what I think you're calling .NET-style) from the start.

Waldemar


# Nozomu Katoo (10 years ago)

Erik Corry wrote on Tue, 13 Oct 2015 at 11:18:48 +0200:

> Yes, that makes sense.
>
> This could be fixed by removing {n} loops from positive lookbehinds. Or by doing the .NET-style back-references immediately.

Personally, I am reluctant to intentionally remove any feature from the current proposal for the sake of a future proposal that may or may not come. It might end up only making the lookbehinds of ECMAScript and those of Perl 5 incompatible.

> On 10/10/2015 03:48, Erik Corry wrote:
>> On Sat, Oct 10, 2015 at 12:47 AM, Waldemar Horwat wrote:
>>> It's not a superset. Captures would match differently.
>>
>> Can you elaborate? How would they be different?
>
> If you have a capture inside a loop (controlled, say, by {n}), one of the proposals would capture the first instance, while the other proposal would capture the last instance.

I was missing that point. I just confirmed that

perl -e "$a = 'abcdef'; $a =~ /(?<=.(.){2}.)./; print $1;"

returned 'c' whereas .NET returned 'b'. An implementation based on my proposal would return the same result as Perl 5.
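
The same pattern can be tried in a JavaScript engine that accepts lookbehind syntax. The sketch below assumes an engine that matches lookbehind bodies right to left (the .NET-style behaviour discussed in this thread); under that assumption the capture is expected to agree with .NET rather than with Perl 5.

```js
// Outside a lookbehind, a capture inside a {n} loop keeps its last iteration,
// which in left-to-right matching is the rightmost character.
const forward = /.(.){2}./.exec("abcd");
console.log(forward[1]); // "c"

// Inside a right-to-left lookbehind the loop runs over "c" first and "b"
// second, so the last iteration is now the leftmost character.
const behind = /(?<=.(.){2}.)./.exec("abcdef");
console.log(behind[0], behind[1]); // "e" "b" under the stated assumption
```

Rewinding four characters and matching the body forward, as in the Perl 5 style, corresponds to the first case and therefore yields 'c'.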

By the way, at one point in this thread, I moved some email addresses from To to Cc when sending my reply. But somehow several of them had disappeared from the Cc field in the delivered email while they all remain in a copy in my sent-email folder. I apologize to those who received disconnected emails in this thread.

Regards,
Nozomu


# Erik Corry (10 years ago)

I made a playground where you can try out regexps with lookbehind.

https://dartpad.dartlang.org/8feea83c01ab767acdf1


# Erik Corry (10 years ago)

And here's a similar playground for .NET, not by me:

http://www.regexplanet.com/advanced/dotnet/index.html


# Yang Guo (10 years ago)

The experimental implementation [0] in V8 landed a few days ago and is included in the latest Canary build (49.0.2568.0 and later). You can test it after enabling it with the command line flag --js-flags="--harmony-regexp-lookbehind". Apparently the port to SpiderMonkey is already underway [1].

This implementation supports variable length lookbehind similar to .NET's semantics. It does so by emitting code to read backwards inside the lookbehind. The size of the change without platform ports and tests is about 600 lines.
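
To check from script whether a given build has lookbehind enabled, something like the following sketch could be used. `supportsLookbehind` is a made-up helper; it goes through the RegExp constructor so that an engine without the feature throws at run time instead of rejecting the whole script at parse time.

```js
// Returns true if the engine accepts lookbehind syntax, false otherwise.
function supportsLookbehind() {
  try {
    new RegExp("(?<=a)b");
    return true;
  } catch (e) {
    return false; // engines without the feature report a syntax error here
  }
}

if (supportsLookbehind()) {
  // Variable-length lookbehind: digits preceded by one or more "x".
  const re = new RegExp("(?<=x+)\\d+");
  console.log(re.exec("xxx42"));
} else {
  console.log("lookbehind is not available in this engine");
}
```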

[0] https://codereview.chromium.org/1418963009/
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1225665
