Proposal: Expose offsets for capturing groups in regular expression matches

# Sebastian Zartner (9 years ago)

Hello everyone,

For advanced processing of capturing groups in regular expressions, I'd like to propose exposing their offsets within the results of executing an expression on a string.

The complete proposal can be found at https://github.com/SebastianZ/es-proposal-regexp-capturing-group-offsets.

I'd like it to be added to the Stage 0 proposals <https://github.com/tc39/proposals/blob/master/stage-0-proposals.md>, and I'm asking for feedback and a champion to help me bring it into shape and get it into the standard.

Thank you in advance,

Sebastian


# Erik Corry (8 years ago)

This would be great. Can I suggest that both the start and the end of each match be included? Instead of "offsets" you would then have "starts" and "ends". Alternatively, "offsets" could be twice as long and hold start-end pairs.
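
To make the two suggested shapes concrete, here is a minimal sketch. The names `starts`, `ends` and `flatOffsets` are illustrative assumptions, and the positions are written out by hand; nothing here is part of the proposal or of any engine.

```javascript
// Illustration only: no property with these names exists on real match results.
const dateInput = "2016-10-31";
const dateMatch = /(\d{4})-(\d{2})-(\d{2})/.exec(dateInput);

// Hand-computed positions; index 0 is the whole match, 1..3 are the groups.
// Shape A: two parallel arrays.
const starts = [0, 0, 5, 8];
const ends = [10, 4, 7, 10];

// Shape B: one array twice as long, holding start-end pairs.
const flatOffsets = [0, 10, 0, 4, 5, 7, 8, 10];

// Either shape lets a caller slice a group back out of the input.
const secondGroup = dateInput.slice(starts[2], ends[2]);
const sameGroup = dateInput.slice(flatOffsets[4], flatOffsets[5]);
console.log(secondGroup, sameGroup);
```

In both shapes the end is exclusive, which matches how `String.prototype.slice` interprets its second argument.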


# T.J. Crowder (8 years ago)

Excellent idea, and nice and simple as well. I wouldn't think adding a property to the match result would rattle cages; it'll be interesting to find out.

@Erik - I prefer the proposal's approach to offsets. If you need to know where the end is, you can always add the length of the captured text, so the information is already there.

-- T.J. Crowder
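
The point about deriving the end from the start can be sketched as follows. This assumes a hypothetical `offsets` array whose entry `i` is the start of `match[i]` (with `undefined` for a group that did not participate); the array is built by hand here because no engine provides it, and `groupRanges` is an illustrative helper name.

```javascript
// Assumption: offsets[i] is the start of match[i] in the input, or undefined
// when group i did not participate in the match.
function groupRanges(match, offsets) {
  return Array.from(match, (text, i) =>
    text === undefined || offsets[i] === undefined
      ? undefined
      : { start: offsets[i], end: offsets[i] + text.length }
  );
}

const keyValueInput = "a; key=value";
const keyValue = /(\w+)=(\w+)(;)?/.exec(keyValueInput);

// Hand-written stand-in for the proposed offsets; the third group is unmatched.
const assumedOffsets = [3, 3, 7, undefined];
const ranges = groupRanges(keyValue, assumedOffsets);
console.log(ranges);
```

The end offsets are exclusive, so `keyValueInput.slice(range.start, range.end)` returns the captured text again.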


# Jordan Harband (8 years ago)

Adding a property to the match result is indeed tricky.

Not sure if you're already aware that named capture groups are stage 3: https://github.com/tc39/proposal-regexp-named-groups


# Sebastian Zartner (8 years ago)

Hi, and sorry for the delayed response!

On 24 March 2017 at 01:47, Jordan Harband <ljharb at gmail.com> wrote:

> Adding a property to the match result is indeed tricky.

Why? The match result already has the properties index and input (see http://www.ecma-international.org/ecma-262/6.0/#sec-regexpbuiltinexec).
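
For reference, a short sketch of the two existing properties mentioned here. It uses only standard `RegExp.prototype.exec` behavior; the sample string and the variable name are arbitrary.

```javascript
const isoMatch = /-(\d{2})-(\d{2})/.exec("2016-10-31");

console.log(isoMatch[0]);    // "-10-31"
console.log(isoMatch.index); // 4, the start of the whole match only
console.log(isoMatch.input); // "2016-10-31"

// The groups themselves are plain strings with no position attached.
console.log(isoMatch[1], isoMatch[2]);
```

Note that `index` describes the overall match only, which is the gap the proposal aims to close for individual groups.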

> Not sure if you're already aware that named capture groups are stage 3: https://github.com/tc39/proposal-regexp-named-groups

Thank you for the reference! When I wrote my proposal (which was the last time I looked), named capture groups were at stage 0. I read about that proposal back then, but it doesn't provide a way to get the related offsets either. My proposal could be changed to work together with named capture groups (i.e. by adding the offsets to the groups property), though the offsets should also be available when the capture groups are not named.
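
As a small sketch of the limitation described above: with the named capture groups syntax (which has since been standardized in ES2018), the groups property maps names to captured strings only. The sample pattern and variable name are arbitrary.

```javascript
const named = /(?<year>\d{4})-(?<month>\d{2})/.exec("on 2016-10");

// Each entry in groups is just the captured text.
console.log(named.groups.year, named.groups.month);
console.log(typeof named.groups.month); // "string"

// Only the position of the whole match is reported.
console.log(named.index); // 3
```

Nothing in this result says where the month group begins inside the input.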

On Thu, Mar 23, 2017 at 8:08 AM, T.J. Crowder <tj.crowder at farsightsoftware.com> wrote:

> Excellent idea, and nice and simple as well. I wouldn't think adding a property to the match result would rattle cages, it'll be interesting to find out.
>
> @eric - I prefer the proposal's approach to offsets. If you need to know where the end is, you can always add the length of the captured text, so the information is already there.
>
> -- T.J. Crowder
>
> On Thu, Mar 23, 2017 at 3:00 PM, Erik Corry <erik.corry at gmail.com> wrote:
>
>> This would be great. Can I suggest that both the start and end of each match should be there. So instead of offsets you would have "starts" and "ends". Alternatively, offsets should be twice as long with start-end pairs in it.

Initially I had a more advanced approach to this <https://github.com/SebastianZ/es-proposal-regexp-capturing-group-offsets/blob/983af857ec2dce5e3c0af5e8438ca6dc8d74c3f0/README.md>, similar to the named capture groups proposal, which included start and end properties for each group. However, after a discussion related to String.prototype.matchAll() <https://github.com/tc39/String.prototype.matchAll/issues/13>, and because the end offset can easily be calculated from the start offset and the length of the captured group, I thought a simpler approach would gain traction more easily. This approach also conforms to how other languages such as ColdFusion solve this. Having said that, I am happy to discuss different approaches.

Sebastian


# Sebastian Zartner (8 years ago)

I've filed issue https://github.com/tc39/proposal-regexp-named-groups/issues/21 to start the discussion there.

Sebastian

On 30 March 2017 at 12:25, Daniel Ehrenberg <littledan at chromium.org> wrote:

> I've been trying to organize feedback on the named captures proposal in
> bugs on the GitHub repository at
> https://github.com/tc39/proposal-regexp-named-groups/issues. I'd be happy
> to have your input. This proposal's implementation is in progress in V8.
>
> Dan