Proposal: Expose offsets for capturing groups in regular expression matches

# Sebastian Zartner (7 years ago)

Hello everyone,

For advanced processing of capturing groups in regular expressions, I'd like to propose exposing their offsets in the result of executing an expression on a string.

The complete proposal can be found at SebastianZ/es-proposal-regexp-capturing-group-offsets.

I'd like it to be added to the Stage 0 proposals (tc39/proposals/blob/master/stage-0-proposals.md), and I'm asking for feedback and a champion to help me bring it into shape and get it into the standard.

Thank you in advance,

Sebastian

# Erik Corry (7 years ago)

This would be great. Can I suggest that both the start and end of each match be there? So instead of offsets you would have "starts" and "ends". Alternatively, offsets could be twice as long, holding start-end pairs.

# T.J. Crowder (7 years ago)

Excellent idea, and nice and simple as well. I wouldn't think adding a property to the match result would rattle cages, it'll be interesting to find out.

@Erik - I prefer the proposal's approach to offsets. If you need to know where the end is, you can always add the length of the captured text to the start offset, so the information is already there.
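The arithmetic described here can be sketched as follows. This is only an illustration: `captureEnds` is a hypothetical helper, and the `offsets` array is written by hand to stand in for the proposed property, which match results do not provide.

```javascript
// Hypothetical helper: derive each end offset as start + length of the captured text.
// Groups that did not participate in the match are undefined and stay undefined.
function captureEnds(match, offsets) {
  return offsets.map((start, i) =>
    match[i] === undefined ? undefined : start + match[i].length
  );
}

const isoMatch = /(\d{4})-(\d{2})/.exec("on 2017-03");

// Assumption: start offsets as the proposal might expose them, filled in by hand.
// Index 0 is the whole match, 1 and 2 are the capturing groups.
const handWrittenOffsets = [3, 3, 8];

const derivedEnds = captureEnds(isoMatch, handWrittenOffsets);
console.log(derivedEnds); // [10, 7, 10]
```

The end offsets here are exclusive, so `input.slice(start, end)` gives back the captured text.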

-- T.J. Crowder

# Jordan Harband (7 years ago)

Adding a property to the match result is indeed tricky.

Not sure if you're already aware that named capture groups are stage 3: tc39/proposal

# Sebastian Zartner (7 years ago)

Sorry for the delayed response!

On 24 March 2017 at 01:47, Jordan Harband <ljharb at gmail.com> wrote:

Adding a property to the match result is indeed tricky.

Why? The match result already has the properties index and input www.ecma-international.org/ecma-262/6.0/#sec-regexpbuiltinexec.
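For illustration, a short example of the existing `index` and `input` properties on the array returned by `RegExp.prototype.exec`. The pattern and input string are made up for this sketch.

```javascript
// exec() returns an array with two extra properties: `index` and `input`.
const dateMatch = /(\d{4})-(\d{2})/.exec("on 2017-03");

console.log(dateMatch[0]);    // "2017-03" (the whole match)
console.log(dateMatch[1]);    // "2017"    (first capturing group)
console.log(dateMatch.index); // 3         (offset of the whole match only)
console.log(dateMatch.input); // "on 2017-03"
```

Note that `index` refers to the whole match; there is no equivalent for the individual groups.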

Not sure if you're already aware that named capture groups are stage 3: tc39/proposal-regexp-named-groups

Thank you for the reference! At the time I wrote my proposal (which was the last time I looked), named capture groups were at stage 0. I read about that proposal back then, but it doesn't provide a way to get the related offsets, either. My proposal may be changed to work together with named capture groups (e.g. by adding the offsets to the groups property), though the offsets should also be available when the capturing groups are not named.
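As a small sketch of that point, assuming an engine that supports the named capture groups syntax: the `groups` object maps each name to the captured string and carries no position information. The pattern and input are made up for illustration.

```javascript
// Named groups appear as string values on `groups`; positions are not included.
const namedMatch = /(?<year>\d{4})-(?<month>\d{2})/.exec("on 2017-03");

console.log(namedMatch.groups.year);         // "2017"
console.log(namedMatch.groups.month);        // "03"
console.log(typeof namedMatch.groups.month); // "string"
console.log(namedMatch.index);               // 3 (whole match only)
```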

On Thu, Mar 23, 2017 at 8:08 AM, T.J. Crowder <tj.crowder at farsightsoftware.com> wrote:

Excellent idea, and nice and simple as well. I wouldn't think adding a property to the match result would rattle cages, it'll be interesting to find out.

@Erik - I prefer the proposal's approach to offsets. If you need to know where the end is, you can always add the length of the captured text to the start offset, so the information is already there.

-- T.J. Crowder

On Thu, Mar 23, 2017 at 3:00 PM, Erik Corry <erik.corry at gmail.com> wrote:

This would be great. Can I suggest that both the start and end of each match be there? So instead of offsets you would have "starts" and "ends". Alternatively, offsets could be twice as long, holding start-end pairs.

Initially, I had a more advanced approach (SebastianZ/es-proposal-regexp-capturing-group-offsets/blob/983af857ec2dce5e3c0af5e8438ca6dc8d74c3f0/README.md), similar to the named capture groups proposal, that included start and end properties for each group. However, after a discussion related to String.prototype.matchAll() (tc39/String.prototype.matchAll#13), and because the end offset can easily be calculated from the start offset and the length of the captured group, I thought a simpler approach would gain traction more easily. This approach also matches how other languages, such as ColdFusion, solve this. Having said that, I am happy to discuss different approaches.

Sebastian

# Sebastian Zartner (7 years ago)

I've filed issue tc39/proposal-regexp-named-groups#21 to start the discussion there.

Sebastian