Add regular expressions lookbehind

# Sebastian Zartner (11 years ago)

I wonder if the past discussion about lookbehinds [1] and Marc Harter's proposal for them [2] led to anything. I'd really like to see these implemented in the ECMAScript specification, and it seems I am not the only one [3, 4, 5]. This even caused people to try to mimic them [6]. So I wanted to pick up the discussion again and ask: what information was missing that kept them from being specified?

# Waldemar Horwat (11 years ago)

No one has yet submitted a well-defined proposal for lookbehinds on the table. Lookbehinds are difficult to translate into the language used by the spec and get quite fuzzy when the order of evaluation of parts of the regexp matters, which is what happens if capturing parentheses are involved. Where do you start looking for the lookbehind? Shortest first, longest first, or reverse string match? Greedy or not? Backtrack into capturing results?
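To make the ambiguity concrete, here is a minimal sketch of how the choice of starting point changes what the capturing groups hold. It is not any engine's actual algorithm: `lookbehindCaptures` is a hypothetical helper that simply tries every candidate start position for the lookbehind body, either shortest first or longest first, and returns the captures of the first candidate that matches. Because it matches against a slice of the input, anchors and `\b` inside the body would not behave as they would in a real engine.

```javascript
// Hypothetical helper (illustration only): find a match of `innerSource`
// that ends exactly at index `end`, trying candidate start positions in
// the given order, and return the captures of the first one that works.
function lookbehindCaptures(input, end, innerSource, order) {
  // Anchor the body so it must cover the whole candidate slice.
  const re = new RegExp("^(?:" + innerSource + ")$");

  // Candidate starts: end, end - 1, ..., 0 (shortest slice first).
  const starts = [];
  for (let s = end; s >= 0; s--) {
    starts.push(s);
  }
  if (order === "longest") {
    starts.reverse(); // 0, 1, ..., end (longest slice first)
  }

  for (const s of starts) {
    const m = re.exec(input.slice(s, end));
    if (m) {
      return m.slice(1); // capture groups only
    }
  }
  return null; // the lookbehind would fail at this position
}

const shortestFirst = lookbehindCaptures("1053", 4, "(\\d+)(\\d+)", "shortest");
const longestFirst = lookbehindCaptures("1053", 4, "(\\d+)(\\d+)", "longest");
console.log(shortestFirst); // [ '5', '3' ]
console.log(longestFirst);  // [ '105', '3' ]
```

Both strategies agree that the lookbehind succeeds at the end of "1053", yet they disagree on the captures, so a specification would have to pick one behavior explicitly.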

# Sebastian Zartner (11 years ago)

As I understand it, the problem with lookbehinds is not parentheses but variable lengths. Therefore many regular expression flavors only allow fixed-length lookbehinds [1], which could also be the restriction within ECMAScript for now. This restriction could then be lifted at a later point.
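For illustration, a fixed-length lookbehind with a literal prefix can be approximated in today's JavaScript without any new syntax. The sketch below is only one possible workaround, under the assumption that the prefix is a plain string; `matchAfterFixedPrefix` is a hypothetical name, not an existing API. It lets the regexp find a candidate match, checks the characters immediately before it, and retries one position further on if the check fails.

```javascript
// Hypothetical helper (illustration only): return all matches of `pattern`
// that are immediately preceded by the literal string `prefix`.
function matchAfterFixedPrefix(input, prefix, pattern) {
  // Copy the pattern with the global flag so that lastIndex is honoured.
  const flags = "g" + (pattern.ignoreCase ? "i" : "") + (pattern.multiline ? "m" : "");
  const re = new RegExp(pattern.source, flags);
  const results = [];
  let m;

  while ((m = re.exec(input)) !== null) {
    const start = m.index;
    const hasPrefix =
      start >= prefix.length &&
      input.slice(start - prefix.length, start) === prefix;

    if (hasPrefix) {
      results.push(m[0]);
      // Avoid looping forever on an empty match.
      if (m[0] === "") {
        re.lastIndex = start + 1;
      }
    } else {
      // Rejected: retry from the next position, since a shorter match
      // starting later may still be preceded by the prefix.
      re.lastIndex = start + 1;
    }
  }
  return results;
}

const prices = matchAfterFixedPrefix("USD100 $250 a1234", "$", /\d+/);
const retried = matchAfterFixedPrefix("a1234", "1", /\d+/);
console.log(prices);  // [ '250' ]
console.log(retried); // [ '234' ]
```

The fixed length is what keeps this simple: there is exactly one place to look for the prefix, so none of the ordering questions arise.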

Backtrack into capturing results?

Seems like at least PCRE does that for parenthesized parts within lookbehinds.

# Jason Orendorff (11 years ago)

Sebastian,

Here is how I interpret Waldemar's post:

On Mon, Sep 30, 2013 at 5:55 PM, Waldemar Horwat <waldemar at google.com> wrote:

No one has yet submitted a well-defined proposal for lookbehinds on the table. Lookbehinds are difficult to translate into the language used by the spec [...]

This is the real problem. No one has taken on the work. It's a fair amount of very technical work.

[...] and get quite fuzzy when the order of evaluation of parts of the regexp matters, which is what happens if capturing parentheses are involved. Where do you start looking for the lookbehind? Shortest first, longest first, or reverse string match? Greedy or not? Backtrack into capturing results?

Sebastian, as you pointed out, these technical points have all been addressed one way or another in practice. My guess is that TC39 would accept some commonly implemented lookbehind behavior, if someone put in the work. I could be wrong.

# Brendan Eich (11 years ago)

I think this is right: we need help and are open to it from someone who can lift the weight. Waldemar may be able to help, though I'm not sure how much (he wrote the ES3 regexp spec, so at least as a reviewer he is a key part of the solution).

Sebastian, if you have time to help, that would be tremendous.

# Sebastian Zartner (11 years ago)

I can try to help, though my time and technical knowledge about regexp implementations are limited. So wouldn't one of the people who have already implemented a working regexp engine be a better help? Has any of you ever contacted, e.g., the people behind PCRE?

Sebastian

# Brendan Eich (11 years ago)

Sebastian Zartner wrote:

I can try to help, though my time and technical knowledge about regexp implementations are limited. So wouldn't one of the people who have already implemented a working regexp engine be a better help? Has any of you ever contacted, e.g., the people behind PCRE?

Code and spec are different animals, although related. Also, JS knowledge (ECMA-262) matters. I've reached out to Steve Levithan, we'll see.