Oddly accepted RegExps

# Isiah Meadows (8 years ago)

These three RegExps don't appear valid, even after reading the Annex B, but they do behave consistently in both Chrome and Firefox. They are listed here with equivalent regexps:

  • /[[]/ -> /\[\[\]/
  • /[]]/ -> /(?!)/ (i.e. nothing)
  • /a{,,/ -> /a\{,,+/

Is this a spec bug or an implementation bug in the parsing?

# Jeremy Darling (8 years ago)

The first and the last are definitely valid, and I've used the first as a partial plenty of times.

Basically /[[]/ is the same as saying /\[/, just a little bit longer: it matches a single character from the set containing only [. /[[]/.exec('[]') -> '['

There is nothing special about /a{,,/; it's just a normal match of these characters in this order. /a{,,/.exec('a{,,') -> 'a{,,'
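For anyone who wants to check those two claims in a console, here is a minimal sketch. The variable names are only illustrative, and the second input string is padded with extra characters so the match position is visible.

```javascript
// An unescaped "[" inside a character class is just a member of the class.
const openBracket = /[[]/.exec("[]");
console.log(openBracket[0]); // [

// "{,," does not form a quantifier, so "{" and the commas are matched literally.
const braces = /a{,,/.exec("xa{,,y");
console.log(braces[0], braces.index); // a{,, 1
```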

/[]]/ This one throws me; that should require the first ] to be escaped (\]) to be useful. I can see it parse and accept, but I have no clue why or what it would do. It should throw an error.

# Bergi (8 years ago)

Jeremy Darling wrote:

/[]]/ This one throws me; that should require the first ] to be escaped (\]) to be useful. I can see it parse and accept, but I have no clue why or what it would do. It should throw an error.

I can't see it accept anything. AFAICS, it's equivalent to /[]\]/, which contains an empty class that never matches anything, followed by a literal "]".
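A quick way to convince yourself is the sketch below. The sample inputs are arbitrary choices of mine, not an exhaustive proof; the point is that the empty class fails at every position, so the trailing "]" is never reached.

```javascript
// Empty class followed by a literal "]": the class can never match.
const emptyThenBracket = /[]]/;
const inputs = ["]", "]]", "[]]", "a]", ""];
const anyMatch = inputs.some((s) => emptyThenBracket.test(s));
console.log(anyMatch); // false

// Escaping the first "]" instead gives a class that contains "]".
const escapedBracket = /[\]]/.test("]");
console.log(escapedBracket); // true
```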

Kind regards, Bergi

# Boris Zbarsky (8 years ago)

On 6/3/16 4:20 AM, Isiah Meadows wrote:

These three RegExps don't appear valid, even after reading the Annex B, but they do behave consistently in both Chrome and Firefox.

Note that Chrome and Firefox use the same regexp implementation, so them agreeing on how a regexp is handled means a lot less than if two independent implementations agreed.

# Andy Earnshaw (8 years ago)

IE has supported all of these for as long as I can remember. AFAIK, it's never been a requirement in browsers to escape [ inside a character class or ] outside e.g. /[[]/ ([ is inside) or /[]]/ (] is outside). If it's not the case in the spec (I haven't checked the spec grammar), it should probably be classed as a spec bug for compat reasons.

# Mike Samuel (8 years ago)

Older versions of IE did not support [^] as a way of saying any char as I discovered when writing minified passes so I'm surprised to hear that IE has consistently supported [].

# Michael Saboff (8 years ago)

JavaScriptCore / Safari supports these three RegExps. As with the other implementations, I don’t think the second matches anything.

# Claude Pache (8 years ago)

On 3 June 2016 at 10:20, Isiah Meadows <isiahmeadows at gmail.com> wrote:

These three RegExps don't appear valid, even after reading the Annex B, but they do behave consistently in both Chrome and Firefox. They are listed here with equivalent regexps:

  • /[[]/ -> /\[\[\]/
  • /[]]/ -> /(?!)/ (i.e. nothing)
  • /a{,,/ -> /a\{,,+/

Is this a spec bug or an implementation bug in the parsing?

The first pattern conforms to the syntax and semantics given in the main part of the spec. The most relevant rule in the grammar of tc39.github.io/ecma262/#sec-regular-expressions-patterns is:

ClassAtomNoDash ::
    SourceCharacter  but not one of \ or ] or -

In particular an unescaped [ is an acceptable atom inside a character class.

The last two are well specified by the main part of the spec as modified by Annex B. The second pattern starts with an empty class, which is a valid way to match nothing. The most relevant rule in the Annex B grammar at tc39.github.io/ecma262/#sec-regular-expressions-patterns is:

ExtendedPatternCharacter::
    SourceCharacter  but not one of  ^  $  .  *  +  ?  (  )  [  |

In particular, ] and { may appear unescaped outside a character class, with the restriction that { must not start a sequence that resembles a quantifier; that case is handled by the InvalidBracedQuantifier production.
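One way to see the split between the main grammar and Annex B is to compile the same sources with and without the u flag, since the Annex B extensions do not apply to Unicode-mode patterns. This is only a sketch: compiles is a hypothetical helper, and "{1}" is an extra pattern I added to exercise the InvalidBracedQuantifier case.

```javascript
// Hypothetical helper: does this pattern source compile with the given flags?
function compiles(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything other than a syntax error is unexpected
  }
}

const sources = ["[[]", "[]]", "a{,,", "{1}"];
const results = sources.map((source) => ({
  source,
  legacy: compiles(source),       // main grammar plus Annex B extensions
  unicode: compiles(source, "u"), // main grammar only
}));
console.table(results);
```

In an engine that follows the spec, I would expect only "[[]" to compile in both modes, "[]]" and "a{,," to compile only without the u flag, and "{1}" to be rejected either way.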

# Isiah Meadows (8 years ago)

Thanks! I wasn't able to glean that from the spec. It is admittedly confusing and not very obvious, but I was just curious.