Oddly accepted RegExps

# Isiah Meadows (9 years ago)

These three RegExps don't appear valid, even after reading Annex B, but they do behave consistently in both Chrome and Firefox. They are listed here with equivalent regexps:

- `/[[]/` -> `/\[\[\]/`
- `/[]]/` -> `/(?!)/` (i.e. nothing)
- `/a{,,/` -> `/a\{,,+/`

Is this a spec bug or an implementation bug in the parsing?

# Jeremy Darling (9 years ago)

The first and the last are definitely valid, and I've used the first as a partial plenty of times.

Basically `[[]` is the same as saying `/\[/`, just a little bit longer: match a single character from the set `[`. `/[[]/.exec('[]')` -> `'['`

There is nothing special about `/a{,,/`; it's just a normal "match these characters in this order". `/a{,,/.exec('a{,,')` -> `'a{,,'`

`/[]]/`: this one throws me. It should require the first `]` to be escaped (`\]`) to be useful. I can see it parse and be accepted, but I have no clue why or what it would do. It should throw an error.
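The two `exec` calls above can be checked in a console. This is a minimal sketch, assuming a non-Unicode regexp (no `u` flag) in an engine that implements Annex B; the variable names are illustrative only.

```javascript
// "[" inside a character class needs no escape, so /[[]/ behaves like /\[/.
const viaClass = /[[]/.exec("[]");
const viaEscape = /\[/.exec("[]");

// "{,," does not form a quantifier, so every character is matched literally.
const literalRun = /a{,,/.exec("a{,,");

console.log(viaClass[0], viaClass.index);   // "[" 0
console.log(viaEscape[0], viaEscape.index); // "[" 0
console.log(literalRun[0]);                 // "a{,,"
```

Note that `exec` returns an array-like match object, so the matched text is its first element.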

# Bergi (9 years ago)

Jeremy Darling wrote:

> /[]]/ This one throws me, that should require the first ] to be escaped (\]) to be useful. I can see it parse and accept but have no clue why or what it would do. It should throw an error.

I can't see it accept anything. As far as I can see, it's equivalent to `/[]\]/`, which contains an empty class that never matches anything, followed by a literal "]".
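A small sketch of that equivalence, assuming non-Unicode mode, where a lone `]` outside a class is read as a literal. The sample strings are arbitrary choices for illustration.

```javascript
// Both spellings are an empty class followed by a literal "]".
const unescaped = /[]]/;
const escaped = /[]\]/;
const samples = ["]", "[]", "[]]", "", "a]"];

// The empty class [] fails on every character, so neither pattern can match.
const results = samples.map((s) => [unescaped.test(s), escaped.test(s)]);
const neverMatches = results.every(([a, b]) => a === false && b === false);

console.log(neverMatches); // true
```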

Kind regards, Bergi

# Boris Zbarsky (9 years ago)

On 6/3/16 4:20 AM, Isiah Meadows wrote:

> These three RegExps don't appear valid, even after reading the Annex B, but they do behave consistently in both Chrome and Firefox.

Note that Chrome and Firefox use the same regexp implementation, so them agreeing on how a regexp is handled means a lot less than if two independent implementations agreed.

# Andy Earnshaw (9 years ago)

IE has supported all of these for as long as I can remember. AFAIK, it's never been a requirement _in browsers_ to escape `[` inside a character class or `]` outside one, e.g. `/[[]/` (`[` is inside) or `/[]]/` (`]` is outside). If that's not the case in the spec (I haven't checked the spec grammar), it should probably be classed as a spec bug for compat reasons.

# Mike Samuel (9 years ago)

Older versions of IE did not support `[^]` as a way of saying "any char", as I discovered when writing minified passes, so I'm surprised to hear that IE has consistently supported `[]`.
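For reference, a sketch of how `[^]` and `[]` behave in engines that follow the current spec. It says nothing about the older IE versions mentioned above, and the variable names are illustrative.

```javascript
// [^] is a negated empty class: it matches any single character, newlines included.
const negatedEmptyMatchesNewline = /[^]/.test("\n");

// "." without the s flag does not match a line terminator.
const dotMatchesNewline = /./.test("\n");

// [] is an empty class: it matches no character at all.
const emptyMatchesLetter = /[]/.test("x");

console.log(negatedEmptyMatchesNewline); // true
console.log(dotMatchesNewline);          // false
console.log(emptyMatchesLetter);         // false
```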

# Michael Saboff (9 years ago)

JavaScriptCore / Safari supports these three RegExps. As with the other implementations, I don’t think the second matches anything.

# Claude Pache (9 years ago)

On 3 June 2016 at 10:20, Isiah Meadows <isiahmeadows at gmail.com> wrote:

> These three RegExps don't appear valid, even after reading the Annex B, but they do behave consistently in both Chrome and Firefox. They are listed here with equivalent regexps:
>
> - `/[[]/` -> `/\[\[\]/`
> - `/[]]/` -> `/(?!)/` (i.e. nothing)
> - `/a{,,/` -> `/a\{,,+/`
>
> Is this a spec bug or an implementation bug in the parsing?

The first pattern conforms to the syntax and semantics given in the main part of the spec. The most relevant rule in the grammar at https://tc39.github.io/ecma262/#sec-regular-expressions-patterns is:

    ClassAtomNoDash ::
        SourceCharacter  but not one of \ or ] or -

In particular, an unescaped `[` is an acceptable atom inside a character class.
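Because this rule is part of the main grammar rather than Annex B, the first pattern should also compile with the `u` flag. A minimal sketch, using the `RegExp` constructor so that a rejected pattern would surface as a runtime `SyntaxError` rather than a parse failure:

```javascript
// "[" is a valid ClassAtomNoDash with or without the u flag.
const plain = new RegExp("[[]");
const unicode = new RegExp("[[]", "u");

const plainMatches = plain.test("[");
const unicodeMatches = unicode.test("[");
const unicodeRejectsClose = unicode.test("]");

console.log(plainMatches, unicodeMatches); // true true
console.log(unicodeRejectsClose);          // false
```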

The last two are well specified by the main part as modified by Annex B. The second pattern starts with an empty class, which is a valid way to match nothing. The most relevant rule in the Annex B grammar at https://tc39.github.io/ecma262/#sec-regular-expressions-patterns is:

    ExtendedPatternCharacter ::
        SourceCharacter  but not one of  ^  $  .  *  +  ?  (  )  [  |

In particular, `]` and `{` may appear unescaped outside a character class (with the restriction that `{` is not at the start of a sequence that resembles a quantifier, a case that is handled by the `InvalidBracedQuantifier` production).
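The Annex B rule and its restriction can be probed with a small helper. This is a sketch, assuming an engine that implements Annex B (web browsers and Node.js do); `compiles` is a hypothetical helper written for this illustration, not part of any API. The Annex B extensions apply only without the `u` flag.

```javascript
// Hypothetical helper: true if `source` compiles with `flags`, false on SyntaxError.
function compiles(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

const outcomes = {
  loneBracket: compiles("]"),            // "]" is an ExtendedPatternCharacter
  looseBrace: compiles("a{,,"),          // "{,," does not resemble a quantifier
  bracedQuantifier: compiles("{1}"),     // resembles a quantifier with nothing to repeat
  loneBracketUnicode: compiles("]", "u"),
  looseBraceUnicode: compiles("a{,,", "u"),
};

console.log(outcomes);
// { loneBracket: true, looseBrace: true, bracedQuantifier: false,
//   loneBracketUnicode: false, looseBraceUnicode: false }
```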

# Isiah Meadows (9 years ago)

Thanks! I wasn't able to glean that from the spec. It is admittedly confusing and not very obvious, but I was just curious.
