Specify exactly how RegExp.source should be escaped

# Simon Pieters (13 years ago)

The spec says about RegExp.source:

[[ The characters / or backslash \ occurring in the pattern shall be escaped in S as necessary to ensure that the String value formed by concatenating the Strings "/", S, "/", and F can be parsed (in an appropriate lexical context) as a RegularExpressionLiteral that behaves identically to the constructed regular expression. For example, if P is "/", then S could be "\/" or "\u002F", among other possibilities, but not "/", because /// followed by F would be parsed as a SingleLineComment rather than a RegularExpressionLiteral. If P is the empty String, this specification can be met by letting S be "(?:)".

...

The source property of the newly constructed object is set to S. ]] es5.github.com/#x15.10.4.1

Why is the requirement so vague? I would like the spec to state exactly how source is to be escaped, maybe with an algorithm like:

  1. If S is the empty string, let S be "(?:)".
  2. Replace all instances of "/" in S with "\/".
  3. Replace all instances of literal new lines in S with ???
  4. ???

Currently, I have no idea what to check for when writing test cases for the .source property when testing e.g. empty string or a slash as P.
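
One way to make the quoted requirement concrete is to check whether a candidate S, wrapped in slashes, actually evaluates to a regular expression. The following is only a sketch: `parsesAsRegExpLiteral` is a hypothetical helper name, and evaluating the text through the `Function` constructor is an assumption of this example, not something the spec prescribes.

```js
// Hypothetical helper (not from the spec): does "/" + S + "/" + F,
// evaluated as source text, produce a RegExp object?
function parsesAsRegExpLiteral(S, F = "") {
  try {
    // "return ///" is a return statement followed by a comment, so it
    // yields undefined rather than a RegExp.
    const value = new Function("return /" + S + "/" + F)();
    return value instanceof RegExp;
  } catch (e) {
    // Text that does not parse at all (e.g. a raw newline in S) ends up here.
    return false;
  }
}

console.log(parsesAsRegExpLiteral("/"));       // unescaped slash
console.log(parsesAsRegExpLiteral("\\/"));     // S is \/
console.log(parsesAsRegExpLiteral("\\u002F")); // S is \u002F
console.log(parsesAsRegExpLiteral(""));        // empty S
console.log(parsesAsRegExpLiteral("(?:)"));    // suggested replacement for empty P
```

This only checks the "can be parsed as a RegularExpressionLiteral" half of the requirement; it says nothing about whether the result behaves identically to the original regular expression.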

# Allen Wirfs-Brock (13 years ago)

On Mar 19, 2013, at 8:05 AM, Simon Pieters wrote:

Hi

The spec says about RegExp.source:

[[ The characters / or backslash \ occurring in the pattern shall be escaped in S as necessary to ensure that the String value formed by concatenating the Strings "/", S, "/", and F can be parsed (in an appropriate lexical context) as a RegularExpressionLiteral that behaves identically to the constructed regular expression. For example, if P is "/", then S could be "\/" or "\u002F", among other possibilities, but not "/", because /// followed by F would be parsed as a SingleLineComment rather than a RegularExpressionLiteral. If P is the empty String, this specification can be met by letting S be "(?:)".

...

The source property of the newly constructed object is set to S. ]] es5.github.com/#x15.10.4.1

Why is the requirement so vague? I would like the spec to state exactly how source is to be escaped, maybe with an algorithm like:

Prior to ES5, the escaping wasn't specified at all; the spec simply said that the pattern was implementation-defined. We could probably specify it, but somebody would need to develop a proposal that completely specifies the required escaping.

  1. If S is the empty string, let S be "(?:)".
  2. Replace all instances of "/" in S with "\/".
  3. Replace all instances of literal new lines in S with ???
  4. ???

Currently, I have no idea what to check for when writing test cases for the .source property when testing e.g. empty string or a slash as P.

The key requirement in the specification is that a RegExp created from the .source property string behaves identically to the RegExp created from the original pattern. I would test this by developing a set of interesting patterns that require escaping and then testing that a RegExp created from their .source properties produces identical results to the original patterns.
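
A minimal sketch of such a round-trip test might look like the following. The helper name `sourceRoundTrips`, the pattern list, and the sample inputs are all illustrative assumptions rather than anything taken from the spec or an existing test suite, and the comparison assumes flags that do not carry state between calls (no `g` or `y`).

```js
// Hypothetical helper: rebuild a RegExp from .source in two ways and
// compare its behaviour with the original on a few sample inputs.
function sourceRoundTrips(P, F = "", inputs = ["", "/", "a/b", "\\", "\n", "x"]) {
  const original = new RegExp(P, F);
  const S = original.source;

  // Route 1: pass .source back to the constructor.
  const fromConstructor = new RegExp(S, F);

  // Route 2: form the literal text "/" + S + "/" + F and evaluate it.
  let fromLiteral;
  try {
    fromLiteral = new Function("return /" + S + "/" + F)();
  } catch (e) {
    return false; // the literal text did not parse
  }
  if (!(fromLiteral instanceof RegExp)) return false; // e.g. parsed as a comment

  // "Behaves identically" is approximated by comparing exec() results.
  return inputs.every((input) => {
    const expected = JSON.stringify(original.exec(input));
    return [fromConstructor, fromLiteral].every(
      (re) => JSON.stringify(re.exec(input)) === expected
    );
  });
}

// Patterns that plausibly need escaping: empty, slashes, backslashes, newline.
const patterns = ["", "/", "a/b", "\\/", "[/]", "\n", "\\\\"];
const failures = patterns.filter((P) => !sourceRoundTrips(P));
console.log(failures.length === 0 ? "all patterns round-trip" : failures);
```

A test written this way checks the behavioural requirement without depending on the exact escaping an implementation chooses, which is the only thing the current wording allows a test to rely on.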