Default non-capturing regex flag [WAS: how to create strawman proposals?]
On Jun 2, 2011, at 10:46 PM, Kyle Simpson wrote:
I propose a /n flag for regular expressions, which would swap the default capturing/non-capturing behavior between ( ) and (?: ) operators (that is, ( ) would not capture, and (?: ) would capture).
I like it. No worries about the .NET somewhat different flag.
The /n property would reflect on the RegExp object as Noncapturing == true.
Lowercase noncapturing, right?
On Jun 2, 2011, at 10:49 PM, Brendan Eich wrote:
On Jun 2, 2011, at 10:46 PM, Kyle Simpson wrote:
I propose a /n flag for regular expressions, which would swap the default capturing/non-capturing behavior between ( ) and (?: ) operators (that is, ( ) would not capture, and (?: ) would capture).
I like it. No worries about the .NET somewhat different flag.
There's no backward compatibility fear, because unknown flags (from the future, so to speak) cause errors:
Error: invalid regular expression flag n
Source File: javascript:alert(/hi/n)
Line: 1, Column: 10
Source Code: alert(/hi/n)
Error: invalid regular expression flag n
Source File: javascript:alert(new%20RegExp("hi",%20"n"))
Line: 1
Tested in Firefox 4, copied from Error console.
The /n property would reflect on the RegExp object as Noncapturing == true.
Lowercase noncapturing, right?
Yeah.
On Fri, Jun 3, 2011 at 1:51 AM, Brendan Eich <brendan at mozilla.com> wrote:
On Jun 2, 2011, at 10:49 PM, Brendan Eich wrote:
On Jun 2, 2011, at 10:46 PM, Kyle Simpson wrote:
I propose a /n flag for regular expressions, which would swap the default capturing/non-capturing behavior between ( ) and (?: ) operators (that is, ( ) would not capture, and (?: ) would capture).
I like it. No worries about the .NET somewhat different flag.
There's no backward compatibility fear, because unknown flags (from the future, so to speak) cause errors:
Error: invalid regular expression flag n
Source File: javascript:alert(/hi/n)
Line: 1, Column: 10
Source Code: alert(/hi/n)
Error: invalid regular expression flag n
Source File: javascript:alert(new%20RegExp("hi",%20"n"))
Line: 1
Tested in Firefox 4, copied from Error console.
Chrome (13) and Safari (5) tolerate "n". No error.
Opera 11.10 throws for "n" but not so for all the flags. It's silent for "x" (but it supports "x" so that's understandable) and "y" (support for which was dropped earlier, but silent compilation seems to remain).
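Since engines differ on whether an unknown flag is rejected, a script can probe the behavior at runtime instead of relying on a browser list. The following is a minimal sketch; the helper name supportsRegExpFlag is hypothetical, and it assumes that an engine which rejects a flag does so by throwing a SyntaxError from the RegExp constructor, as the Firefox and Opera results above suggest.

```javascript
// Hypothetical helper: reports whether this engine accepts a given flag
// string. Assumes rejection surfaces as a SyntaxError from the constructor.
function supportsRegExpFlag(flag) {
  try {
    new RegExp("", flag);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e; // anything else is unexpected, so do not swallow it
  }
}

console.log("g accepted:", supportsRegExpFlag("g"));
console.log("n accepted:", supportsRegExpFlag("n"));
```

An engine that silently tolerates "n" would report true here without implementing anything, so a probe like this only detects rejection, not real support.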
On Jun 3, 2011, at 10:49 AM, Juriy Zaytsev wrote:
Chrome (13) and Safari (5) tolerate "n". No error.
Bugs filed?
On Sun, Jun 5, 2011 at 5:30 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Jun 3, 2011, at 10:49 AM, Juriy Zaytsev wrote:
Chrome (13) and Safari (5) tolerate "n". No error.
Bugs filed?
WebKit bug — bugs.webkit.org/show_bug.cgi?id=41614
Seems to be stalled. cc'ing Oliver.
2011/6/3 Kyle Simpson <getify at gmail.com>:
I propose a /n flag for regular expressions, which would swap the default capturing/non-capturing behavior between ( ) and (?: ) operators (that is, ( ) would not capture, and (?: ) would capture).
The /n property would reflect on the RegExp object as Noncapturing == true.
Can RegExp flag experimentation be done in library code?
function ExtRegExp(regexp, flags) {
  if ("string" !== typeof regexp) {
    // Convert parse tree form back to string
  }
  return RegExp(regexp, flags);
}
ExtRegExp.parse = function (regexp) {
  // Converts "^(?:foo|bar)$" to ["", ["^"], ["(?:)", ["|", "foo", "bar"]], ["$"]]
};
ExtRegExp.n = function (regexp) {
  // Converts "^(?:foo|bar)$" to ["", ["^"], ["(?:)", ["|", "foo", "bar"]], ["$"]]
  if ("string" === typeof regexp) {
    regexp = ExtRegExp.parse(regexp);
  }
  // Walk parse tree swapping "(?:)" nodes to capturing groups and vice-versa.
  ...
  return regexp;
};
ExtRegExp.x = function (regexp) {
  if ("string" === typeof regexp) {
    regexp = ExtRegExp.parse(regexp);
  }
  // Walk parse tree eliminating whitespace.
  ...
  return regexp;
};
// Use of n and x flags.
var myRegexp = new ExtRegExp(
  ExtRegExp.n(
    ExtRegExp.x(
      "regexp-source-here")),
  "i");
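To make the swap step concrete, here is a minimal runnable sketch of what ExtRegExp.n could do. It is an assumption-laden simplification: it works directly on the source string rather than on the parse tree outlined above, and the name swapGroupDefaults is hypothetical. It skips escapes and character classes, and leaves other (?...) constructs such as lookaheads untouched.

```javascript
// Hypothetical helper: swaps ( ) and (?: ) in a regex source string.
// String-based stand-in for the parse-tree walk; not a full regex parser.
function swapGroupDefaults(source) {
  var out = "";
  var inClass = false; // true while inside a [...] character class
  for (var i = 0; i < source.length; i++) {
    var ch = source[i];
    if (ch === "\\") {
      // Copy an escape and the character it escapes unchanged.
      out += ch + (source[i + 1] || "");
      i++;
    } else if (inClass) {
      if (ch === "]") inClass = false;
      out += ch;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (ch === "(") {
      if (source.startsWith("(?:", i)) {
        out += "(";   // (?: ) becomes capturing
        i += 2;       // skip the "?:"
      } else if (source[i + 1] === "?") {
        out += ch;    // lookaheads and similar are left alone
      } else {
        out += "(?:"; // plain ( ) becomes non-capturing
      }
    } else {
      out += ch;
    }
  }
  return out;
}

var re = new RegExp(swapGroupDefaults("(\\d+)-(?:\\d+)"));
var match = re.exec("12-34");
console.log(match[1]); // the second number, since only (?: ) captures now
```

Note the doubled backslashes in the usage line, which is the escaping burden raised in the reply that follows.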
Escapes are a pain, due to the double-backslash burden.
We really want quasis for this kind of extensibility. Quasis solve the multiline problem too (I hope... :-).
2011/6/6 Brendan Eich <brendan at mozilla.com>:
Escapes are a pain, due to the double-backslash burden.
Yep. To fit into this kind of library, you could use a quasi syntax like
regexp`...`.ignoreSpaces().nonCapturingByDefault().build()
Not as pithy as flags, but extensible via
regexp.prototype.nonCapturingByDefault = function (parseTree) { ... return parseTree; };
and you can make the build() bit optional in a lot of cases by overriding toString, since String.prototype.replace and friends do implicit value -> string -> RegExp unless value is a RegExp.
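The builder idea can be sketched with tagged template literals, the form in which quasis later shipped in ES2015. Everything below is an illustrative assumption rather than an existing API: the RegExpBuilder class is hypothetical, the regexp tag, ignoreSpaces, and build simply reuse the names from the message above, and the transforms operate on the source string instead of a parse tree. String.raw keeps backslashes as written, so no double escaping is needed.

```javascript
// Hypothetical builder; names mirror the message above but are assumptions.
class RegExpBuilder {
  constructor(source, flags = "") {
    this.source = source;
    this.flags = flags;
  }
  // Rough stand-in for an /x flag: drops whitespace but keeps escaped
  // characters. It does not special-case whitespace inside [...] classes.
  ignoreSpaces() {
    const stripped = this.source.replace(
      /\\[\s\S]|\s+/g,
      (m) => (m[0] === "\\" ? m : "")
    );
    return new RegExpBuilder(stripped, this.flags);
  }
  build() {
    return new RegExp(this.source, this.flags);
  }
  // Lets string methods that coerce their argument see the pattern source.
  toString() {
    return this.source;
  }
}

// Tag function: String.raw preserves backslashes exactly as typed.
function regexp(strings, ...values) {
  return new RegExpBuilder(String.raw(strings, ...values));
}

const date = regexp`(\d{4}) - (\d{2})`.ignoreSpaces();
console.log(date.build().exec("2011-06")[1]);
console.log("on 2011-06".match(date)[2]); // no build() needed here
```

In the engines I am aware of, String.prototype.match and search perform the implicit string-to-RegExp conversion shown in the last line, while replace treats a non-RegExp argument as literal text, so build() would still be needed there.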
We really want quasis for this kind of extensibility. Quasis solve the multiline problem too (I hope... :-).
Off-topic, quasis do solve the multiline problem as written.
Actually this is fixed in ToT WebKit, have closed the stale bug.
I'm seeing this for the first time now. Sorry for reviving old news.
On 2011-06-03, Brendan Eich wrote:
Kyle Simpson wrote:
I propose a /n flag for regular expressions, which would swap the default capturing/non-capturing behavior between ( ) and (?: ) operators (that is, ( ) would not capture, and (?: ) would capture).
I like it. [...] As with all things RegExp, I wonder what Steve thinks.
I appreciate the vote of confidence! I consider /n to be a medium-strength nice-to-have. In fact, I added it myself in XRegExp v2.0.0-beta. [1]
Concerns:
Kyle called this the noncapturing flag and suggested RegExp.prototype.noncapturing. .NET calls the (?n) flag ExplicitCapture and does not let (?: ) capture. The reason for this is suggested by the name "explicit capture"--with /n, only explicitly named capturing groups of the form (?<name> ) capture a value. IMHO, this is the better way to go, but of course it's dependent on supporting named capture in the first place (as XRegExp does).
Also IMHO, it is better to shelve /n until named capture is added.
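For comparison with the swap semantics, the explicit-capture behavior described above can be sketched as a source rewrite in which unnamed ( ) groups become non-capturing and everything starting with (? is left alone. This is only an illustration: the name explicitCaptureOnly is hypothetical, the rewrite is string-based rather than a real parser, and the usage lines assume an engine with native named groups (ES2018 or later), which did not exist when this thread was written.

```javascript
// Hypothetical helper: makes unnamed ( ) groups non-capturing so that only
// named groups of the form (?<name> ) capture. Not a full regex parser.
function explicitCaptureOnly(source) {
  var out = "";
  var inClass = false; // true while inside a [...] character class
  for (var i = 0; i < source.length; i++) {
    var ch = source[i];
    if (ch === "\\") {
      // Copy an escape and the character it escapes unchanged.
      out += ch + (source[i + 1] || "");
      i++;
    } else if (inClass) {
      if (ch === "]") inClass = false;
      out += ch;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (ch === "(" && source[i + 1] !== "?") {
      out += "(?:"; // unnamed group stops capturing
    } else {
      out += ch;    // (?: ), (?<name> ), lookaheads, and literals unchanged
    }
  }
  return out;
}

// Assumes native named-group support in the engine running this.
var re = new RegExp(explicitCaptureOnly("(?<year>\\d{4})-(\\d{2})"));
var m = re.exec("2011-06");
console.log(m.groups.year, m.length);
```

Unlike the swap proposal, (?: ) keeps its usual meaning here, so existing patterns that already use it behave the same with or without the rewrite.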
Mike Samuel wrote:
Can RegExp flag experimentation be done in library code?
See [1].
--Steven Levithan
Footnote: Oniguruma's /g makes (...) noncapturing. It does not make (?:...) capture. When named capture is used, (...) is automatically made noncapturing and numbered backreferences are disallowed, unless /G is used to turn (...) back into capturing groups.
-- Steven Levithan