RegExp `x` flag
I would personally love this (as well as interpolations in regexp literals). I do have a concern about whether removing the newline restriction creates ambiguities with division, but I suspect this is not the case.
Isiah Meadows contact at isiahmeadows.com, www.isiahmeadows.com
Let me clarify that previous message: I mean "newline restriction" in
the sense that newlines are not permitted in regexp literals. A /x
flag would make removing it practically required for it to have any
utility.
Isiah Meadows contact at isiahmeadows.com, www.isiahmeadows.com
Even if this flag were restricted to constructors instead of both constructors and literals, it could be worthwhile.
- is this minifier-friendly?
- is parsing-impact minimal enough to not affect load-times? regexp-detection/bounding is among the most expensive/complex part of javascript-parsing.
those 2 nits aside, i'm on the fence. regexp-spaghetti is a valid painpoint, and jslint's author has expressed desire for multiline regexp [1]. otoh, there is a good-enough solution by falling-back to constructor-form to improve readability:
// real-world spaghetti-regexp code in jslint.js
const rx_token =
/^((\s+)|([a-zA-Z_$][a-zA-Z0-9_$]*)|[(){}\[\],:;'"~`]|\?\.?|=(?:==?|>)?|\.+|[*\/][*\/=]?|\+[=+]?|-[=\-]?|[\^%]=?|&[&=]?|\|[|=]?|>{1,3}=?|<<?=?|!(?:!|==?)?|(0|[1-9][0-9]*))(.*)$/;
// vs
/*
* break JSON.stringify(rx_token.source)
* into multiline constructor-form for readability
*/
const rx_token = new RegExp(
"^("
+ "(\\s+)"
+ "|([a-zA-Z_$][a-zA-Z0-9_$]*)"
+ "|[(){}\\[\\],:;'\"~`]"
+ "|\\?\\.?"
+ "|=(?:==?|>)?"
+ "|\\.+"
+ "|[*\\/][*\\/=]?"
+ "|\\+[=+]?"
+ "|-[=\\-]?"
+ "|[\\^%]=?"
+ "|&[&=]?"
+ "|\\|[|=]?"
+ "|>{1,3}=?"
+ "|<<?=?"
+ "|!(?:!|==?)?"
+ "|(0|[1-9][0-9]*)"
+ ")(.*)$"
);
[1] github jslint-issue #231 - ignore long regexp's (and comments) douglascrockford/JSLint#231
With regard to a minifier, I see no reason it couldn't compact it to the regex we have today. After all, the only changes are the addition of whitespace and comments.
I can't speak as to parse time, though in production that would largely be removed by the aforementioned minification.
String concatenation certainly works, but then any escapes have to be
doubly so, else you use String.raw on template literals in every
situation, quickly cluttering things back up.
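To make the escaping trade-off concrete, here is a minimal sketch comparing the three spellings of one small pattern. The pattern itself is only an illustration and does not come from the thread; the point is that the plain-string form needs every backslash doubled, while the String.raw form avoids that at the cost of a tag on every piece.

```javascript
// The same pattern written three ways (illustrative pattern, not from the thread).
const fromLiteral = /(\d+)\.(\d+)/;

// Plain strings: every backslash has to be escaped a second time.
const fromString = new RegExp("(\\d+)" + "\\." + "(\\d+)");

// String.raw: no double escaping, but each piece needs the tag.
const fromRaw = new RegExp(
    String.raw`(\d+)`
    + String.raw`\.`
    + String.raw`(\d+)`
);

console.log(fromString.source === fromLiteral.source); // true
console.log(fromRaw.source === fromLiteral.source); // true
```

Note that String.raw still treats a backtick or `${` specially, so patterns containing those need extra care.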
- Very. Just strip the extra whitespace and replace it with the
  non-/x version.
- Whitespace is negligible in parsing performance, and regexps have a
  fairly simple grammar to begin with. (It can be done with a single
  character of lookahead easily and the only thing that can nest more
  than a single level is parentheses.) 90% of the actual time spent on
  them is on compilation and /x would have zero effect on that.
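The "strip the extra whitespace" step can be sketched roughly as follows. This is a hypothetical helper, not part of any proposal, minifier, or library, and it assumes one particular set of free-spacing rules: unescaped whitespace outside character classes is dropped, and `#` outside a character class starts a comment that runs to the end of the line. A real /x flag could define these rules differently.

```javascript
// Hypothetical helper (an assumption for illustration, not an existing API).
// Converts a free-spacing pattern string into its compact equivalent.
function stripFreeSpacing(pattern) {
    let out = "";
    let inClass = false;
    for (let i = 0; i < pattern.length; i++) {
        const ch = pattern[i];
        if (ch === "\\") {
            // Keep an escape and the character it escapes untouched.
            out += ch + (pattern[i + 1] ?? "");
            i++;
        } else if (inClass) {
            // Inside [...] everything is literal, including spaces and "#".
            if (ch === "]") inClass = false;
            out += ch;
        } else if (ch === "[") {
            inClass = true;
            out += ch;
        } else if (ch === "#") {
            // Skip the comment up to, but not including, the next newline.
            while (i + 1 < pattern.length && pattern[i + 1] !== "\n") i++;
        } else if (!/\s/.test(ch)) {
            out += ch;
        }
    }
    return out;
}

const verbose = String.raw`
    ^
    (\d{4})  # year
    -
    (\d{2})  # month
    $
`;
const compact = stripFreeSpacing(verbose);
const rx = new RegExp(compact);

console.log(compact); // ^(\d{4})-(\d{2})$
console.log(rx.test("2019-06")); // true
```

A single pass with one boolean of state is enough here, which is consistent with the claim that the extra whitespace adds little parsing cost.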
The issue of detection is actually pretty trivial: a / is assumed to
be division any time you can continue an expression, and regexps are
only consumed when no binary operator could potentially be expected.
It's a rather obscure edge case often left out of ASI posts, one I've
yet to even hear about being used, although I could contemplate it
being used in code bases which use cond && foo() instead of
if (cond) foo() and cond || foo() instead of if (!cond) foo().
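A small sketch of that detection rule under current JavaScript semantics. The variable names are placeholders chosen so the ambiguous line is valid as a division; they are not from the thread.

```javascript
// Placeholder values so that "foo" and "g" are valid operands.
const foo = 4;
const g = 2;

// No semicolon after 16, and the next line starts with "/". The
// expression can be continued, so this parses as 16 / foo / g and
// no regexp literal is created.
let quotient = 16
/foo/g

// Here no expression is in progress, so "/" starts a regexp literal.
const pattern = /foo/g;

console.log(quotient); // 2
console.log(pattern instanceof RegExp); // true
```

The same rule is why a statement that begins with a regexp literal is risky in semicolon-free code: whether the slash is a division depends only on what precedes it.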
new RegExp(multilineString) is a valid fallback, something I
already use today quite a bit, but I'd prefer to use one or the other
consistently for static regexps.
Isiah Meadows contact at isiahmeadows.com, www.isiahmeadows.com
Has there been any previous discussion of adding the x flag to JS? It exists in other languages, and can make relatively complicated regexes much easier to read. It also allows for comments, which are incredibly helpful when trying to understand some regexes.
For prior art, XRegExp has this flag (though I've no idea how to figure out how frequently it's used), as do a few other languages.
Quick overview: www.regular-expressions.info/freespacing.html
Language references:
- Python: docs.python.org/3/library/re.html#re.X
- Rust: docs.rs/regex/1.1.6/regex
- XRegExp: xregexp.com/xregexp/flags/#extended
- .NET: docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference#regular-expression-options
Jacob Pratt