Should a capturing group with a quantifier inside a look-behind assertion capture the leftmost substring matched by that group, or the rightmost one?
On 9 Jan 2016, at 09:28, ziyunfei <446240525 at qq.com> wrote:
$ d8 --harmony-regexp-lookbehind -e '"123".match(/(?<=(.){3})/);print(RegExp.$1)'
1
$ perl -e '"123" =~ /(?<=(.){3})/;print $1'
3
Currently, V8's implementation stores the leftmost substring in $1 (and \1), which surprised me a bit.
This is a consequence of lookbehind being implemented in V8 by traversing the string in reverse order (unlike Perl), combined with the general rule that a quantified group keeps the last substring it matched. I don't think it is worth complicating the algorithm in order to "correct" that behaviour because, to me, it is intrinsically ambiguous which match $1 should refer to.
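The effect of the matching direction can be checked with a short sketch. It assumes an engine that implements lookbehind as later standardised (ES2018), and the variable names are only illustrative. It reads the capture from the match result rather than from RegExp.$1, and puts a lookahead next to the lookbehind for comparison.

```javascript
// Lookbehind: the group (.) is iterated right to left, so the last
// iteration of the quantifier lands on the leftmost character.
const behind = "123".match(/(?<=(.){3})/);
const behindCapture = behind[1];

// Lookahead: the same group is iterated left to right, so the last
// iteration lands on the rightmost character.
const ahead = "123".match(/(?=(.){3})/);
const aheadCapture = ahead[1];

console.log(behindCapture, aheadCapture);
```

In both cases the group keeps the substring from its final iteration; only the direction of iteration differs.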
Please note that RegExp.$1 is not part of the spec. The implementation in V8 is done in a way that mirrors .NET as much as possible. Ignoring the .Captures property, which has no equivalent in JavaScript, capturing the leftmost sub-match inside a lookbehind is what .NET does.
Yang
FWIW, RegExp.$1 and others are a de facto standard, and removing them would break the Web and much more.
I'm not sure how these would affect a lookbehind proposal, but I think these cannot be excluded from the list of possible gotchas.
Best
I'm not even sure why RegExp.$1 is mentioned here. The submatches can be observed just fine as part of the match result. And I don't think it's a "gotcha" if it's reflected in the spec. And it is in the current draft afaict.
Yang
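As a small illustration of the point that the submatch is observable without RegExp.$1, the sketch below reads the same capture both ways. It assumes an engine that supports lookbehind and still exposes the legacy static RegExp.$1 property (V8 does); the variable names are hypothetical.

```javascript
// Run the regular expression once and keep the match result.
const m = /(?<=(.){3})/.exec("123");

// The capture as reported by the match result array.
const fromResult = m[1];

// The same capture as reported by the legacy static property, which
// reflects the most recent successful match.
const fromLegacy = RegExp.$1;

console.log(fromResult === fromLegacy);
```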
Yeah, sorry for the noise, but this part confused me too: "Please note that RegExp.$1 is not part of the spec".
All good then.
On Fri, 15 Jan 2016 16:49:13 +0100, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
FWIW, RegExp.$1 and others are a de facto standard, and removing them would break the Web and much more.
Indeed. These are currently "specified" at javascript.spec.whatwg.org/#regexp.$n
I think it would be good to have these defined in the ES spec proper.
On 19 Jan 2016, at 08:55, Simon Pieters <simonp at opera.com> wrote:
On Fri, 15 Jan 2016 16:49:13 +0100, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
FWIW, RegExp.$1 and others are a de facto standard, and removing them would break the Web and much more.
Indeed. These are currently "specified" at javascript.spec.whatwg.org/#regexp.$n
I think it would be good to have these defined in the ES spec proper.
See: tc39/ecma262#137
But that wasn't the object of this thread.
$ d8 --harmony-regexp-lookbehind -e '"123".match(/(?<=(.){3})/);print(RegExp.$1)'
1
$ perl -e '"123" =~ /(?<=(.){3})/;print $1'
3
Currently, V8's implementation stores the leftmost substring in $1 (and \1), which surprised me a bit.
Also note that in .NET you can get all the substrings captured by that capturing group using the .Captures property (msdn.microsoft.com/en-us/library/system.text.regularexpressions.group.captures(v=vs.110).aspx); in this case it would be [3, 2, 1] (the order is from right to left).
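JavaScript has no counterpart to .Captures, but for a fixed repetition count the same list can be approximated by unrolling the quantifier into separate groups. This is only a sketch of a workaround, not something proposed in the thread: it assumes an engine with lookbehind support, and the names are illustrative.

```javascript
// Unroll (.){3} into three groups so that every iteration keeps its own capture.
const unrolled = "123".match(/(?<=(.)(.)(.))/);

// Groups are numbered by the position of their opening parenthesis, left to
// right, while a lookbehind matches them right to left. Reversing the group
// list therefore gives the captures in the order they were matched.
const captures = unrolled.slice(1).reverse();

console.log(captures);
```

The approach does not generalise to unbounded quantifiers such as `*` or `+`, since the number of groups has to be known when the pattern is written.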