Nozomu Katō (2015-10-09T13:00:55.000Z)
d at domenic.me (2015-10-12T20:38:36.925Z)
Me too; I once implemented lookbehind assertions this way in SRELL, my C++ template library whose engine is compatible with RegExp of ECMAScript but whose class design is compatible with std::regex of C++ [1]. Later, however, I removed the code for such lookbehinds and adopted Perl 5 style lookbehinds instead. The core reasons are:

1. Right-to-left matchers are used only in lookbehind assertions;
2. Nevertheless, they cannot share code with the normal (left-to-right) matchers and need their own optimization processes.

Thus, I came to feel that what I gain and what I have to do to get it are out of balance.

As I understand it, the features that are available in .NET style lookbehinds but cannot even be emulated in Perl 5 style lookbehinds are 1) the use of backreferences and 2) the use of quantifiers other than {n}. The others can be emulated in some way. For example, the positive multiple-length lookbehind (?<=ab|cde) can be replaced by (?:(?<=ab)|(?<=cde)). Replacing a negative multiple-length lookbehind is even simpler: just write the assertions in succession; for example, (?<!ab|cde) can be written as (?<!ab)(?<!cde). I guess that Oniguruma supports expressions like (?<=ab|cde) by doing such substitutions inside the library, but that is only my guess.

So, I came to feel that Perl 5 style lookbehinds strike a reasonable balance, though they may not be the best. In fact, the current implementation of lookbehinds in my library is far simpler; it shares code with lookaheads. If the rewind count is 0, the assertion is a lookahead; otherwise (1 or more), it is a lookbehind.

If we were to introduce .NET style lookbehinds into RegExp of ECMAScript, someone would need to write right-to-left versions of most of the definitions under 21.2 of the specification.

Nozomu

[1] http://www.akenotsuki.com/misc/srell/en/
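
To make the two substitutions and the rewind-count idea concrete, here is a small sketch in Python, whose `re` module accepts only fixed-width lookbehinds. It is an illustration only, not SRELL's code: the sample string, the trailing `X`, and the helpers `positions` and `assertion_holds` are all assumptions made up for this example.

```python
import re

# Positive case: (?<=ab|cde) rewritten as an alternation of fixed-length lookbehinds.
POSITIVE = re.compile(r"(?:(?<=ab)|(?<=cde))X")

# Negative case: (?<!ab|cde) rewritten as consecutive fixed-length lookbehinds.
NEGATIVE = re.compile(r"(?<!ab)(?<!cde)X")


def positions(pattern, text):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in pattern.finditer(text)]


def assertion_holds(text, pos, body, rewind, negative=False):
    """Hypothetical helper: evaluate an assertion at pos by stepping back
    `rewind` characters and matching `body` left to right from there.
    rewind == 0 acts as a lookahead; rewind >= 1 acts as a fixed-length
    lookbehind whose body must end exactly at pos."""
    start = pos - rewind
    if start < 0:
        matched = False  # not enough text to the left to rewind
    else:
        m = re.compile(body).match(text, start)
        matched = m is not None and (rewind == 0 or m.end() == pos)
    return matched != negative


SAMPLE = "abX cdeX zzX deX X"

if __name__ == "__main__":
    print(positions(POSITIVE, SAMPLE))
    print(positions(NEGATIVE, SAMPLE))
    print(assertion_holds("abX", 2, "ab", rewind=2))  # lookbehind for "ab"
    print(assertion_holds("abX", 2, "X", rewind=0))   # lookahead for "X"
```

In the sample, the positive pattern finds only the `X` that directly follows `ab` or `cde`, and the negative pattern finds the remaining ones. The helper shows why a single left-to-right matcher is enough once each lookbehind has a known length: the only difference between the two kinds of assertion is how far the start position is moved back before matching.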