d at domenic.me (2015-06-16T16:55:30.381Z)
On Sat, Jun 13, 2015 at 9:07 PM, Mark S. Miller <erights at google.com> wrote:
> On Sat, Jun 13, 2015 at 9:17 AM, Domenic Denicola <d at domenic.me> wrote:
>
>> All of these should be building on top of RegExp.escape :P
>>
>
> It's funny how, by considering it as leading to a proposal, I quickly saw
> deep flaws that I was previously missing.
>
>
That was a big part of making a proposal out of it - to find these things :)
> the overall result does not do this. For example:
>
> const data = ':x';
> const rebad = RegExp.tag`(?${data})`;
> console.log(rebad.test('x')); // true
>
> is nonsense. Since the RegExp grammar can be extended per platform, the
> same argument that says we should have the platform provide RegExp.escape
> says we should have the platform provide RegExp.tag -- so that they can
> consistently reflect these platform extensions.
>
>
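
To make the failure above concrete, here is a minimal sketch. `escapeSyntaxChars` and `naiveTag` are hypothetical stand-ins (not the proposed `RegExp.escape` or `RegExp.tag`), and the sketch assumes the escape covers only the `SyntaxCharacter` set:

```javascript
// Hypothetical stand-in for an escape that covers only SyntaxCharacter:
// ^ $ \ . * + ? ( ) [ ] { } |
function escapeSyntaxChars(text) {
  return String(text).replace(/[\\^$.*+?()[\]{}|]/g, '\\$&');
}

// Hypothetical tag: escape every substitution, then concatenate the pieces.
function naiveTag(template, ...subs) {
  let source = template.raw[0];
  subs.forEach((sub, i) => {
    source += escapeSyntaxChars(sub) + template.raw[i + 1];
  });
  return new RegExp(source);
}

const data = ':x';
const rebad = naiveTag`(?${data})`;

// ':' is not a SyntaxCharacter, so it passes through unescaped and the
// pattern becomes (?:x), a non-capturing group that matches 'x'.
console.log(rebad.source);     // (?:x)
console.log(rebad.test('x'));  // true
```

Each substitution is escaped correctly in isolation; the problem only appears once it is placed next to the literal `(?`.
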
This is a good point; I considered whether `-` should be included for a
similar reason. I think it is reasonable to escape only syntax characters
and expect users to handle multi-character parts of patterns themselves
(by wrapping the string with `()` in the constructor). This is what
practically every other language does.
That said - I'm very open to allowing implementations to escape _more_ than
`SyntaxCharacter`, and even to recommending that they do so in a way that
is consistent with their own regular expression extensions. What do you
think about doing that?
I'm also open to `.tag` wrapping with `()` to avoid these issues but I'm
not sure if we have a way in JavaScript to not make a capturing group out
of it.
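
On the capturing-group question: JavaScript's `(?:...)` groups without capturing, and the `re` tag quoted in this thread already uses it for RegExp substitutions. A small sketch, using a hypothetical `wrap` helper, of what that wrapping does and does not do:

```javascript
// Hypothetical helper: wrap a pattern source in a non-capturing group.
const wrap = (source) => `(?:${source})`;

// The wrapper adds no capture: only the explicit (a) and (c) are numbered.
const m = new RegExp(`(a)${wrap('b|x')}(c)`).exec('abc');
console.log(m.length); // 3: the full match plus two captures

// Applied to the `(?${data})` case with data ':x', the source becomes
// (?(?::x)), which is rejected instead of silently matching 'x'.
let rejected = false;
try {
  new RegExp(`(?${wrap(':x')})`);
} catch (err) {
  rejected = err instanceof SyntaxError;
}
console.log(rejected); // true
```

Wrapping turns this particular case into an error, but it is not a general fix: a substitution placed inside a character class, for example, would change meaning if wrapped.
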
> * Now that we have modules, I would like to see us stop having each
> proposal for new functionality come at the price of further global
> namespace pollution. I would like to see us transition towards having most
> new std library entry points be provided by std modules. I understand why
> we haven't yet, but something needs to go first.
>
>
I think that doing this should be an eventual target, but I don't think
adding a single much-asked-for static function to RegExp would be a good
place to start. I think the committee first needs to agree on how this form
of modularisation should be done. There are much bigger targets to tackle
first, and I would not like to see this proposal tied to and held back by
that (useful) goal.
> * ES6 made RegExp subclassable with most methods delegating to a common
> @exec method, so that a subclass only needs to consistently override a
> small number of things to stay consistent. Neither RegExpSubclass.escape
> nor RegExpSubclass.tag can be derived from aRegExpSubclass[@exec]. Because
> of the first bullet, RegExpSubclass.tag also cannot be derived from
> RegExpSubclass.escape. But having RegExpSubclass.escape delegating to
> RegExpSubclass.tag seems weird.
>
>
Right, but it makes sense that `escape` does not play in this game, since
it is a static method that takes a string argument - I'm not sure how it
could use @exec.
> * The instanceof below prevents this polyfill from working cross-frame.
> Also, when doing RegExpSubclass1.tag`xx${aRegExpSubclass2}yy`, where
> RegExpSubclass2.source produces a regexp grammar string that
> RegExpSubclass1 does not understand, I have no idea what the composition
> should do other than reject with an error. But what if the strings happen
> to be mutually valid but with conflicting meaning between these subclasses?
This is hacky, but in my code I just did `argument.exec ? treatAsRegExp : treatAsString`.
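
A minimal sketch of that duck-typing check, with `toPatternSource` as a hypothetical name and an escape that covers only `SyntaxCharacter`. Unlike `instanceof RegExp`, it does not depend on which frame's `RegExp` constructor created the value:

```javascript
// Hypothetical helper: decide how a substitution contributes to the pattern.
function toPatternSource(arg) {
  // Anything with a callable `exec` is treated as a RegExp.
  const looksLikeRegExp = arg != null && typeof arg.exec === 'function';
  if (looksLikeRegExp) {
    return `(?:${arg.source})`; // compose without adding a capture
  }
  // Everything else is treated as data and escaped.
  return String(arg).replace(/[\\^$.*+?()[\]{}|]/g, '\\$&');
}

console.log(toPatternSource(/a|b/)); // (?:a|b)
console.log(toPatternSource('a|b')); // a\|b
```

The check is as hacky as described: any object with an `exec` method passes, and it does nothing about subclasses whose `source` uses a grammar the composing RegExp does not understand.
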

>> *From:* es-discuss [mailto:es-discuss-bounces at mozilla.org] *On Behalf Of
>> *Mark S. Miller
>> *Sent:* Saturday, June 13, 2015 02:39
>> *To:* C. Scott Ananian
>> *Cc:* Benjamin Gruenbaum; es-discuss
>> *Subject:* Re: RegExp.escape()
>>
>> The point of this last variant is that data gets escaped but RegExp
>> objects do not -- allowing you to compose RegExps:
>> re`${re1}|${re2}*|${data}`
>> But this requires one more adjustment:
>>
>> > function re(first, ...args) {
>> >   let flags = first;
>> >   function tag(template, ...subs) {
>> >     const parts = [];
>> >     const numSubs = subs.length;
>> >     for (let i = 0; i < numSubs; i++) {
>> >       parts.push(template.raw[i]);
>> >       const subst = subs[i] instanceof RegExp ?
>> >           `(?:${subs[i].source})` :
>> >           subs[i].replace(/[\/\\^$*+?.()|[\]{}]/g, '\\$&');
>> >       parts.push(subst);
>> >     }
>> >     parts.push(template.raw[numSubs]);
>> >     return RegExp(parts.join(''), flags);
>> >   }
>> >   if (typeof first === 'string') {
>> >     return tag;
>> >   } else {
>> >     flags = void 0;  // Should this be '' ?
>> >     return tag(first, ...args);
>> >   }
>> > }
>
> --
> Cheers,
> --MarkM