ES Discuss - Message History

Cyril Auburtin (2018-05-19T08:08:39.000Z)

You can also write a helper like this:

```js
// Each entry is [search, replacement, optionalRegexSource]; the third element
// is only needed when the search string contains regex syntax characters.
var replacer = replacements => {
  const re = new RegExp(replacements.map(([k,_,escaped=k]) => escaped).join('|'), 'gu');
  const replaceMap = new Map(replacements);
  return s => s.replace(re, w => replaceMap.get(w));
}
var replace = replacer([
  ['$', '^', String.raw`\$`],
  ['1', '2'],
  ['<', '&lt;'],
  ['🍌', '🍑'],
  ['-', '_'],
  [']', '@', String.raw`\]`]
]);
replace('test🍐🍌-$$[11] <foo>') // "test🍐🍑_^^[22@ &lt;foo>"
```
but the escaping quickly gets messy.
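
One way around the manual escaping, as a rough sketch only: escape every key automatically before building the alternation, so each entry is just a `[search, replacement]` pair. The names `escapeRegExp` and `makeReplacer` are made up for this example, and it assumes all keys are non-empty strings. Keys are sorted longest first so that a longer key is tried before one of its prefixes.

```javascript
// Hypothetical helpers, not part of any proposal in this thread.
// Escape regex syntax characters so a key matches literally.
const escapeRegExp = s => s.replace(/[\\^$*+?.()|[\]{}]/g, '\\$&');

const makeReplacer = pairs => {
  const table = new Map(pairs);
  // Longest keys first, so 'ab' is tried before 'a' at the same position.
  const keys = [...table.keys()].sort((a, b) => b.length - a.length);
  if (keys.length === 0) return s => s; // nothing to replace
  const re = new RegExp(keys.map(escapeRegExp).join('|'), 'gu');
  // A function replacement avoids special handling of '$' in the values.
  return s => s.replace(re, m => table.get(m));
};

const replace = makeReplacer([
  ['$', '^'],
  ['1', '2'],
  ['<', '&lt;'],
  ['🍌', '🍑'],
  ['-', '_'],
  [']', '@'],
]);
console.log(replace('test🍐🍌-$$[11] <foo>')); // "test🍐🍑_^^[22@ &lt;foo>"
```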

On Sat, May 19, 2018 at 08:17, Isiah Meadows <isiahmeadows at gmail.com> wrote:

> Here's what I'd prefer instead: overload `String.prototype.replace` to
> take non-callable objects, as sugar for this:
>
> ```js
> const old = Function.call.bind(Function.call, String.prototype.replace)
> String.prototype.replace = function (regexp, object) {
>     if (object == null && regexp != null && typeof regexp === "object") {
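>         // Escape each key so it matches literally, then try all keys in a
>         // single global pass. The table is the first argument (`regexp`).
>         const re = new RegExp(
>             Object.keys(regexp)
>             .map(key => old(key, /[\\^$*+?.()|[\]{}]/g, '\\$&'))
>             .join("|"),
>             "g"
>         )
>         return old(this, re, m => regexp[m])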
>     } else {
>         return old(this, regexp, object)
>     }
> }
> ```
>
> This would cover about 99% of my use for something like this, with
> less runtime overhead (that of not needing to check for and
> potentially match multiple regular expressions at runtime) and better
> static analyzability (you only need to check it's an object literal or
> constant frozen object, not that its argument is the result of the
> built-in `Map` call). It's exceptionally difficult to optimize for
> this unless you know everything's a string, but most cases where I had
> to pass a callback that wasn't super complex looked a lot like this:
>
> ```js
> // What I use:
> function escapeHTML(str) {
>     return str.replace(/["'&<>]/g, m => {
>         switch (m) {
>         case '"': return "&quot;"
>         case "'": return "&#39;"
>         case "&": return "&amp;"
>         case "<": return "&lt;"
>         case ">": return "&gt;"
>         default: throw new TypeError("unreachable")
>         }
>     })
> }
>
> // What it could be
> function escapeHTML(str) {
>     return str.replace({
>         '"': "&quot;",
>         "'": "&#39;",
>         "&": "&amp;",
>         "<": "&lt;",
>         ">": "&gt;",
>     })
> }
> ```
>
> And yes, this enables optimizations engines couldn't easily produce
> otherwise. In this instance, an engine could find that the object is
> static with only single-character entries, and it could replace the
> call to a fast-path one that relies on a cheap lookup table instead
> (Unicode replacement would be similar, except you'd need an extra
> layer of indirection with astrals to avoid blowing up memory when
> generating these tables):
>
> ```js
> // Original
> function escapeHTML(str) {
>     return str.replace({
>         '"': "&quot;",
>         "'": "&#39;",
>         "&": "&amp;",
>         "<": "&lt;",
>         ">": "&gt;",
>     })
> }
>
> // Not real JS, but think of it as how an engine might implement this. The
> // implementation of the runtime function `ReplaceWithLookupTable` is omitted
> // for brevity, but you could imagine how it could be implemented, given the
> // pseudo-TS signature:
> //
> // ```ts
> // declare function %ReplaceWithLookupTable(
> //     str: string,
> //     table: string[]
> // ): string
> // ```
> function escapeHTML(str) {
>     static {
>         // A zero-initialized array with 2^16 entries (U+0000-U+FFFF), except
>         // for the object's members. This takes up to about 70K per instance,
>         // but these are *far* more often called than created.
>         const _lookup_escapeHTML = %calloc(65536)
>
>         _lookup_escapeHTML[34] = "&quot;"
>         _lookup_escapeHTML[38] = "&amp;"
>         _lookup_escapeHTML[39] = "&#39;"
>         _lookup_escapeHTML[60] = "&lt;"
>         _lookup_escapeHTML[62] = "&gt;"
>     }
>
>     return %ReplaceWithLookupTable(str, _lookup_escapeHTML)
> }
> ```
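
For illustration, the lookup-table idea can be approximated in plain JS. This is only a userland sketch, not how any engine actually does it: `buildLookup` and `replaceWithLookupTable` are hypothetical names, and the table is restricted to keys that are a single UTF-16 code unit.

```javascript
// Hypothetical userland stand-in for the %ReplaceWithLookupTable idea.
function buildLookup(table) {
  const lookup = new Array(0x10000); // one slot per UTF-16 code unit
  for (const [key, value] of Object.entries(table)) {
    if (key.length !== 1) throw new RangeError('single code unit keys only');
    lookup[key.charCodeAt(0)] = value;
  }
  return lookup;
}

function replaceWithLookupTable(str, lookup) {
  let out = '';
  let start = 0; // start of the pending run of unchanged characters
  for (let i = 0; i < str.length; i++) {
    const replacement = lookup[str.charCodeAt(i)];
    if (replacement !== undefined) {
      // Flush the unchanged run, then append the replacement.
      out += str.slice(start, i) + replacement;
      start = i + 1;
    }
  }
  return out + str.slice(start);
}

const htmlLookup = buildLookup({
  '"': '&quot;',
  "'": '&#39;',
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
});
const escapeHTML = str => replaceWithLookupTable(str, htmlLookup);

console.log(escapeHTML(`<a href="x">Tom & Jerry's</a>`));
// "&lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;"
```

The table is built once and reused across calls, which matches the point that these functions are called far more often than they are created.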
>
> Likewise, similar, but more restrained, optimizations could be
> performed on objects with multibyte strings, since they can be reduced
> to a simple search trie. (These can be built in even the general case
> if the strings are large enough to merit it - small ropes are pretty
> cheap to create.)
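
A minimal sketch of the trie approach for multi-character keys, under the assumption that the longest key wins at each position and that matching is done per UTF-16 code unit. `buildTrie` and `replaceWithTrie` are hypothetical names used only here.

```javascript
// Hypothetical helpers: build a trie from a { search: replacement } table.
function buildTrie(table) {
  const root = { next: new Map(), value: undefined };
  for (const [key, value] of Object.entries(table)) {
    if (key === '') continue; // an empty key can never be matched usefully
    let node = root;
    for (let i = 0; i < key.length; i++) {
      const unit = key[i];
      if (!node.next.has(unit)) {
        node.next.set(unit, { next: new Map(), value: undefined });
      }
      node = node.next.get(unit);
    }
    node.value = value;
  }
  return root;
}

function replaceWithTrie(str, root) {
  let out = '';
  let i = 0;
  while (i < str.length) {
    let node = root;
    let matchEnd = -1;
    let matchValue;
    // Walk the trie from position i, remembering the longest key seen.
    for (let j = i; j < str.length; j++) {
      node = node.next.get(str[j]);
      if (node === undefined) break;
      if (node.value !== undefined) {
        matchEnd = j + 1;
        matchValue = node.value;
      }
    }
    if (matchEnd === -1) {
      out += str[i]; // no key starts here, copy one code unit
      i++;
    } else {
      out += matchValue;
      i = matchEnd;
    }
  }
  return out;
}

const trie = buildTrie({ cat: 'dog', catalog: 'index', '&&': 'and' });
console.log(replaceWithTrie('catalog && cat', trie)); // "index and dog"
```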
>
> For what it's worth, there's precedent here in Ruby, which has support
> for `Hash`es as `String#gsub` parameters which work similarly.
>
> -----
>
> Isiah Meadows
> me at isiahmeadows.com
> www.isiahmeadows.com
>
>
> On Fri, May 18, 2018 at 1:01 PM, Logan Smyth <loganfsmyth at gmail.com>
> wrote:
> >> It wouldn't necessarily break existing API, since String.prototype.replace
> >> currently accepts only RegExp or strings.
> >
> > Not quite accurate. It accepts anything with a `Symbol.replace` property, or
> > a string.
> >
> > Given that, what you're describing can be implemented as
> > ```
> > Map.prototype[Symbol.replace] = function(str) {
> >   for(const [key, value] of this) {
> >     str = str.replace(key, value);
> >   }
> >   return str;
> > };
> > ```
> >
> >> I don't know if the ECMAScript spec mandates preserving a particular order
> >> to a Map's elements.
> >
> > It does, so you're good there.
> >
> >> Detecting collisions between matching regular expressions or strings.
> >
> > I think this would be my primary concern, but not so much ordering as
> > expectations. Like if you did
> > ```
> > "1".replace(new Map([
> >   ['1', '2'],
> >   ['2', '3'],
> > ]));
> > ```
> > is the result `2` or `3`? `3` seems surprising to me, at least in the
> > general sense, because there was no `2` in the original input, but it's also
> > hard to see how you'd spec the behavior to avoid that if general regex
> > replacement is supported.
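
To make the `2` versus `3` question concrete, here is a small sketch comparing the two semantics. It assumes string keys only (no regular expressions), and `sequential` and `singlePass` are hypothetical names, not proposed API.

```javascript
const rules = new Map([
  ['1', '2'],
  ['2', '3'],
]);

// Sequential: each rule runs over the output of the previous rule,
// which is what chained replace calls do.
const sequential = (str, map) => {
  for (const [key, value] of map) {
    str = str.split(key).join(value); // replace every occurrence of key
  }
  return str;
};

// Single pass: every key is matched against the original input only.
const escapeRegExp = s => s.replace(/[\\^$*+?.()|[\]{}]/g, '\\$&');
const singlePass = (str, map) => {
  const re = new RegExp([...map.keys()].map(escapeRegExp).join('|'), 'g');
  return str.replace(re, m => map.get(m));
};

console.log(sequential('1', rules)); // "3"
console.log(singlePass('1', rules)); // "2"
```

The single-pass form avoids the surprising `3`, but it only works this simply because every key is a plain string that can be folded into one alternation.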
> >
> > On Fri, May 18, 2018 at 9:47 AM, Alex Vincent <ajvincent at gmail.com> wrote:
> >>
> >> Reading [1] in the digests, I think there might actually be an API
> >> improvement that is doable.
> >>
> >> Suppose the String.prototype.replace API allowed passing in a single
> >> argument, a Map instance where the keys were strings or regular expressions
> >> and the values were replacement strings or functions.
> >>
> >> Advantages:
> >> * Shorthand - instead of writing str.replace(a, b).replace(c,
> >> d).replace(e, f)... you get str.replace(regExpMap)
> >> * Reusable - the same regular expression/string map could be used for
> >> several strings (assuming of course the user didn't just abstract the call
> >> into a separate function)
> >> * Modifiable on demand - developers could easily add new regular
> >> expression matches to the map object, or remove them
> >> * It wouldn't necessarily break existing API, since
> >> String.prototype.replace currently accepts only RegExp or strings.
> >>
> >> Disadvantages / reasons not to do it:
> >> * Detecting collisions between matching regular expressions or strings.
> >> If two regular expressions match the same string, or a regular expression
> >> and a search string match, the expected results may vary because a Map's
> >> elements might not be consistently ordered.  I don't know if the ECMAScript
> >> spec mandates preserving a particular order to a Map's elements.
> >>   - if we preserve the same chaining capability
> >> (str.replace(map1).replace(map2)...), this might not be a big problem.
> >>
> >> The question is, how often do people chain replace calls together?
> >>
> >> * It's not particularly hard to chain several replace calls together.
> >> It's just verbose, which might not be a high enough burden to overcome for
> >> adding API.
> >>
> >> That's my two cents for the day.  Thoughts?
> >>
> >> [1] https://esdiscuss.org/topic/adding-map-directly-to-string-prototype
> >>
> >> --
> >> "The first step in confirming there is a bug in someone else's work is
> >> confirming there are no bugs in your own."
> >> -- Alexander J. Vincent, June 30, 2001
> >>
> >> _______________________________________________
> >> es-discuss mailing list
> >> es-discuss at mozilla.org
> >> https://mail.mozilla.org/listinfo/es-discuss
> >>

cyril.auburtin at gmail.com (2018-05-19T08:29:40.814Z)

You can also write a helper like this:

```js
var replacer = replacements => {
  const re = new RegExp(replacements.map(([k,_,escaped=k]) => escaped).join('|'), 'gu');
  const replaceMap = new Map(replacements);
  return s => s.replace(re, w => replaceMap.get(w));
}
var replace = replacer([
  ['$', '^', String.raw`\$`],
  ['1', '2'],
  ['<', '&lt;'], 
  ['🍌', '🍑'],
  ['-', '_'],
  [']', '@', String.raw`\]`]
]);
replace('test🍐🍌-$$[11] <foo>') // "test🍐🍑_^^[22@ &lt;foo>"
```
but the escaping quickly gets messy.

On Sat, May 19, 2018 at 08:17, Isiah Meadows <isiahmeadows at gmail.com> wrote:

cyril.auburtin at gmail.com (2018-05-19T08:28:59.938Z)

You can also write a helper like this:

```js
var replacer = replacements => {
  const re = new RegExp(replacements.map(([k,_,escaped=k]) => escaped).join('|'), 'gu');
  const replaceMap = new Map(replacements);
  return s => s.replace(re, w => replaceMap.get(w));
}
var replace = replacer([
  ['$', '^', String.raw`\$`],
  ['1', '2'],
  ['<', '<'], 
  ['🍌', '🍑'],
  ['-', '_'],
  [']', '@', String.raw`\]`]
]);
replace('test🍐🍌-$$[11] <foo>') // "test🍐🍑_^^[22@ <foo>"
```
but the escaping quickly gets messy.

On Sat, May 19, 2018 at 08:17, Isiah Meadows <isiahmeadows at gmail.com> wrote:
