Anne van Kesteren (2013-09-06T11:16:33.000Z)
On Thu, Sep 5, 2013 at 8:07 PM, Norbert Lindenberg <ecmascript at lindenbergsoftware.com> wrote:
> ... they're not meant to interpret the code points beyond that, and some processing
> (such as test cases) may depend on them being preserved.

Since when are test cases a use case? And why can't a test case use the more difficult route?

I think ideally a string in a language can only represent Unicode scalar values. I.e. it perfectly maps back and forth to utf-8 (or indeed utf-16, although people shouldn't use that). In ECMAScript a string is 16-bit code units with some sort of utf-16 layer on top in various scenarios. Now we're adding another layer on top of strings, but instead of exposing ideal strings (Unicode scalar values) we go with some kludge to serve edge cases (whose scenarios have not been fully explained thus far) that are better served using the "low-level" 16-bit code unit API.

In https://mail.mozilla.org/pipermail/es-discuss/2012-December/027109.html you suggest this is a policy matter, but I do not think it is at all. Unicode scalar values are the code points of Unicode that can be represented in any environment; this is not true for Unicode code points (lone surrogates cannot survive a round trip through utf-8). This is not about policy at all, but rather about what a string ought to be.

> Adding code point indexing to 16-bit code unit strings would add significant performance overhead.

Agreed. I don't think we need the *At method for now. Use the iterator.