Bjoern Hoehrmann (2013-10-19T18:27:32.000Z)
domenic at domenicdenicola.com (2013-10-26T03:05:31.449Z)
Mathias Bynens wrote:

> Are you saying that changing the name to something that is longer than `at` would solve this problem?

If it was `.getOneOrTwoCodepointLongSubstringAtUcs2CodeUnitIndex(...)` I am sure people would be reluctant to use it, because it's unreasonably long compared to `String.fromCodePoint(str.codePointAt(p))` and harder to understand than the combination of those two primitives.

> People are using `String.prototype.charAt()` incorrectly too, expecting
> it to return whole symbols instead of surrogate halves wherever possible.
> How would _not_ introducing a method that avoids this problem help?

Right now people do not have much of a choice other than writing code that does not do the right thing when faced with malformed strings or non-BMP characters; it's unreasonable to call a method like `substr` and then manually smooth it up around the edges and perhaps scan the interior for lone surrogates to ensure that at least your code doesn't do the wrong thing. That gives you "well-known bad" code, which is a good thing to have, better than more complicated code that might have unknown bugs. Allen's loop `for (let p=0; p<str.length; p+=c.length)`, for instance, is just waiting for someone to improve or replace it with code that increments by `1` instead of `.length` because that's simpler.

The methods `fromCodePoint` and `codePointAt` can be used to get ugly constants out of code that tries to do the right thing, and they will offer some insight into how developers might go from UCS-only code to something more proper, but for the moment duplicating all the UCS-based methods strikes me as premature, especially when giving them seductive names. How would a somewhat-surrogate-aware `substring` method work and what would it be called, for instance?
If it is omitted, we would be back to square one: someone in need of substring functionality has to jump through overly complicated hoops to make it work "correctly" and ends up mixing surrogate-pair-aware with surrogate-pair-unaware code.
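To make the distinction concrete, here is a rough sketch of the pieces discussed above: `charAt` returning a surrogate half, the `fromCodePoint`/`codePointAt` combination returning the whole symbol, a loop in the style of Allen's that advances by `c.length`, and the kind of hand-rolled substring one might end up writing. The helper names `symbols` and `codePointSubstring` are made up for illustration and are not proposed API; the sample string is an assumption as well.

```javascript
// Sketch only: `symbols` and `codePointSubstring` are hypothetical helpers.
const str = "a\u{1D306}b"; // 'a', U+1D306 (surrogate pair D834 DF06), 'b'

// charAt works on UCS-2 code units, so index 1 yields a lone surrogate half.
const half = str.charAt(1);

// Combining the two primitives yields the whole symbol at that index.
const whole = String.fromCodePoint(str.codePointAt(1));

// Loop in the style quoted above: advance by the length of what was read.
// Incrementing by 1 instead would revisit the trailing surrogate of each pair.
function symbols(s) {
  const out = [];
  let c;
  for (let p = 0; p < s.length; p += c.length) {
    c = String.fromCodePoint(s.codePointAt(p));
    out.push(c);
  }
  return out;
}

// A hand-rolled substring indexed by code points rather than code units.
// Lone surrogates in the input are passed through unchanged as single items.
function codePointSubstring(s, start, end) {
  return symbols(s).slice(start, end).join("");
}
```

Note that `codePointSubstring(str, 1, 2)` and `str.substring(1, 2)` take indices that mean different things, which is exactly the mixing of surrogate-pair-aware and surrogate-pair-unaware code described above.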