Claude Pache (2014-05-05T12:12:04.000Z)
On 5 May 2014, at 12:03, Mathias Bynens <mathias at qiwi.be> wrote:

> On 5 May 2014, at 10:48, Claude Pache <claude.pache at gmail.com> wrote:
> 
>> In my view, if `atob` and `btoa` were to enter ES, they should go in Annex B (the deprecated legacy features of web browsers), where they would be in good company with the other utility that implicitly conflates binary strings with ISO-8859-1-encoded strings, namely `escape`/`unescape`.
> 
> How do `atob` and `btoa` do any sort of implicit conversion between binary and any other encoding? Their behavior is well-defined, and they’re explicitly limited to extended ASCII.

Here, your "extended ASCII" means more precisely "ISO-8859-1".

Base64 is defined as a binary-to-text encoding [1]. In the definition of `btoa` and `atob` [2], "binary strings" (which do not exist natively in JS) are replaced by ES strings whose code units all lie between 0x0000 and 0x00FF. That is equivalent to interpreting the binary string as an ISO-8859-1-encoded string, because the code points U+0000 to U+00FF correspond exactly to the ISO-8859-1 code points.

As you know, UTF-8 is nowadays more fashionable on the web than ISO-8859-1. Moreover, there are probably applications that want raw binary data rather than a string interpreted through some character encoding. In both cases, `atob` and `btoa` are unsatisfactory.
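
To make the two points above concrete, here is a minimal sketch. It assumes a runtime that exposes `btoa`, `atob` and `TextEncoder` as globals (current browsers, Node.js 16 or later). The helper `utf8ToBase64` is a hypothetical name used only for illustration; it is not part of any proposal.

```javascript
// Assumption: btoa, atob and TextEncoder are available as globals.

// btoa maps each code unit in 0x00..0xFF to one byte, which amounts to
// treating the input as ISO-8859-1. U+00E9 becomes the single byte 0xE9.
const latin1Result = btoa("\u00E9");

// Code units above 0x00FF cannot be mapped to a byte, so btoa throws.
let rejectedNonLatin1 = false;
try {
  btoa("\u20AC"); // U+20AC EURO SIGN
} catch (e) {
  rejectedNonLatin1 = true;
}

// Hypothetical helper: encode the string as UTF-8 first, then Base64 the
// resulting bytes by routing them through a "binary string".
function utf8ToBase64(str) {
  const bytes = new TextEncoder().encode(str);
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }
  return btoa(binary);
}

const utf8Result = utf8ToBase64("\u00E9"); // bytes 0xC3 0xA9
```

The same character yields different Base64 output depending on whether it is read as ISO-8859-1 (one byte) or as UTF-8 (two bytes), and callers who want UTF-8 or raw bytes have to build the intermediate string themselves.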

> 
> I don’t think this is Annex B material regardless — this is not a legacy feature.
> 
>> We should be able to define a better designed function (and with a less silly name, while we're at it).
> 
> That would kind of defeat the purpose IMHO. We’re stuck with `atob`/`btoa` anyway in browsers — adding yet another name for the same thing does not really help.

I meant defining something better, not the same thing (somewhat as `encodeURIComponent` is not the same thing as `escape`).

—Claude

[1] https://en.wikipedia.org/wiki/Base64
[2] http://whatwg.org/html/webappapis.html#atob
domenic at domenicdenicola.com (2014-05-08T18:20:54.206Z)
On 5 May 2014, at 12:03, Mathias Bynens <mathias at qiwi.be> wrote:

> How do `atob` and `btoa` do any sort of implicit conversion between binary and any other encoding? Their behavior is well-defined, and they’re explicitly limited to extended ASCII.

Here, your "extended ASCII" means more precisely "ISO-8859-1".

Base64 is defined as a binary-to-text encoding [1]. In the definition of `btoa` and `atob` [2], "binary strings" (which do not exist natively in JS) are replaced by ES strings whose code units all lie between 0x0000 and 0x00FF. That is equivalent to interpreting the binary string as an ISO-8859-1-encoded string, because the code points U+0000 to U+00FF correspond exactly to the ISO-8859-1 code points.

As you know, UTF-8 is nowadays more fashionable on the web than ISO-8859-1. Moreover, there are probably applications that want raw binary data rather than a string interpreted through some character encoding. In both cases, `atob` and `btoa` are unsatisfactory.

> That would kind of defeat the purpose IMHO. We’re stuck with `atob`/`btoa` anyway in browsers — adding yet another name for the same thing does not really help.

I meant defining something better, not the same thing (somewhat as `encodeURIComponent` is not the same thing as `escape`).

[1]: https://en.wikipedia.org/wiki/Base64
[2]: http://whatwg.org/html/webappapis.html#atob