proposal: DataView.prototype.toString(start = 0, end = this.length, enc= "utf8")

# Ali Rahbari (7 years ago)

When working with binary data, reading a string from a buffer is a necessity, for example when deserializing BSON data in the browser.

Most of the Node.js Buffer methods have equivalents in DataView, except toString. The current workaround is to read the buffer byte by byte, convert each byte to the corresponding character with String.fromCharCode(), and then join the results.

This can be done much faster in native code.
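
The workaround described above can be sketched roughly as follows. The helper name `readStringByteByByte` and its parameters are made up for illustration, and the sketch assumes single-byte data (ASCII or Latin-1); it would mangle multi-byte UTF-8 sequences.

```javascript
// Workaround sketch: decode bytes one at a time via String.fromCharCode().
// Hypothetical helper name; only correct for single-byte data (ASCII or
// Latin-1), since each byte becomes exactly one character.
function readStringByteByByte(view, start = 0, end = view.byteLength) {
  const chars = [];
  for (let i = start; i < end; i++) {
    chars.push(String.fromCharCode(view.getUint8(i)));
  }
  return chars.join("");
}

const bytes = Uint8Array.from([0x68, 0x65, 0x6c, 0x6c, 0x6f]); // "hello"
const view = new DataView(bytes.buffer);
console.log(readStringByteByByte(view)); // "hello"
console.log(readStringByteByByte(view, 1, 4)); // "ell"
```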

DataView.prototype.toString(start = 0, end = this.length, encoding = "utf8")

DataView.prototype.writeString(string, offset = 0, length = string.length, encoding = "utf8")
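
For the write side, something close to this can probably be approximated today with `TextEncoder.prototype.encodeInto()`. The sketch below is an assumption about what the proposed `writeString` might do, not its definition: the helper name `writeUtf8` is hypothetical, only UTF-8 is handled, and the `length` parameter is left out because the proposal does not say whether it counts characters or bytes.

```javascript
// Hypothetical helper approximating the proposed writeString(); it is not a
// real DataView method. UTF-8 only, because TextEncoder always emits UTF-8.
function writeUtf8(view, string, offset = 0) {
  // Uint8Array window over the DataView's own byte range, starting at offset.
  const target = new Uint8Array(
    view.buffer,
    view.byteOffset + offset,
    view.byteLength - offset
  );
  // encodeInto() stops when the target is full and reports how many UTF-16
  // code units were read and how many bytes were written.
  const { read, written } = new TextEncoder().encodeInto(string, target);
  return { read, written };
}

const out = new DataView(new ArrayBuffer(8));
const result = writeUtf8(out, "h\u00e9llo", 2);
console.log(result); // { read: 5, written: 6 }
console.log(out.getUint8(2).toString(16)); // "68"
```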


# Isiah Meadows (7 years ago)

You might be interested in some of the things I have here:

https://github.com/isiahmeadows/array-additions-proposal/blob/master/README.md

I'm not quite at the point of pushing TC39 to consider it, but after I get around to sorting it better, I'd love to see all of them implemented eventually.

Note that for some common cases, such as converting to ASCII or UTF-8 when no character in the string is above U+007F, or to UCS-2/UTF-16 when at least one character is above that, it's a trivial `memcpy` for multi-character strings.


# Ken Russell (7 years ago)

If DataView's methods were faster, would this eliminate the need for this addition?

In the V8 engine, some recent work significantly improved DataView's performance. See crbug.com/225811 for details.


# Isiah Meadows (7 years ago)

DataView has nothing to do with this performance-wise - implementations could just use the raw array buffer instead, and it'd be quicker and easier.

The issue is string representation - engines can optimize for it in ways you can't at the language level.


# Ali Rahbari (7 years ago)

Logan mentioned TextDecoder, which does exactly this, but the ArrayBuffer must first be sliced and then passed to TextDecoder. Still, it would be nice to have this inside DataView, as Isiah's proposal suggests.
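
For reference, a minimal sketch of the TextDecoder approach. The helper name `decodeRange` and its parameters are hypothetical. It relies on `TextDecoder.prototype.decode()` accepting a typed-array view as well as an ArrayBuffer, so a `Uint8Array` window over the same buffer can stand in for a copied slice.

```javascript
// Hypothetical helper, not part of any standard API: decode a byte range of
// a DataView with TextDecoder, using a Uint8Array window over the same
// underlying buffer instead of copying it with ArrayBuffer.prototype.slice().
function decodeRange(view, start = 0, end = view.byteLength, encoding = "utf-8") {
  const range = new Uint8Array(view.buffer, view.byteOffset + start, end - start);
  return new TextDecoder(encoding).decode(range);
}

const source = new TextEncoder().encode("id: caf\u00e9");
const dv = new DataView(source.buffer, source.byteOffset, source.byteLength);
console.log(decodeRange(dv, 4)); // "café"
console.log(decodeRange(dv, 0, 2)); // "id"
```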


# Isiah Meadows (7 years ago)

I proposed array buffer interop, not DataView interop.
