Idiomatic representation of { buffer, bytesRead }

# Domenic Denicola (10 years ago)

While working on lower-level byte streams we're encountering a number of situations that need to return something along the lines of `{ buffer, bytesRead }`. (In this setting "buffer" = ArrayBuffer.) In the most general form the signature ends up being something like

    { sourceBuffer, offset, bytesDesired } -> { newBuffer, bytesRead }

where `sourceBuffer` gets detached, then (in some other thread, most likely) up to `bytesDesired` bytes get written in at position `offset` to the backing memory, then the backing memory gets [transferred][1] to `newBuffer`, and `bytesRead` tells you how many bytes were actually read into the buffer.
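As a rough illustration of that signature, here is a minimal synchronous sketch. The names `transferBuffer` and `readInto`, and the `available` argument standing in for the data source, are hypothetical and not part of any proposal in this thread; a real implementation would do the write asynchronously, possibly on another thread.

```javascript
// Hypothetical helper: moves the backing memory to a new ArrayBuffer.
function transferBuffer(buffer) {
  if (typeof buffer.transfer === "function") {
    return buffer.transfer(); // ArrayBuffer.prototype.transfer, where available
  }
  if (typeof structuredClone === "function") {
    return structuredClone(buffer, { transfer: [buffer] }); // detaches `buffer`
  }
  return buffer.slice(0); // fallback: a plain copy; the source is NOT detached
}

// Hypothetical synchronous stand-in for the read operation.
// `available` plays the role of whatever bytes the underlying source has ready.
function readInto({ sourceBuffer, offset, bytesDesired }, available) {
  const newBuffer = transferBuffer(sourceBuffer);
  const room = Math.max(0, Math.min(bytesDesired, newBuffer.byteLength - offset));
  const bytesRead = Math.min(room, available.length);
  new Uint8Array(newBuffer, offset, bytesRead).set(available.subarray(0, bytesRead));
  return { newBuffer, bytesRead };
}

const result = readInto(
  { sourceBuffer: new ArrayBuffer(16), offset: 4, bytesDesired: 8 },
  Uint8Array.of(1, 2, 3)
);
console.log(result.bytesRead); // 3
```

Only three bytes were available, so `bytesRead` is smaller than `bytesDesired`; that gap is why the result needs to carry both pieces of information.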

I was hoping to get opinions on the most idiomatic way to represent this type in JavaScript. So far I can think of a few options:

1. Just an object literal, probably with names `{ transferred, bytesRead }`
2. A Uint8Array view onto the new buffer, starting at `offset` and extending for `bytesRead` bytes. I.e., `return new Uint8Array(result.transferred, input.offset, result.bytesRead);`.
3. A DataView view onto the buffer, similar to 2.
4. An ArrayBuffer with an additional property added!? I.e. `result.transferred.bytesRead = result.bytesRead; return result.transferred;`

1 is unambiguous, but a bit awkward, and in general does not compose well if we try to make byte streams a special case of more general streams (which is a goal).

Both 2 and 3 are essentially attempting to smuggle the two pieces of information into one object. 2 takes the "byte" idea literally, whereas 3 uses DataView since it feels more "agnostic." In both cases you can access the underlying buffer using `view.buffer` so no generality is lost. I would be especially interested in people's opinions on 2 vs 3.
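To make the comparison concrete, the sketch below builds both kinds of view over the same region and shows that the pieces of option 1 can be recovered from either. The `unpack` helper and the sample values are hypothetical, chosen only for illustration.

```javascript
// Stand-in for the result of a read: 3 bytes landed at offset 4 of a 16-byte buffer.
const transferred = new ArrayBuffer(16);
const offset = 4;
const bytesRead = 3;
new Uint8Array(transferred).set([10, 20, 30], offset);

// Option 2: a Uint8Array covering exactly the bytes that were read.
const asBytes = new Uint8Array(transferred, offset, bytesRead);

// Option 3: a DataView over the same region.
const asDataView = new DataView(transferred, offset, bytesRead);

// Hypothetical helper: recover the pieces of option 1 from either view.
function unpack(view) {
  return {
    transferred: view.buffer,
    offset: view.byteOffset,
    bytesRead: view.byteLength,
  };
}

console.log(Array.from(asBytes));    // [ 10, 20, 30 ]
console.log(asDataView.getUint8(1)); // 20
console.log(unpack(asBytes).bytesRead, unpack(asDataView).bytesRead); // 3 3
```

The practical difference is in how the bytes are read back: the Uint8Array can be indexed and iterated directly, while the DataView requires explicit getters such as `getUint8`.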

4 I just thought up while composing this email and is probably not such a great idea. But, I think it does work.

I wrote up a specialized version of 2 in some detail under a number of different scenarios at: https://gist.github.com/domenic/65921459ef7a31ec2839. Of particular interest might be https://gist.github.com/domenic/65921459ef7a31ec2839#a-two-buffer-pool-for-a-file-stream

[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer/transfer

# Jason Orendorff (10 years ago)

On Mon, Mar 2, 2015 at 3:45 PM, Domenic Denicola <d at domenic.me> wrote:
> While working on lower-level byte streams we're encountering a number of situations that need to return something along the lines of `{ buffer, bytesRead }`. (In this setting "buffer" = ArrayBuffer.) In the most general form the signature ends up being something like
>
>     { sourceBuffer, offset, bytesDesired } -> { newBuffer, bytesRead }

I very much like 2 and 3 because they provide the result type that the
user wants anyway. Slightly prefer DataView.

But you can support both, like this:

    pull(DataView) -> Promise<DataView>
    pull(TypedArrayView) -> Promise<TypedArrayView of the same type>

A view argument conveniently provides just the three pieces of
information you need, plus a type.

The lower-level primitive could take an optional fourth argument:

    pull(sourceBuffer, offset, bytesDesired,
         resultConstructor=DataView) -> Promise<resultConstructor>

This could even be generic in resultConstructor, though it's a little
awkward because you have to divide by
resultConstructor.BYTES_PER_ELEMENT before invoking the constructor.

-j
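The division by `BYTES_PER_ELEMENT` mentioned above could look roughly like the following sketch. `makeResultView` is a hypothetical helper name, and dropping a trailing partial element is an assumption of this sketch, not something the message specifies.

```javascript
// Hypothetical helper: wrap `byteLength` bytes of `buffer`, starting at
// `byteOffset`, in a view of the requested type.
function makeResultView(buffer, byteOffset, byteLength, resultConstructor = DataView) {
  if (resultConstructor === DataView) {
    // DataView takes its length in bytes.
    return new DataView(buffer, byteOffset, byteLength);
  }
  // Typed arrays take their length in elements, hence the division.
  const bytesPerElement = resultConstructor.BYTES_PER_ELEMENT;
  // Assumption: a trailing partial element is dropped rather than reported.
  const length = Math.floor(byteLength / bytesPerElement);
  return new resultConstructor(buffer, byteOffset, length);
}

const buffer = new ArrayBuffer(32);
const words = makeResultView(buffer, 8, 12, Uint32Array);
const raw = makeResultView(buffer, 8, 12);

console.log(words.length, words.byteLength);          // 3 12
console.log(raw instanceof DataView, raw.byteLength); // true 12
```

Note that typed array constructors throw a RangeError when `byteOffset` is not a multiple of the element size, so a generic version would also have to decide what to do with misaligned offsets.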

# Domenic Denicola (10 years ago)

Thanks very much for the feedback Jason!


> But you can support both, like this:
> 
>     pull(DataView) -> Promise<DataView>
>     pull(TypedArrayView) -> Promise<TypedArrayView of the same type>
> 
> A view argument conveniently provides just the three pieces of information
> you need, plus a type.

I thought of that. However, I found it a bit strange that passing this function a view onto bytes [256, 512] of a 1024-byte buffer would detach the entire 1024-byte buffer. What do you think?
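The concern can be reproduced in a few lines. This is a sketch that assumes an environment providing either `ArrayBuffer.prototype.transfer` or `structuredClone` with a transfer list; `detach` is a hypothetical helper name.

```javascript
// Hypothetical helper: detach `buffer` and return a new ArrayBuffer that
// owns its memory.
function detach(buffer) {
  if (typeof buffer.transfer === "function") return buffer.transfer();
  if (typeof structuredClone === "function") {
    return structuredClone(buffer, { transfer: [buffer] });
  }
  throw new Error("no way to transfer an ArrayBuffer in this environment");
}

const whole = new ArrayBuffer(1024);
const middle = new Uint8Array(whole, 256, 256); // the view handed to the read
const header = new Uint8Array(whole, 0, 16);    // an unrelated view on the same buffer

const moved = detach(middle.buffer);

console.log(whole.byteLength);  // 0
console.log(middle.byteLength); // 0
console.log(header.byteLength); // 0
console.log(moved.byteLength);  // 1024
```

Although only the middle 256 bytes were offered, every view onto the original buffer, including the unrelated `header`, becomes unusable once the buffer is detached.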

# Jason Orendorff (10 years ago)

On Wed, Mar 4, 2015 at 3:06 PM, Domenic Denicola <d at domenic.me> wrote:
> I thought of that. However, I found it a bit strange that passing this function a view onto bytes [256, 512] of a 1024-byte buffer would detach the entire 1024-byte buffer. What do you think?

It's a good point. I figured most callers will have allocated the
buffer themselves, most will only have one view into it at a time, and
most will ask for the whole thing to be filled; and by "most" I really
mean somewhere over 99.9%. Skimming your gist seemed to sort of
confirm my hunch, but don't take my word for it -- that's all the
research I did. All I know is, I've known about Python's
[file.readinto()
method](https://docs.python.org/3.5/library/io.html#io.RawIOBase.readinto)
for at least 15 years and never yet had a need for it.

Would it make it seem less strange if you specified the argument as a
"dictionary" with these properties:

    {buffer:, byteOffset:, byteLength:, constructor:}

...and then casually mention that DataViews and TypedArrays both
happen to quack in just this way?

-j
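The "quacking" can be checked directly. In the sketch below, `describeRegion` is a hypothetical function that reads only the four named properties and never checks what kind of object it was given.

```javascript
// Hypothetical reader of the "dictionary" shape: it only looks at the four
// named properties.
function describeRegion({ buffer, byteOffset, byteLength, constructor }) {
  return {
    totalBytes: buffer.byteLength,
    byteOffset,
    byteLength,
    resultType: constructor.name,
  };
}

const buffer = new ArrayBuffer(1024);

// A DataView and a typed array both expose all four properties.
const fromDataView = describeRegion(new DataView(buffer, 256, 256));
const fromTypedArray = describeRegion(new Uint16Array(buffer, 256, 128));

// A plain object with the same four properties is accepted as well.
const fromLiteral = describeRegion({
  buffer,
  byteOffset: 0,
  byteLength: 64,
  constructor: Uint8Array,
});

console.log(fromDataView.resultType);   // DataView
console.log(fromTypedArray.byteLength); // 256
console.log(fromLiteral.resultType);    // Uint8Array
```

For the typed array, `byteLength` is 256 even though the view has 128 elements, which is why the dictionary is phrased in bytes rather than in elements.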

# Kevin Smith (10 years ago)

> > But you can support both, like this:
> >
> >     pull(DataView) -> Promise<DataView>
> >     pull(TypedArrayView) -> Promise<TypedArrayView of the same type>
> >
> > A view argument conveniently provides just the three pieces of information
> > you need, plus a type.
>
> I thought of that. However, I found it a bit strange that passing this
> function a view onto bytes [256, 512] of a 1024-byte buffer would detach
> the entire 1024-byte buffer. What do you think?

Not sure if it makes any difference or not, but it's interesting to me that
a signature like the above maps very well to async iterators:

    source.next(myDataView) -> Promise<DataView>

I implemented a zip/tar utility when validating the async iterator design
and basically used that signature (except with Node Buffers instead of
DataViews).  The basic strategy I used was to construct a pipeline with
async iterators:

    source -> data pump -> transform -> data pump -> consumer

where each "data pump" would have a pool of buffers, and would pull from
the upstream iterator, providing the next unused buffer in its pool as
argument to "next".  The downstream transform or consumer would then just
pull the next filled buffer (view) from the upstream pool.
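A much-reduced sketch of one pump stage follows. It is not the zip/tar utility described above: `makeSource`, `makePump`, and `drain` are hypothetical names, plain Uint8Arrays stand in for Node Buffers, and results use the `{ value, done }` shape of iterator results.

```javascript
// Hypothetical upstream source: each call to next(view) fills the caller's
// view with as many pending bytes as fit and resolves with the filled part.
function makeSource(bytes) {
  let position = 0;
  return {
    async next(view) {
      const count = Math.min(view.byteLength, bytes.length - position);
      if (count === 0) return { value: undefined, done: true };
      const target = new Uint8Array(view.buffer, view.byteOffset, count);
      target.set(bytes.subarray(position, position + count));
      position += count;
      return { value: target, done: false };
    },
  };
}

// Hypothetical data pump: owns a small pool of buffers and hands the next
// unused one to the upstream source on every pull.
function makePump(upstream, poolSize, bufferSize) {
  const pool = Array.from({ length: poolSize }, () => new Uint8Array(bufferSize));
  let turn = 0;
  return {
    next() {
      const view = pool[turn];
      turn = (turn + 1) % poolSize;
      return upstream.next(view);
    },
  };
}

// Consumer: pull filled views until the pump reports completion.
async function drain(pump) {
  const chunks = [];
  for (;;) {
    const { value, done } = await pump.next();
    if (done) return chunks;
    chunks.push(Array.from(value)); // copy out before the buffer is reused
  }
}

const pump = makePump(makeSource(Uint8Array.of(1, 2, 3, 4, 5)), 2, 2);
const drained = drain(pump);
drained.then((chunks) => console.log(JSON.stringify(chunks))); // [[1,2],[3,4],[5]]
```

Because the pool is reused round-robin, a downstream stage has to finish with (or copy) a filled view before the pump comes back around to that buffer; the sketch copies eagerly to stay simple.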

# Kevin Smith (10 years ago)

>     source.next(myDataView) -> Promise<DataView>

Derp.  Should have been:

    source.next(dataView) -> Promise<IteratorResult<DataView>>

which you kind of need anyway  ; )

# Domenic Denicola (10 years ago)

From: Jason Orendorff [mailto:jason.orendorff at gmail.com]


> Would it make it seem less strange if you specified the argument as a
> "dictionary" with these properties:
> 
>     {buffer:, byteOffset:, byteLength:, constructor:}
> 
> ...and then casually mention that DataViews and TypedArrays both happen
> to quack in just this way?

Yes, this in fact makes me very happy :)

# Bergi (10 years ago)

Kevin Smith wrote:
> Should have been:
>
>      source.next(dataView) -> Promise<IteratorResult<DataView>>
>
> which you kind of need anyway  ; )

Wouldn't `.next()` rather need to return an 
IteratorResult<Promise<DataView>>?

  Bergi

# Kevin Smith (10 years ago)

> Wouldn't `.next()` rather need to return an IteratorResult<Promise<DataView>>?


No, because the "done-ness" of the iteration is itself asynchronous.  You
don't know whether you're done iterating until the promise resolves.

https://github.com/zenparsing/async-iteration/
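A small sketch of why `done` has to sit inside the promise: the hypothetical source below (names `makeSource` and `readAll` are illustrative, and `setTimeout` stands in for real I/O) cannot tell whether it has reached the end until the simulated read completes.

```javascript
// Hypothetical source that only learns it has reached the end once a read
// completes, the way a file or socket read would.
function makeSource(chunks) {
  let index = 0;
  return {
    next(dataView) {
      // The caller gets a promise immediately; `done` is known only later.
      return new Promise((resolve) => {
        setTimeout(() => {
          if (index >= chunks.length) {
            resolve({ value: undefined, done: true });
            return;
          }
          const chunk = chunks[index++];
          new Uint8Array(dataView.buffer, dataView.byteOffset, chunk.length).set(chunk);
          resolve({
            value: new DataView(dataView.buffer, dataView.byteOffset, chunk.length),
            done: false,
          });
        }, 0);
      });
    },
  };
}

// Consumer: awaits each result before it can even ask "are we done?".
async function readAll(source) {
  const sizes = [];
  const scratch = new DataView(new ArrayBuffer(8));
  for (;;) {
    const { value, done } = await source.next(scratch);
    if (done) return sizes;
    sizes.push(value.byteLength);
  }
}

const sizesPromise = readAll(makeSource([[1, 2, 3], [4]]));
sizesPromise.then((sizes) => console.log(sizes.join(","))); // 3,1
```

With an `IteratorResult<Promise<DataView>>` shape, `next()` would have to report `done` synchronously, before the read that discovers the end of the data has finished.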

# Bergi (10 years ago)

Kevin Smith wrote:
>> Wouldn't `.next()` rather need to return an IteratorResult<Promise<DataView>>?
>
>
> No, because the "done-ness" of the iteration is itself asynchronous.  You
> don't know whether you're done iterating until the promise resolves.
>
> https://github.com/zenparsing/async-iteration/

Oh, right, I should've read your first post properly - you were talking 
about future *asynchronous* generators.

  Bergi