Suggestion: Add Standard IO Streams

# Chet Michals (6 years ago)

Working across a number of different ECMAScript environments over the years, such as Rhino and Nashorn in Java, Node.js, and various web browsers, one thing I have noticed is that there is no standard input/output/error stream like most other languages provide. Each ecosystem tends to define its own host objects to deal with these: the console object in most web browsers (which is at least defined in a WHATWG Living Standard), the process object in Node.js, and the print function in Nashorn.

I feel that, for long-term portability, the three standard IO streams should be added to the spec in some way.

Is there a reason I am not seeing as to why this wouldn't be desired?

# Isiah Meadows (6 years ago)

One good example of where you don't want heavy streams is memory-constrained devices. Also, standard IO streams are inherently very OS-dependent (though hardware-independent), and they aren't very broadly useful across host environments.

I'd like this to be solved cohesively out of spec first, given the half dozen or so existing formalisms.

(Not a TC39 member, so take this with a grain of salt.)

# Michael J. Ryan (6 years ago)

Why not create an npm module that represents what you'd like to see as an interface wrapping node's implementation and propose your new interface?

This way you can try building something with it. You'll first need to work out how you envision a synchronous stream in the first place.

Streams can be a complex beast, though. Do you want a synchronous or asynchronous implementation, or both? Will they be consumed like generators, or via async iteration (for await ... of)? What about queueing and back pressure?
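As a starting point, here is a minimal sketch of what such a wrapper module around Node's process streams might look like. The `StdStreams` interface and all function names are hypothetical, and the blocking read is just one way to do it on Node (reading file descriptor 0 with `fs.readSync`):

```ts
// stdio.ts - hypothetical wrapper module around Node's process streams
import { readSync } from "fs";

export interface StdStreams {
  write(text: string): void;       // standard output
  writeError(text: string): void;  // standard error
  readLine(): string;              // blocking read of one line from standard input
}

export const stdio: StdStreams = {
  write: (text) => { process.stdout.write(text); },
  writeError: (text) => { process.stderr.write(text); },
  readLine: () => {
    // Synchronous read: pull bytes from file descriptor 0 until a newline.
    const buf = Buffer.alloc(1);
    let line = "";
    while (readSync(0, buf, 0, 1, null) > 0 && buf[0] !== 0x0a) {
      line += String.fromCharCode(buf[0]);
    }
    return line;
  },
};
```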

# chetmichals at gmail.com (6 years ago)

Yeah, thinking of it a bit more carefully, we would need to define a Standard Stream Interface in order to have the three Standard IO Streams. I can start putting together some research and a sample interface to present my ideas in more depth, but it'll take some time to flesh everything out. In the meantime we can still start hashing out ideas and making sure concerns are addressed.

I believe we should define both a synchronous and an asynchronous API from day one. The "normal" reads would block and return the value; whether the "normal" writes block would be host-environment specific; and the asynchronous versions of both would return promises. So if there is a read/write, there would also be a readAsync/writeAsync. Another option, instead of doubling up all the functions on a single Stream object, is to define both a Stream and an Async Stream object and give each a method to get the other kind: a normal synchronous Stream could have an Async() method that returns the Async Stream version, and the Async version could have a Sync() method that returns the normal Stream version.
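A rough TypeScript sketch of the two shapes described above; every name here is a hypothetical illustration of the surface area, not a concrete proposal:

```ts
// Option 1: one object carrying both blocking and promise-returning methods.
interface Stream {
  read(length?: number): string;                // blocking, returns the value
  write(data: string): void;                    // whether this blocks is host-defined
  readAsync(length?: number): Promise<string>;
  writeAsync(data: string): Promise<void>;
}

// Option 2: two separate objects that can produce each other.
interface SyncStream {
  read(length?: number): string;
  write(data: string): void;
  Async(): AsyncStream;   // get the asynchronous view of the same stream
}
interface AsyncStream {
  read(length?: number): Promise<string>;
  write(data: string): Promise<void>;
  Sync(): SyncStream;     // get the synchronous view of the same stream
}
```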

The easy out for queueing and back pressure is to treat them as implementation details of the host environment. Ideally a user would be able to use both the synchronous and asynchronous functions on a Stream, even if the host environment only actually supports one kind. But I don't think it's our place to solve for a user trying to write 100 GB of data asynchronously or read 100 GB of data synchronously. What we can do is define a standard exception that can be thrown if the buffers overflow, let the host environment define how large its buffers are, and maybe give it the option of throwing out data if desired (for some hypothetical host environment in the future, the standard error stream's buffer overflowing and losing data might be acceptable behavior that does not need to raise an exception).
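A rough sketch of how such a buffer-overflow exception could look from the caller's side. The `BufferOverflowError` name, the `writeAsync` method, and the retry strategy are all assumptions for illustration (and `setTimeout` is a host facility, not part of the language):

```ts
// Hypothetical standard exception a host could raise when a stream buffer is exhausted.
class BufferOverflowError extends Error {
  constructor(public readonly bytesDropped: number) {
    super(`stream buffer overflow: ${bytesDropped} bytes could not be queued`);
    this.name = "BufferOverflowError";
  }
}

// A caller could choose to back off and retry instead of crashing.
// `stream` stands in for whatever host-provided object exposes writeAsync.
async function writeWithRetry(
  stream: { writeAsync(chunk: Uint8Array): Promise<void> },
  chunk: Uint8Array
): Promise<void> {
  try {
    await stream.writeAsync(chunk);
  } catch (err) {
    if (err instanceof BufferOverflowError) {
      // Give the host a moment to drain its buffer, then try once more.
      await new Promise((resolve) => setTimeout(resolve, 10));
      await stream.writeAsync(chunk);
    } else {
      throw err;
    }
  }
}
```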

One thing to note: just because the language defines standard Streams does not mean we necessarily have to define how a user can create their own Streams at this time. The day-one version could make Streams something set up only by the host environment. Host environments that don't want to support a standard input could expose an object that just throws an exception if a user tries to read any data, and those that don't want to support a standard output/error could have the functions stubbed out.
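A tiny sketch of that stubbing behavior, assuming hypothetical host-provided `stdin`/`stdout` objects:

```ts
// A host with no usable standard input can still expose the object and fail loudly,
// while a host that does not surface standard output can stub the functions out.
const stdin = {
  read(): string {
    throw new Error("standard input is not supported in this host environment");
  },
};

const stdout = {
  write(_data: string): void {
    // Writes are accepted but silently discarded by this host.
  },
};
```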

So, some other things that come to mind: should these Streams only support text data, or should they be usable for binary data too? And should there be a single base Stream class, or should the Input Stream and Output Stream be different things? (If they share a single base class, a function could be added to let code check the Stream type.)

I think for the first passes, generators and async iteration don't need to be considered; they can be taken into account once this starts taking shape.

# Isiah Meadows (6 years ago)

I'll make one big recommendation: before you try to formulate a solution yourself, please do check this out: kriskowal/gtor

You really have to understand what the status quo is (I presume you do) and why the status quo is the way it is (this is much less common) before you can realistically come up with a decent primitive.


To hit on the other points:

  1. Although it might not look it, the proposed Observable is synchronous in nature. It's not the most obvious or intuitive design, but there are reasons why it needs not only sync send, but also sync error and complete. So you can make do with a purely synchronous stream - users can always defer work via Promise.resolve().then(() => { ... }) (see the sketches after this list).

  2. If you wanted to introduce a backpressure mechanism, you'd have to create a means of also automatically batching data, or it would be worthless on its own (you could just use an array and push/shift from it instead). Node has this built into its streams (and it emits asynchronously), but RxJS expects the consumer to do its own backpressure management and does not provide a means of pushing multiple values in a single batch. This is hard to do right, but I think it could be done as an Observable subclass that operates on binary batched data, sending all previously written bytes after the next microtask (or on a method call). A native implementation here would be far better than anything user-level, since it can avoid allocation and complex memory access patterns that a user-level implementation can't really avoid.

  3. Streams come in one of two forms: lazy (which async iterators can already handle) and eager (which is well-suited to observables). Traditional stream APIs conflate the two, but async iterators and observables separate them. Note that separating them makes things simpler and more efficient: with async iterators you don't need to allocate a buffer except when adapting observables, and with observables you don't need to wait for a request before scheduling a read. (With both conflated, you need both a buffer and a boolean to know whether to read eagerly or not.)

  4. I would consider operating on binary data a higher priority than operating on string data. About the only thing needed for string data is a means of translating a string to and from binary encodings (something like ArrayBuffer.from('some string', encoding?) + buf.decode(encoding?, start = 0, end = buf.byteLength), plus streaming equivalents of both, requiring support for ASCII, UTF-8, UTF-16BE, and UTF-16LE, but offering hooks for other encodings like Base64). It's possible to do this from userland, but engines don't need to go through nearly the same overhead to convert them, especially for UTF-8, UTF-16 (JS's native encoding), and ASCII (which some engines optimize for).
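To make a few of these points more concrete, here are some rough sketches. All helper names below are hypothetical illustrations, not anything defined by the Observable proposal or any host API.

Point 1, the deferral trick against a purely synchronous push source:

```ts
// Minimal stand-in for a synchronous, push-based source (not the real proposal API).
function fromValues<T>(values: T[]) {
  return {
    subscribe(next: (value: T) => void, complete: () => void) {
      for (const value of values) next(value); // synchronous send
      complete();                              // synchronous complete
    },
  };
}

fromValues([1, 2, 3]).subscribe(
  (value) => {
    // Defer the heavy work so the synchronous delivery loop is never blocked.
    Promise.resolve().then(() => console.log("processed", value * value));
  },
  () => console.log("done"),
);
```

Point 2, a userland approximation of batching written bytes and flushing them as one chunk on the next microtask (a native implementation could avoid the copying done here):

```ts
// Collect written bytes and deliver them as a single Uint8Array batch later.
class ByteBatcher {
  private pending: number[] = [];
  private flushScheduled = false;

  constructor(private readonly onBatch: (batch: Uint8Array) => void) {}

  write(bytes: Uint8Array): void {
    this.pending.push(...bytes);
    if (!this.flushScheduled) {
      this.flushScheduled = true;
      // Deliver everything written so far as one batch on the next microtask.
      Promise.resolve().then(() => this.flush());
    }
  }

  flush(): void {
    this.flushScheduled = false;
    if (this.pending.length === 0) return;
    const batch = Uint8Array.from(this.pending);
    this.pending.length = 0;
    this.onBatch(batch);
  }
}

const out = new ByteBatcher((batch) => console.log("batch of", batch.length, "bytes"));
out.write(Uint8Array.of(1, 2));
out.write(Uint8Array.of(3, 4, 5)); // both writes arrive as one 5-byte batch
```

Point 3, the lazy/pull versus eager/push split:

```ts
// Hypothetical stand-in for whatever actually produces data.
const readChunk = (i: number): Promise<string> => Promise.resolve(`chunk ${i}`);

// Lazy / pull: nothing is read until the consumer asks for the next value,
// so no buffer is needed between producer and consumer.
async function* lazyChunks(count: number) {
  for (let i = 0; i < count; i++) {
    yield await readChunk(i);
  }
}

// Eager / push: reads are scheduled up front and delivered as they complete,
// without waiting for the consumer to request them.
function eagerChunks(count: number, next: (chunk: string) => void): void {
  for (let i = 0; i < count; i++) {
    readChunk(i).then(next);
  }
}

(async () => {
  for await (const chunk of lazyChunks(3)) console.log("pulled", chunk); // lazy
  eagerChunks(3, (chunk) => console.log("pushed", chunk));               // eager
})();
```

Point 4, what string/binary translation looks like in userland today via the Encoding API (in hosts that provide it), compared with the hypothetical ArrayBuffer-level API sketched above:

```ts
// TextEncoder only encodes UTF-8; TextDecoder can decode several labelled
// encodings such as "utf-8", "utf-16le", and "utf-16be".
const bytes: Uint8Array = new TextEncoder().encode("some string"); // string -> UTF-8 bytes
const text: string = new TextDecoder("utf-8").decode(bytes);       // bytes -> string

// The hypothetical language-level API from the point above would live on ArrayBuffer:
//   const buf = ArrayBuffer.from("some string", "utf-8");
//   const str = buf.decode("utf-8", 0, buf.byteLength);
```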

