Standardize ES Worker

# Park Hyeonu (8 years ago)

Now we're about to standardize thread-shared memory for ECMAScript (tc39/ecmascript_sharedmem). But how about the thread itself?

For this topic, I don't think we should develop another thread spec for ES, as we already have a nice one: WebWorker. It allows multithreading in web clients and is well adopted in most modern browsers. So all we need to do is take care of some web-specific points of the API. Fortunately it's already new Worker(), not new WebWorker(), so not many points need to be fixed.

There's already a precedent for importing a web spec into the ES spec: Typed Arrays. They were originally standardized by the Khronos Group as part of the WebGL spec, and adopted into ES later.

The specs I'm referring to here are the HTML5 APIs, including WebWorker, MessageChannel, the Structured Clone Algorithm, and Transferable Objects.

Sounds nice, doesn't it?

# Isiah Meadows (8 years ago)

Maybe with modification. I currently feel workers are a bit heavy (with their event driven nature), and they most definitely don't follow the idioms of the modern JavaScript language (very promise heavy).

# Park Hyeonu (8 years ago)

I agree that we need some modifications to the WebWorker spec. What I meant in the first post is that we can (likely) make the standard Worker spec (mostly) a subset of the WebWorker spec, so that current code using WebWorker need not be changed.

Anyway, can you explain it more about mismatch between WebWorker and modern js idioms?

# Park Hyeonu (8 years ago)

I originally thought such functionality could be implemented on top of the current event-based model, but the request-response pattern is so commonly used for workers that it's worth considering including it in the spec.

But the current event namespace already has its own usage, so how about something like this?

// main.js
const worker = new Worker('worker.js')

const io = worker.stdio

io.send('someEvent', someData)
...
await io.once('ready') // yes, promise

const res1 = await io.request('someMethod', arg1, arg2) // resolved with returned data

const res2 = await io.request('nonExistingMethod', weirdData) // rejected promise, throw

// worker.js
const io = self.stdio

io.on('someEvent', data => { ... })

io.method('someMethod', (arg1, arg2) => { ... })

io.send('ready')

Worker#stdio will be an instance of the IOChannel interface. It may also be used as shown below.

// event handling within for-await
for await (let ev of io.listen('someEvent')) {
  ...
}

// more than stdio
const [chan, remoteChan] = Worker.createChannel(optionalName)
worker.stdio.send('newChannel', remoteChan)
...
await chan.once('ready')
chan.request('greeting')
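
As a rough illustration of how the promise-returning once() and the for-await listen() above could sit on top of an ordinary event source, here is a minimal sketch. The class name IOChannelSketch and its emit() method are made up for this example and are not part of any spec; emit() only stands in for a message arriving from the other side.

```javascript
// Hypothetical sketch, not a spec: promise and async-iterator views of events.
class IOChannelSketch {
  constructor() {
    this.target = new EventTarget();
  }

  // Stand-in for a message arriving from the other side of the channel.
  emit(name, data) {
    const ev = new Event(name);
    ev.data = data;
    this.target.dispatchEvent(ev);
  }

  // Resolves with the data of the next `name` event.
  once(name) {
    return new Promise(resolve => {
      this.target.addEventListener(name, ev => resolve(ev.data), { once: true });
    });
  }

  // Async iterator over `name` events. Events are buffered while the consumer
  // is busy; events emitted before the first next() call are not seen.
  async *listen(name) {
    const queue = [];
    let wake = null;
    const handler = ev => {
      queue.push(ev.data);
      if (wake) { wake(); wake = null; }
    };
    this.target.addEventListener(name, handler);
    try {
      while (true) {
        if (queue.length === 0) await new Promise(r => { wake = r; });
        yield queue.shift();
      }
    } finally {
      // Runs when the consumer breaks out of the loop or calls return().
      this.target.removeEventListener(name, handler);
    }
  }
}
```

Breaking out of a for-await loop over listen() triggers the finally block, so the listener is removed when the consumer stops iterating.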

P.S. please CC es-discuss at mozilla.org to let esdiscuss.org track this thread properly


# Michał Wadas (8 years ago)

Resending, because I forgot to include the es-discuss address.

Anyway, can you explain it more about mismatch between WebWorker and modern js idioms

Currently we put emphasis on request-response model - we request something from function (returning Promise/taking callback) and we wait for single return value. Workers are different beasts - they can emit messages on their own and don't have to emit ANY messages on completion.

I think an API addressing this issue would look like:

// main.js
const worker = new Worker('worker.js');
worker.request('ajaxCall', {url: 'example.com'}); // return Promise resolving to Response object
worker.request('undefinedMethod'); // return rejected Promise

// worker.js

self.on('ajaxCall', (data)=>{ return Promise.resolve(new Response); });

# Boris Zbarsky (8 years ago)

On 10/27/16 9:48 AM, Michał Wadas wrote:

Currently we put emphasis on request-response model - we request something from function (returning Promise/taking callback) and we wait for single return value. Workers are different beasts - they can emit messages on their own and don't have to emit ANY messages on completion.

Right. The point of workers in the web platform is to do computation in a separate context. The computation need not be communicated back to the spawning page, because workers can do their own I/O.

// main.js
const worker = new Worker('worker.js');
worker.request('ajaxCall', {url: 'example.com'}); // return Promise resolving to Response object
worker.request('undefinedMethod'); // return rejected Promise

// worker.js

self.on('ajaxCall', (data)=>{ return Promise.resolve(new Response); });

Workers in the web platform have a shared-nothing model. The above example seems to assume that the Response and the Promise are either shared across the main script and the worker or auto-cloned at the boundary, right?

# Frankie Bagnardi (8 years ago)

It doesn't really need to clone anything, you just create a promise on each side, on top of the normal events. The way you'd implement this currently is a token that gets passed to the worker with the request payload, and gets sent back with the response payload.

It'd just be nice to save some work for a common use case, and require less cooperation between the worker and main code.
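
A minimal sketch of the token approach described above, under some loud assumptions: createLinkedPair(), makeRequester() and serveMethods() are hypothetical helpers invented for this example, and the linked pair is an in-memory stand-in for a real worker connection so that the snippet runs on its own. It relies on the global structuredClone (current browsers, Node 17+).

```javascript
// In-memory stand-in for the two ends of a worker connection (assumption).
// Messages are cloned and delivered asynchronously, roughly like postMessage.
function createLinkedPair() {
  const a = { onmessage: null };
  const b = { onmessage: null };
  const link = (from, to) => {
    from.postMessage = msg => {
      const copy = structuredClone(msg);
      queueMicrotask(() => to.onmessage && to.onmessage({ data: copy }));
    };
  };
  link(a, b);
  link(b, a);
  return [a, b];
}

// Caller side: each request carries a token; the reply with the same token
// settles the matching promise.
function makeRequester(port) {
  let nextToken = 0;
  const pending = new Map(); // token -> { resolve, reject }
  port.onmessage = ({ data }) => {
    const entry = pending.get(data.token);
    if (!entry) return; // not a reply we are waiting for
    pending.delete(data.token);
    if (data.ok) entry.resolve(data.value);
    else entry.reject(new Error(data.error));
  };
  return (method, ...args) =>
    new Promise((resolve, reject) => {
      const token = nextToken++;
      pending.set(token, { resolve, reject });
      port.postMessage({ token, method, args });
    });
}

// Worker side: run the named method and send the token back with the result.
function serveMethods(port, methods) {
  port.onmessage = async ({ data: { token, method, args } }) => {
    try {
      if (!Object.hasOwn(methods, method)) throw new Error(`unknown method: ${method}`);
      const value = await methods[method](...args);
      port.postMessage({ token, ok: true, value });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      port.postMessage({ token, ok: false, error });
    }
  };
}
```

Only the plain reply payload crosses the boundary here; the promises themselves stay on their own side, and the payload still has to survive cloning.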

# Michał Wadas (8 years ago)

My example is polyfillable and includes cloning the final result of the promise. Actually, it's RPC built on top of the current worker messaging capabilities. It's such a prevalent use case that I'm convinced it should be built into the language.

# Boris Zbarsky (8 years ago)

On 10/27/16 10:02 AM, Frankie Bagnardi wrote:

It doesn't really need to clone anything, you just create a promise on each side, on top of the normal events.

And how is the resolution value transferred over? This is the cloning step!

# Michał Wadas (8 years ago)

It's already handled by current specification as structured clone algorithm. Though having binary serialisation format would be cool.

# Boris Zbarsky (8 years ago)

On 10/27/16 1:47 PM, Michał Wadas wrote:

It's already handled by current specification as structured clone algorithm.

Sort of. Not all things can be structured cloned. A fetch Response, for example, can't be, and that's what the example being discussed was using...
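
For readers who want to check which values survive the boundary, a small probe is sketched below. It assumes an environment that exposes the global structuredClone (current browsers, Node 17+); isStructuredCloneable is a made-up helper name. Plain data, Date, Map and Set clone fine, while functions and Promises are rejected with a DataCloneError.

```javascript
// Hypothetical helper: true if `value` can pass through structured clone.
function isStructuredCloneable(value) {
  try {
    structuredClone(value);
    return true;
  } catch (err) {
    // Browsers and Node throw a DataCloneError for unsupported values.
    return false;
  }
}

const samples = {
  plainData: { when: new Date(0), tags: new Set(['a']) },
  aFunction: () => {},
  aPromise: Promise.resolve(1),
};

for (const [name, value] of Object.entries(samples)) {
  console.log(name, isStructuredCloneable(value));
}
```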

# Michał Wadas (8 years ago)

OK, so if a fetch Response can't be cloned, that was my bad. Sorry for the confusion.

# Isiah Meadows (8 years ago)

Here's my idea for a new API, leveraging ES modules and the (stage 2) dynamic import() proposal. It also supports the shared memory proposal.

  1. Add a new import.fork(script): Promise<Worker> method-like expression that loads the worker and resolves when done/rejects if it couldn't for some reason.
  • receive is a function that accepts a message and may optionally return a value or thenable to it, which is cloned and returned to the worker's send call.
  2. Add the following methods/properties:
  • worker.terminate() - Terminate the worker
  • worker.send(message, sharedBuffers?) - Send a message and return a Promise to its result.
  • worker.receive(message) - A setter for a function that accepts a message and may optionally return a value or thenable to it, which is cloned and returned to the worker's send call.
  3. Load the worker as a module. The following exports are used specially, and they're both optional:
  • initialize(parent) - A function that accepts a parent with two methods: parent.terminate() to terminate the worker's own thread and parent.send being equivalent to worker.send above (except in the other direction). This is called immediately after the parent's wrapping promise resolves.
  • receive(message) - Receive messages from the parent, and works similarly to worker.receive.

I chose syntax for reasons similar to why Domenic chose syntax for his dynamic import proposal: for example, the modules can be statically resolved, which enables prefetching.

I also chose modules to leverage the module system to my advantage, in particular to avoid adding new globals.
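
Since import.fork() is only proposed syntax and cannot be run today, the shape of the API above can only be approximated. The sketch below is an in-process simulation under stated assumptions: forkSketch is a hypothetical function standing in for import.fork(script), the object passed to it plays the role of the worker module's exports, the sharedBuffers argument is left out, and the exact timing of initialize() is a guess at "immediately after the parent's wrapping promise resolves". It uses the global structuredClone (current browsers, Node 17+) to mimic cloning at the boundary.

```javascript
// Hypothetical in-process stand-in for the proposed `import.fork(script)`.
function forkSketch(workerModule) {
  let terminated = false;
  let parentReceive = null; // installed through the `receive` setter

  // Clone the message, hand it to `handler`, and resolve with a clone of
  // whatever the handler returns (a plain value or a thenable).
  const deliver = async (handler, message) => {
    if (terminated) throw new Error('worker terminated');
    const reply = await (handler ? handler(structuredClone(message)) : undefined);
    return structuredClone(reply);
  };

  const worker = {
    terminate() { terminated = true; },
    send: message => deliver(workerModule.receive, message),
    set receive(fn) { parentReceive = fn; },
  };
  const parent = {
    terminate() { terminated = true; },
    send: message => deliver(parentReceive, message),
  };

  // Resolve with the worker, then call the optional `initialize(parent)`.
  const ready = Promise.resolve(worker);
  ready.then(() => workerModule.initialize && workerModule.initialize(parent));
  return ready;
}

// Usage: the object literal stands in for the worker module's exports.
forkSketch({
  receive: async msg => msg.n * 2,
}).then(async worker => {
  console.log(await worker.send({ n: 21 }));
  worker.terminate();
});
```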

# Boris Zbarsky (8 years ago)

On 10/27/16 6:18 PM, Isiah Meadows wrote:

  1. Add a new import.fork(script): Promise<Worker> method-like expression that loads the worker and resolves when done/rejects if it couldn't for some reason.

What does "done" mean in this case?

  • worker.send(message, sharedBuffers?) - Send a message and return a Promise to its result.

What if the "result" of the message is a sequence of messages back? This is a common thing to do with web workers: ask it to do something, and it sends the answer back in chunks.
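
To make the chunked case concrete, here is a minimal sketch of one request answered by several messages, which a single result promise cannot carry on its own. All names (createLinkedPair, serveChunked, requestChunked) are hypothetical, the linked pair is an in-memory stand-in for a real worker connection, and only one request in flight is supported to keep it short.

```javascript
// In-memory stand-in for the two ends of a worker connection (assumption).
function createLinkedPair() {
  const a = { onmessage: null };
  const b = { onmessage: null };
  const link = (from, to) => {
    from.postMessage = msg => {
      const copy = structuredClone(msg);
      queueMicrotask(() => to.onmessage && to.onmessage({ data: copy }));
    };
  };
  link(a, b);
  link(b, a);
  return [a, b];
}

// Worker side: answer one request with several `chunk` messages, then `done`.
function serveChunked(port, produce) {
  port.onmessage = ({ data: { token, input } }) => {
    for (const chunk of produce(input)) {
      port.postMessage({ token, type: 'chunk', chunk });
    }
    port.postMessage({ token, type: 'done' });
  };
}

// Main side: `onChunk` sees every chunk; the promise settles only at `done`.
let nextToken = 0;
function requestChunked(port, input, onChunk) {
  const token = nextToken++;
  return new Promise(resolve => {
    port.onmessage = ({ data }) => {
      if (data.token !== token) return; // reply to some other request
      if (data.type === 'chunk') onChunk(data.chunk);
      else resolve();
    };
    port.postMessage({ token, input });
  });
}
```

The returned promise only signals completion; the data itself arrives through the callback, which is the part a plain send-returns-a-Promise design leaves open.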

# Park Hyeonu (8 years ago)

+1 for import.fork(), especially for deep integration with module syntax.

Anyway, I think we still need a way to create an independent, transferable message channel, for use cases like returning a sequence of messages, a central worker manager, etc.

I agree with your approach of avoiding new global variables as much as possible. To respect this, how about a Channel#createChannel method?

const [chan, remoteChan] = worker.createChannel()
worker.send('newchan', remoteChan, [remoteChan])
# Isiah Meadows (8 years ago)

Inline

On Thu, Oct 27, 2016, 21:42 Boris Zbarsky <bzbarsky at mit.edu> wrote:

On 10/27/16 6:18 PM, Isiah Meadows wrote:

  1. Add a new import.fork(script): Promise<Worker> method-like expression that loads the worker and resolves when done/rejects if it couldn't for some reason.

What does "done" mean in this case?

When the worker has finished loading, so you can send and receive messages

  • worker.send(message, sharedBuffers?) - Send a message and return a Promise to its result.

What if the "result" of the message is a sequence of messages back? This is a common thing to do with web workers: ask it to do something, and it sends the answer back in chunks.

The result is the value returned from the worker's onMessage. This does permit async return values, because it's fairly common (at least in my case) to send a message and expect a response as effectively a return value, and I would like to reify that.

# Isiah Meadows (8 years ago)

I just specified some basic semantics. That sounds like a good idea, but I feel it should just remain a WHATWG extension for now. Technically, you can still manage ids to ensure it ends up properly coordinated (you already had to do this when working at scale).

On Sun, Oct 30, 2016, 23:44 Park Hyeonu <nemo1275 at gmail.com> wrote:

+1 for import.fork(), especially for deep integration with module syntax.

Anyway, I think we still need a way to create an independent, transferable message channel, for use cases like returning a sequence of messages, a central worker manager, etc.

I agree with your approach of avoiding new global variables as much as possible. To respect this, how about a Channel#createChannel method?

const [chan, remoteChan] = worker.createChannel()
worker.send('newchan', remoteChan, [remoteChan])

# Boris Zbarsky (8 years ago)

On 10/31/16 8:42 AM, Isiah Meadows wrote:

When the worker has finished loading, so you can send and receive messages

OK, what about a worker that when it loads just starts an infinite loop and starts sending you messages (but obviously never expects any messages from you, since it's in an infinite loop)?

Or is the idea to not support this behavior? If so, I'm a little worried about specifying something with totally different behavior from DOM Workers and calling it "Worker".

What if the "result" of the message is a sequence of messages back?
This is a common thing to do with web workers: ask it to do something,
and it sends the answer back in chunks.

The result is the value returned from the worker's onMessage. This does permit async return values, because it's fairly common (at least in my case) to send a message and expect a response as effectively a return value, and I would like to reify that.

I think we're talking past each other a bit here, but please see above about naming.

# Isiah Meadows (8 years ago)

Inline.

On Mon, Oct 31, 2016, 10:33 Boris Zbarsky <bzbarsky at mit.edu> wrote:

On 10/31/16 8:42 AM, Isiah Meadows wrote:

When the worker has finished loading, so you can send and receive messages

OK, what about a worker that when it loads just starts an infinite loop and starts sending you messages (but obviously never expects any messages from you, since it's in an infinite loop)?

You still add the onMessage handler then. The promise is to encapsulate errors that occur when creating the worker (e.g. the file doesn't exist).

To clarify, the messages sent since the last tick, and returned promises resolved, are queued after the current tick ends.

Also, both exports from the worker's side are optional.

Or is the idea to not support this behavior? If so, I'm a little worried about specifying something with totally different behavior from DOM Workers and calling it "Worker".

It's supported. See above.

What if the "result" of the message is a sequence of messages back?
This is a common thing to do with web workers: ask it to do something,
and it sends the answer back in chunks.

The result is the value returned from the worker's onMessage. This does permit async return values, because it's fairly common (at least in my case) to send a message and expect a response as effectively a return value, and I would like to reify that.

I think we're talking past each other a bit here, but please see above about naming.

Yeah...I don't think we were on the same page.