Subject: Re: Harmony - proxies | asynchronous
Node has shared memory between actors; having multiple async code flows execute on shared memory while appearing synchronous by design is very hard to do. This leads to implementation-specific behavior unless we standardize edit resolution. Most people who complain about nested callbacks forget that you can abstract away event delegation to objects, and objects can manage multiple states. Also, if you are nesting many callbacks, you should probably split the functionality into logical pieces; our code base of 20k+ lines of Node rarely if ever sees more than 3 nested functions (including the original invoker).
While memory resolution is possible and could be standardized, it can prove difficult when working with native objects whose memory state may be cached in C/C++/etc. Web Workers provide a nice share-nothing abstraction that is probably more suitable than overhauling the fact that closures can share the same memory. On top of this, critical expectations of synchronous behavior would require that any access which could affect the shared memory be either locked and prevented, or rolled back at a later time.
The DOM's live NodeLists are a good example of shared memory causing issues in this area; proposing that all proxies be allowed to do this would require the same care that every use of a live NodeList requires. Even worse, libraries must support this, similar to how a few libraries break when invoked from strict-mode code -- except that instead of one frowned-upon feature, every feature a proxy has would have to be guarded for this.
Monads through callbacks, rather than hiding async behavior behind proxies, are a better idea not just because of the complex compilation changes needed to support the proxy approach, but also because a user of a callback API expects the operation to have an unknown return time. Explicit behavior will always be more appropriate than "magic": async behavior appearing synchronous with shared memory is what leads to hard-to-debug race conditions. Semaphores would become needed to lock resources that merely may be affected by something that may be occurring but is not guaranteed to be. Having the compiler do this would lead to nightmares. Not sharing memory between concurrent flows of control, and not splitting flows of control that may be preempted, is much more sane.
If you wish to discover more about this I would suggest looking at the solutions in shared-memory systems for C++, Node, and Java. C++ and Java will be using semaphores to cage potential race conditions, while Node generally can get away with counters (we have seen cases where, when we have to use workers, we rely on semaphores, so I would suggest looking at process race conditions in these as well).
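The counter technique mentioned above can be sketched in plain Node-style callbacks. This is an illustrative helper (joinAll is not a real Node API): a counter held in a closure joins N concurrent tasks, which is safe without a semaphore only because Node never preempts a running callback.

```javascript
// Illustrative sketch of the "counter" pattern -- not a real Node API.
// A counter in a closure joins N concurrent callback-taking tasks.
function joinAll(tasks, done) {
  var pending = tasks.length;
  var results = new Array(pending);
  var failed = false;
  tasks.forEach(function (task, i) {
    task(function (err, value) {
      if (failed) return;                       // a task already failed
      if (err) { failed = true; return done(err); }
      results[i] = value;                       // slot by task order
      if (--pending === 0) done(null, results); // last one in finishes
    });
  });
}
```

joinAll invokes done exactly once, with results in task order regardless of completion order.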
On Sep 2, 2011, at 2:26 PM, John J Barton wrote:
I'm pretty puzzled by this discussion and I'm guessing other folks might be puzzled as well. Since I understood node fibers as "thread for Node", the discussion I read is:
/be: You can have threads!
I did not say that. Remember, threads suck.
Generators are not threads. Generators are shallow one-shot continuations. They do not even imply an event loop turn per yield. But they are co-expressive with, and more primitive than, deferred/async functions (now that we have PEP-380 style return e; from generator body).
With generators one can build up a variety of async libraries that do not require manual CPS conversion via the dreaded function-expression nest.
Mikeal: We don't want threads!
Again the issue is not threads. It is what idiom or keyword signals the reader that a voluntary suspension point has been reached and later or deeper-nested code runs in a different turn (or may run in a different turn).
Node.JS is a single-threaded server and always will be. No threads, not even cooperatively scheduled.
If I'm on the right track, then I should understand how this relates to proxies. But I don't. Any hints?
Proxies are handy for making promises have an API that looks like plain old object API. obj.foo instead of obj.get("foo"), etc.
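A minimal sketch of that idea using today's Proxy API (wrapGetter and the Map-backed target are illustrative, not any promise library's actual interface). The proxy only changes the property-access surface; nothing in the trap is asynchronous.

```javascript
// Forward plain property reads (p.foo) to a target's get("foo") method.
// Only the call surface changes; the trap runs synchronously.
function wrapGetter(target) {
  return new Proxy({}, {
    get: function (_, name) {
      return target.get(String(name));
    }
  });
}
```

For example, wrapping a Map lets `wrapGetter(new Map([["foo", 42]])).foo` read the entry as if it were a plain property.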
On Fri, Sep 2, 2011 at 3:48 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 2, 2011, at 2:26 PM, John J Barton wrote:
I'm pretty puzzled by this discussion and I'm guessing other folks might be puzzled as well. Since I understood node fibers as "thread for Node", the discussion I read is:
/be: You can have threads!
I did not say that. Remember, threads suck (brendaneich.com/2007/02/threads-suck).
That's good news, thanks for clearing that much up.
Generators are not threads. Generators are shallow one-shot continuations. They do not even imply an event loop turn per yield. But they are co-expressive with, and more primitive than, deferred/async functions (now that we have PEP-380 style return e; from generator body).
With generators one can build up a variety of async libraries that do not require manual CPS conversion via the dreaded function-expression nest.
Ok I hope someone creates more tutorial information about generators. I read about them and played around with some examples, but I did not come away thinking positive.
Mikeal: We don't want threads!
Again the issue is not threads. It is what idiom or keyword signals the reader that a voluntary suspension point has been reached and later or deeper-nested code runs in a different turn (or may run in a different turn).
Node.JS is a single-threaded server and always will be. No threads, not even cooperatively scheduled.
If I'm on the right track, then I should understand how this relates to proxies. But I don't. Any hints?
Proxies are handy for making promises have an API that looks like plain old object API. obj.foo instead of obj.get("foo"), etc.
Thanks, that's much clearer. So proxies are a generalization of getters.
As for the application, reasoning about code that looks like hash-table lookup but acts very different may not be better than reasoning about code that looks like spaghetti but acts like you expect.
jjb
On Sep 2, 2011, at 4:15 PM, John J Barton wrote:
Ok I hope someone creates more tutorial information about generators. I read about them and played around with some examples, but I did not come away thinking positive.
You might start with Dave Herman's async library, task.js:
Thanks, that's much clearer. So proxies are a generalization of getters.
See also brendaneich.com/2010/11/proxy-inception (thanks to Tom and Mark for much of the pretty slideware).
As for the application, reasoning about code that looks like hash-table lookup but acts very different may not be better than reasoning about code that looks like spaghetti but acts like you expect.
The two are not easy to trade off. Spaghetti is the bigger problem, IMHO, since you can make proxies behave well (unlike host objects, e.g. DOM nodelists, which can be implemented with proxies [with a few edge-case extensions] but which act like "live cursors").
API design taste is still required to avoid making gratuitously magical objects. Promises are a non-magical use-case, IMHO.
In case someone is interested I've experimented with proxy sugared promises (it was long time ago some few things might require updates):
-- Irakli Gozalishvili Web: www.jeditoolkit.com Address: 29 Rue Saint-Georges, 75009 Paris, France (goo.gl/maps/3CHu)
As for the application, reasoning about code that looks like hash-table lookup but acts very different may not be better than reasoning about code that looks like spaghetti but acts like you expect.
The two are not easy to trade off. Spaghetti is the bigger problem, IMHO, since you can make proxies behave well (unlike host objects, e.g. DOM nodelists, which can be implemented with proxies [with a few edge-case extensions] but which act like "live cursors").
API design taste is still required to avoid making gratuitously magical objects. Promises are a non-magical use-case, IMHO.
This is what is starting to happen in practice.
Node programs work with Streams (objects that emit data). Node defines some important core objects (HTTP requests and responses, files, etc) that are all Streams.
Libraries define Stream objects that take an input stream and pipe themselves to an output stream.
Those objects have a user defined state and other state that gets defined by their inputs and outputs.
There are little to no callbacks in the application code, it's all in libraries that are handling the state changes.
Example:
www.mikealrogers.com/posts/request-20.html
Yes, these are "promisy" objects in that they have different events and state changes in the future. They are different from most generic promise proposals in that they require initializing state to be set up in one call from the event loop (before nextTick() in node.js).
Spaghetti code is mostly found in poor programs and contrived examples.
Streams and pipes as flow control, that's how good node programs work. We can't move everything to generators because Streams and pipes also have semantics for handling back pressure through event propagation. Nor can we take an off-the-shelf promise and/or defer that doesn't emit events, for the same reason.
None of the proposed alternative IO handlers for node, Harmony proposals, fibers, coro, whatever, give us a simpler HTTP proxy than:
http.createServer(function (req, resp) {
  req.pipe(request(req.url)).pipe(resp)
})
This is why I get a little pissed when people who aren't writing node.js programs tell me about "spaghetti code".
On Sep 2, 2011, at 3:01 PM, Bradley Meck wrote:
Even worse, libraries must support this, similar to how a few libraries that break when a script in strict mode invokes them but unlike frowned upon features,
Strict mode is a static property of code. Strict callers cannot affect behavior of non-strict callees. So I'm not sure what you mean here. Can you give an example?
we are talking about every feature that a proxy has would have to be guarded for this.
In the browser, DOM and host object precedent means we crossed this bridge a long time ago.
IINM, Node has built-in objects that can do magical host-object-like things too. Has it therefore crossed the same bridge too? That is, couldn't such Node host objects be passed to library code that cannot assume plain-old-native-object non-magical semantics?
Monads through callbacks instead of hiding async behavior through proxies is a better idea not just because of the complex compilation
Ok, my b.s. detector is going off. You used "actors" in the first para about Node, but Node is JS and JS (or Mikeal's favorite subset used in Node built-in and approved libraries) is not an Actor language.
"Monads" here sounds impressive but JS has mutation and dynamic types, so what do you really mean?
Finally, proxies do not affect execution model at all, so whether a promise uses proxies for "catch-all" get and set interception, or forsakes proxies in favor of method calls, is not material.
changes needed to support this, but also because a user of an API can expect for something to have an unknown return time with callbacks.
Proxies do not introduce hidden preemption points with attendant data race hazards. At all. Why did you think they could?
Ok, my b.s. detector is going off. You used "actors" in the first para about Node, but Node is JS and JS (or Mikeal's favorite subset used in Node built-in and approved libraries) is not an Actor language.
My ability to dictate style in core is close to zero, if I could we wouldn't be running jslint in strict Crockie mode on every checkin.
The only person who can dictate anything in node is Ryan, I just happen to agree with his decisions and priorities on using new language features.
On Sep 2, 2011, at 5:16 PM, Mikeal Rogers wrote:
Ok, my b.s. detector is going off. You used "actors" in the first para about Node, but Node is JS and JS (or Mikeal's favorite subset used in Node built-in and approved libraries) is not an Actor language.
My ability to dictate style in core is close to zero, if I could we wouldn't be running jslint in strict Crockie mode on every checkin.
The only person who can dictate anything in node is Ryan, I just happen to agree with his decisions and priorities on using new language features.
Sorry, my parenthetical aside was naming you on account of this tweet:
twitter.com/#!/mikeal/status/109740555025661954
I kid (mostly; also Ryan is BDFL). But back to the main topic: proxies don't affect JS's execution model.
Proxies do affect whether you can reason about objects by assuming no JS code runs in a meta-level trap backstage of an on-stage base-level semantic operation such as o.p or 'p' in o. Getters and setters already crossed a bridge or two here, but proxies up the ante. Again this has nothing to do with execution model, preemption, data races.
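A tiny example of that base-level vs. meta-level point (using the modern Proxy API; the counter is contrived for illustration): reading o.p runs trap code with side effects, so even `o.p === o.p` can fail -- yet no event-loop turn is taken and nothing is preempted.

```javascript
// A get trap runs arbitrary JS "backstage" of the on-stage o.p access.
let count = 0;
const o = new Proxy({}, {
  get(_, name) {
    if (name === "p") return ++count; // side effect at the meta level
    return undefined;
  }
});
// o.p evaluates to 1, then 2: o.p === o.p is false, all in one turn.
```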
On 03/09/2011, at 02:12, Brendan Eich wrote:
On Sep 2, 2011, at 3:01 PM, Bradley Meck wrote:
Even worse, libraries must support this, similar to how a few libraries that break when a script in strict mode invokes them but unlike frowned upon features,
Strict mode is a static property of code. Strict callers cannot affect behavior of non-strict callees. So I'm not sure what you mean here. Can you give an example?
we are talking about every feature that a proxy has would have to be guarded for this.
In the browser, DOM and host object precedent means we crossed this bridge a long time ago.
IINM, Node has built-in objects that can do magical host-object-like things too. Has it therefore crossed the same bridge too? That is, couldn't such Node host objects be passed to library code that cannot assume plain-old-native-object non-magical semantics?
Yes, buffers are live too. buffer[i] === buffer[i] may be false sometimes.
On Sep 6, 2011, at 12:12 AM, Dmitry Soshnikov wrote:
On 03.09.2011 3:39, Brendan Eich wrote:
On Sep 2, 2011, at 4:15 PM, John J Barton wrote:
Ok I hope someone creates more tutorial information about generators. I read about them and played around with some examples, but I did not come away thinking positive.
You might start with Dave Herman's async library, task.js:
The code of a complete library is always too complicated IMO to "begin with".
I was not recommending reading the library's implementation! Rather, its API (which requires use of yield -- without generators, such an API is impossible in JS without a compiler).
I tend to use vocabulary more loosely than those who associate a model or structure with a programming language, so I am sorry for any confusion.
Actors : en.wikipedia.org/wiki/Actor_model
From my standpoint, in the idea of event emitters using callbacks, the callbacks are being treated as actors. On a more casual level, actors could relate to deferreds / promises / monads / etc., as you wish to call them, if you insist on the event's reactor being the object that has the callbacks associated with it. Perhaps message-based programming would have been a better term, but I needed to convey a sense of shared state: in this case, the closed-over variables that a function can concurrently mutate (not in direct parallel, due to the single-threaded nature).
Monads : en.wikipedia.org/wiki/Monad_(functional_programming)
From this article (the simplest example is the state monad) we can see the idea of adding dynamic information to a set of calculations. How is this drastically different from a closed-over variable and a callback (dynamic typing aside)? Perhaps I am confused about this, but it was my understanding that monads need not be their own object apart from the invoker, and that they represent a calculation tied to state information. At least that is how I have used them in the past.
Strict Mode Error: Reduced example - jsfiddle.net/zrX5B. The need for this is exceedingly rare, but as an example: if you want to support streamline.js and you have a C++ module that uses the event loop, you may need to detect random junk that gets treated as async or blocks, and if it gets blocked you may need to act accordingly.
Node / DOM Prior Art: Just because it has been done before does not mean it is encouraged, loved, or even sane. Almost every time someone saves a NodeList in the DOM for the first time, they end up searching for WTF is going on. The same is true about caching Node buffers related to streaming reads from the file system.
Preemption / Asynchronous point: This relates to the idea of proxy asynchronous behavior not blocking code execution. By that I mean that they lack coroutine-like cooperation (via a yield [JS yield behaves a bit differently than the definition used here] / callback / etc.) to determine when it is safe to continue the current flow of control for our program. I will refer to blocking asynchronous function proxies as proxies that involve cooperative measures. An example would be best:
Assume a Proxy of type ASYNC will do a remote HTTP request whenever a property is accessed / set (let's assume standard REST CRUD). This could be used for some nice database persistence without much effort.
Given an example of a service helper function:
function prepare(service, data) {
  service.data = data;
}

prepare(service1, 'hello world!');
console.log('data loaded!');
If the service is not ASYNC, we know that execution will block as normal and order will be preserved. If the service is ASYNC and asynchronous proxies block execution (until cooperation), we also know order will be preserved, and the event loop will be blocked meanwhile. If the service is ASYNC and async proxies do not block execution, we have a problem:
- we do not know exactly when prepare will be done (in case we need cleanup on a failure; particularly brutal if we want to fail after a certain timeout but the operation actually ends up succeeding)
- the console.log may fire before the data is loaded
- even worse if we are setting variables to determine state
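To make the hazard concrete, here is a hypothetical simulation (FakeAsyncService is invented for illustration; it fakes a non-blocking ASYNC proxy by completing writes on a later event-loop turn rather than doing a real HTTP request):

```javascript
// A set trap that "succeeds" immediately but completes the write later,
// simulating a non-blocking asynchronous proxy.
function FakeAsyncService() {
  return new Proxy({}, {
    set: function (target, name, value) {
      setTimeout(function () { target[name] = value; }, 0);
      return true; // assignment appears to succeed right away
    }
  });
}

var service = FakeAsyncService();
service.data = 'hello world!';
console.log('data loaded!', service.data); // prints undefined: the race
```

The log line runs before the write lands, exactly the ordering problem described above.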
Ok, so let's step back and take a closer look at blocking execution for asynchronous function proxies.
What does this get us:
- Be able to treat an asynchronous function (the handler of the proxy) as if it were synchronous. Very good.
- Blocks the event loop at potentially unforeseen times. Very bad. You cannot allow this in some time critical paths (protocols that require time assurances or abort callbacks on timeout instead of waiting 180s for TCP timeout, etc).
Finally I get to the point of discussion: everyone wants everything to look pretty, even if it causes these problems and means everyone must follow the convention.
function DBEventEmitter(adaptor) {
  var self = this;
  adaptor.connectionString = 'x';
  // have to do this on nextTick, adaptor may be synchronous (in memory)
  process.nextTick(function() {
    adaptor.init();
  });
}
Now when we move this to a land with Async Proxies, I have no doubt we will see the following:
function DBEventEmitter(adaptor) {
  adaptor.connectionString = 'x';
}
Looks nicer, but if we need to register events onto the DBEventEmitter, it must be done ... prior to the constructor? Ok, well, we will just do a nextTick to allow that, even though we are guaranteed init will have succeeded by the time the function returns. We could get rid of event registration altogether, but that would mean wrapping all async adaptors in async proxies, which is a lot more trouble than just using a simple function invocation and callback. Even with helpers and enumeration of properties, this could become difficult to automate without adding a separate config definition declaring which properties should be considered async in order to provide cooperative concurrency (this actually reminds me of WSDL, where you end up making a type for promises).
Sorry if I sound aggressive, but I have strict opinions about explicit behavior rather than potentially surprising behavior for the sake of code aesthetics. If a different invocation syntax were required, so that one could tell that something may take time due to being async, I would have little issue. I merely do not see much gain in combining two drastically different execution types into a single syntax.
On Fri, Sep 2, 2011 at 4:39 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 2, 2011, at 4:15 PM, John J Barton wrote:
Ok I hope someone creates more tutorial information about generators. I
read about them and played around with some examples, but I did not come away thinking positive.
You might start with Dave Herman's async library, task.js:
The example on that page illustrates my concerns:
spawn(function() {
  try {
    var [foo, bar] = yield join(read("foo.json"), read("bar.json")).timeout(1000);
    render(foo);
    render(bar);
  } catch (e) {
    console.log("read failed: " + e);
  }
});
As far as I understand it, the function passed as an argument to spawn() will be called by spawn(), but its body will not execute and spawn() will not receive its return value. Instead, spawn() will get an object with a next() function.
When spawn() calls the next() function, the function we read above executes up to the 'yield' keyword; the expression which follows 'yield' is evaluated and returned as the value of the next() call.
When spawn() calls the next() function a second time, the function we read above executes differently. We don't start at the beginning, but right after the yield expression.
As a reader I have to parse the function carefully to look for the 'yield'. If I find one, then I know that this is not actually a function after all. Then I need to mentally draw a line after the yield and label it "entry-point #2".
Next I have to read through the source of spawn() and track the lifetime-of and references-to the object it received, seeking calls to send(). When I find them, the argument will be the result of the yield operation above. Hopefully the next() call will be textually near to the send() call so I can work out which send() matches which yield.
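That handshake can be sketched with a toy driver (run is illustrative, not taskjs internals). In modern JS, gen.next(v) subsumes the older send(v): the value passed in becomes the result of the paused yield expression.

```javascript
// Drive a generator whose yielded values are "async ops": functions
// that take a callback. Each callback result is sent back into the
// generator, resuming it right after the yield.
function run(genFn) {
  const gen = genFn();
  function step(value) {
    const r = gen.next(value); // resumes the paused yield with `value`
    if (r.done) return;
    r.value(step);             // r.value is a function taking a callback
  }
  step(undefined);
}
```

Usage: `run(function* () { const a = yield cb => cb(41); ... })` -- each `yield` pauses until the op's callback fires.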
Assuming I am understanding the idea, then my description above is also my criticism: control and data flow is jumping around in unusual ways and functions morph into multi-entry point things with new API.
On the positive side I can now see the advantage of this approach. The I/O request and action on its response are now neatly separated from the spaghetti. I can study and curse at the yield/next/send code, but once I have made that investment the source and sink of foo and bar are clearly related. The funky stuff is in infrastructure not (mostly) in application code.
jjb
On Sep 4, 2011, at 11:06 AM, John J Barton wrote:
As a reader I have to parse the function carefully to look for the 'yield'. If I find one, then I know that this is not actually a function after all. Then I need to mentally draw a line after the yield and label it "entry-point #2".
This is addressed in ES6 by changing the head of the generator function as follows:
spawn(function* () {
  try {
    var [foo, bar] = yield join(read("foo.json"), read("bar.json")).timeout(1000);
    render(foo);
    render(bar);
  } catch (e) {
    console.log("read failed: " + e);
  }
});
Next I have to read through the source of spawn() and track the lifetime-of and references-to the object it received, seeking calls to send().
No, you can use spawn as a black box. It's an API entry point to the taskjs library. You do not need to read the source to use an API.
Assuming I am understanding the idea, then my description above is also my criticism: control and data flow is jumping around in unusual ways and functions morph into multi-entry point things with new API.
There's no free lunch. Either we renounce adding new syntax by which to capture a shallow continuation, which leaves us with the nested callback function world of today, which people object to on several grounds:
- Nesting makes hard-to-read, rightward-drifting function spaghetti.
- Nesting increases the risk of capturing too much environment, potentially entraining memory for a long time -- bloat and leak bugs.
These are not unmanageable according to some JS hackers, but they are not trivial problems either. They keep coming up, in the Node community and among browser JS developers. They're real.
On the positive side I can now see the advantage of this approach. The I/O request and action on its response are now neatly separated from the spaghetti. I can study and curse at the yield/next/send code, but once I have made that investment the source and sink of foo and bar are clearly related. The funky stuff is in infrastructure not (mostly) in application code.
Right.
On 03.09.2011 1:26, John J Barton wrote:
I'm pretty puzzled by this discussion and I'm guessing other folks might be puzzled as well. Since I understood node fibers as "thread for Node", the discussion I read is:
/be: You can have threads! Mikeal: We don't want threads!
If I'm on the right track, then I should understand how this relates to proxies. But I don't. Any hints?
Don't worry :) the topic has shifted a little. Initially it was asked how to provide asynchronous property reads (a proxy's get trap, or a simple accessor get which calls some deferred action) in a synchronous view. And consuming generators (being a technique for implementing cooperative processes) allows this, since they can suspend a process and resume it from the next line, i.e. representing the asynchronous in a synchronous view.
But, that said, even more elegant than such a technique would be either a syntactic transformation at the compilation level, or the creation of an implicit task-wrapper with some sugar for yield in this case. Regarding not using yield to improve asynchronous programming -- I also didn't understand the reasons. Currently programming in Erlang (which has the same "green threads", managed by an implicit scheduler), I can say that Node.js's spaghetti code with nested callbacks just, sorry, sucks in comparison with Erlang's, which is also asynchronous but allows writing it all in a synchronous manner. Though, the topic is not about Erlang...
Dmitry.
On Sun, Sep 4, 2011 at 1:42 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 4, 2011, at 11:06 AM, John J Barton wrote:
As a reader I have to parse the function carefully to look for the 'yield'. If I find one, then I know that this is not actually a function after all. Then I need to mentally draw a line after the yield and label it "entry-point #2".
This is addressed in ES6 by changing the head of the generator function as follows:
spawn(function* () {
  try {
    var [foo, bar] = yield join(read("foo.json"), read("bar.json")).timeout(1000);
    render(foo);
    render(bar);
  } catch (e) {
    console.log("read failed: " + e);
  }
});
(For others who have some trouble seeing the difference, here the function keyword is followed by the multiplication symbol).
"generator() {}" would make a lot more sense to me; it would more clearly alert the reader at no added cost to the writer, as well as help new users of this feature separate its effect from functions.
Next I have to read through the source of spawn() and track the lifetime-of and references-to the object it received, seeking calls to send().
No, you can use spawn as a black box. It's an API entry point to the taskjs library. You do not need to read the source to use an API.
Hmm, the doc dherman.github.com/taskjs/doc/api.html says: Task spawn(generator function thunk). But as I understand it, send() plays a role in the API to spawn, correct?
Assuming I am understanding the idea, then my description above is also my criticism: control and data flow is jumping around in unusual ways and functions morph into multi-entry point things with new API.
There's no free lunch. Either we renounce adding new syntax by which to capture a shallow continuation, which leaves us with the nested callback function world of today, which people object to on several grounds:
Nesting makes hard-to-read, rightward-drifting function spaghetti.
Nesting increases the risk of capturing too much environment, potentially entraining memory for a long time -- bloat and leak bugs.
These are not unmanageable according to some JS hackers, but they are not trivial problems either. They keep coming up, in the Node community and among browser JS developers. They're real.
So it is known that no alternative exists beyond these two choices? jjb
On Sep 5, 2011, at 9:36 PM, John J Barton wrote:
On Sun, Sep 4, 2011 at 1:42 PM, Brendan Eich <brendan at mozilla.com> wrote: On Sep 4, 2011, at 11:06 AM, John J Barton wrote:
As a reader I have to parse the function carefully to look for the 'yield'. If I find one, then I know that this is not actually a function after all. Then I need to mentally draw a line after the yield and label it "entry-point #2".
This is addressed in ES6 by changing the head of the generator function as follows:
spawn(function* () {
  try {
    var [foo, bar] = yield join(read("foo.json"), read("bar.json")).timeout(1000);
    render(foo);
    render(bar);
  } catch (e) {
    console.log("read failed: " + e);
  }
});
(For others who have some trouble seeing the difference, here the function keyword is followed by the multiplication symbol).
"generator() {}" would make a lot more sense to me; it would more clearly alert the reader at no added cost to the writer, as well as help new users of this feature separate its effect from functions.
We considered that, but generator is not reserved, and reserving it in ES6 requires newline sensitivity at least. Consider an anonymous generator similar to the one you typed:
generator() {...}
where ... is actual code (and the {...} style may use multiple lines and indentation). This could be valid code in the field today, relying on ASI to insert the semicolon after "generator()".
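A small demonstration of that ASI hazard (a contrived but legal JS program): the call and the braces parse as two separate statements today, so making "generator" syntactic would change the meaning of existing programs.

```javascript
// Valid today: ASI inserts a semicolon after generator(), and the
// braces parse as an ordinary block statement, not a generator body.
function generator() { return "called"; }

const result = generator()
{
  // just a block statement -- its contents are unrelated to the call
}
// result is "called"; no syntax error occurs.
```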
What's more, the precedent in Python is not a negative or a trivial benefit. Besides reusing brain-print, we stand on field-tested design in a similar language. Python uses def for both functions and generators.
Next I have to read through the source of spawn() and track the lifetime-of and references-to the object it received, seeking calls to send().
No, you can use spawn as a black box. It's an API entry point to the taskjs library. You do not need to read the source to use an API.
Hmm, the doc dherman.github.com/taskjs/doc/api.html says: Task spawn(generator function thunk). But as I understand it, send() plays a role in the API to spawn, correct?
The value sent to yield expressions in your generator, yes. So? Again that does not mean you must read the inside of the black box. The API docs may be lacking, of course :-|.
Assuming I am understanding the idea, then my description above is also my criticism: control and data flow is jumping around in unusual ways and functions morph into multi-entry point things with new API.
There's no free lunch. Either we renounce adding new syntax by which to capture a shallow continuation, which leaves us with the nested callback function world of today, which people object to on several grounds:
Nesting makes hard-to-read, rightward-drifting function spaghetti.
Nesting increases the risk of capturing too much environment, potentially entraining memory for a long time -- bloat and leak bugs.
These are not unmanageable according to some JS hackers, but they are not trivial problems either. They keep coming up, in the Node community and among browser JS developers. They're real.
So it is known that no alternative exists beyond these two choices?
On this list, we have been over the design space many times. Please search for call/cc and "shallow continuations" using site:mail.mozilla.org. I'm short on time now.
The upshot is that you're suggesting threads, or deep continuations. Threads suck, we are not exposing JS developers to preemptively scheduled threads in a language with shared-object mutation and therefore significant data races.
Deep continuations have two problems:
- They break the ability to reason about loss of invariants due to an event loop turn being taken under what looks like a function call.
- Different implementations will not agree on capturing deep continuations across native frames, but interoperation demands one standard. Forbidding capture across native frames breaks abstraction over self-hosted vs. native code (e.g. Array map calling its downward funarg). Requiring it would mean that some VMs could not conform to such a spec -- those hosted on Java, .NET, or any without hand-coded magic to deal with compiler magic involving native frame representation.
This leaves shallow continuations, and generators won out in a fair fight with a more minimal "shift" proposal for shallow continuations (too minimal).
Generators share with private name objects a right-sized (minimal but not too small) gap-filling quality that supports all of:
- prompt standardization,
- interoperable implementation, and
- library ecosystem builder upside far beyond what TC39 could ever do on any schedule.
They're in ES6 for good reason.
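The contrast between the two choices can be sketched with a toy driver. This is a minimal illustration, not the taskjs API: fetchBothCallbacks, run, and the thunk convention are all invented names for this sketch.

```javascript
// Callback style: rightward drift, each closure captures the enclosing scope.
function fetchBothCallbacks(readA, readB, done) {
  readA(function (errA, a) {
    if (errA) return done(errA);
    readB(function (errB, b) {
      if (errB) return done(errB);
      done(null, [a, b]); // two levels deep already
    });
  });
}

// Generator style: the same flow reads top-to-bottom; a tiny driver
// resumes the generator each time an async result arrives. Here a
// yielded value is a "thunk": function (resume) { ...resume(result)... }
function run(genFn, done) {
  var gen = genFn();
  function step(value) {
    var r = gen.next(value);
    if (r.done) return done(null, r.value);
    r.value(step); // hand the resume function to the thunk
  }
  step(undefined);
}
```

The generator version keeps the code flat: each suspension point is a visible yield, and the nesting (and the closure-entrainment hazard) moves into the one shared driver.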
On 03.09.2011 3:39, Brendan Eich wrote:
On Sep 2, 2011, at 4:15 PM, John J Barton wrote:
Ok I hope someone creates more tutorial information about generators. I read about them and played around with some examples, but I did not come away thinking positive.
You might start with Dave Herman's async library, task.js:
The code of a complete library is always too complicated IMO to "begin with". If you want some small examples with detailed comments and explanations, here they are:
Simple (cooperative) processes with the Scheduler
On Mon, Sep 5, 2011 at 11:01 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 5, 2011, at 9:36 PM, John J Barton wrote: ...
Assuming I am understanding the idea, then my description above is also my criticism: control and data flow is jumping around in unusual ways and functions morph into multi-entry point things with new API.
There's no free lunch. Either we renounce adding new syntax by which to capture a shallow continuation, which leaves us with the nested callback function world of today, which people object to on several grounds:
Nesting makes hard-to-read, rightward-drifting function spaghetti.
Nesting increases the risk of capturing too much environment, potentially entraining memory for a long time -- bloat and leak bugs.
These are not unmanageable according to some JS hackers, but they are not trivial problems either. They keep coming up, in the Node community and among browser JS developers. They're real.
So it is known that no alternative exists beyond these two choices?
On this list, we have been over the design space many times. Please search for call/cc and "shallow continuations" using site:mail.mozilla.org. I'm short on time now.
The upshot is that you're suggesting threads, or deep continuations. Threads suck, we are not exposing JS developers to preemptively scheduled threads in a language with shared-object mutation and therefore significant data races.
Thanks, but as you told me: 'I did not say that. Remember, threads suck (brendaneich.com/2007/02/threads-suck)'. I was more thinking along the lines of better support for async programming that does not attempt to look sync (I have no idea what that means).
Deep continuations have two problems:
They break the ability to reason about loss of invariants due to an event loop turn being taken under what looks like a function call.
The second problem is that different implementations will not agree on capturing deep continuations across native frames, but interoperation demands one standard. Forbidding capture across native frames breaks abstraction over self-hosted vs. native code (e.g. Array map calling its downward funarg). Requiring it will mean that some VMs won't be able to conform to such a spec -- Java, .NET, any without hand-coded magic to deal with compiler magic involving native frame representation.
This leaves shallow continuations, and generators won out in a fair fight with a more minimal "shift" proposal for shallow continuations (too minimal).
Generators share with private name objects a right-sized (minimal but not too small) gap-filling quality that supports all of:
- prompt standardization,
- interoperable implementation, and
- library ecosystem builder upside far beyond what TC39 could ever do on any schedule.
They're in ES6 for good reason.
Unfortunately we are comparing a tiny number of generator programs written by experts to an unknown fraction of callback programs written by right-marching developers. I think developers will be slow to take up generators, but on the other hand it's an important problem and one worth taking risks to explore.
jjb
On Tue, Sep 6, 2011 at 2:01 AM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 5, 2011, at 9:36 PM, John J Barton wrote:
On Sun, Sep 4, 2011 at 1:42 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 4, 2011, at 11:06 AM, John J Barton wrote:
As a reader I have to parse the function carefully to look for the 'yield'. If I find one, then I know that this is not actually a function after all. Then I need to mentally draw a line after the yield and label it "entry-point #2".
This is addressed in ES6 by changing the head of the generator function as follows:
spawn(function* () {
  try {
    var [foo, bar] = yield join(read("foo.json"),
                                read("bar.json")).timeout(1000);
    render(foo);
    render(bar);
  } catch (e) {
    console.log("read failed: " + e);
  }
});
(For others who have some trouble seeing the difference, here the function keyword is followed by the multiplication symbol).
"generator() {}" would make a lot more sense to me, it would more clearly alert the reader at no added cost for the writer, as well as helping new users of this feature separate its effect from functions.
We considered that, but generator is not reserved, and reserving it in ES6 requires newline sensitivity at least. Consider an anonymous generator similar to the one you typed:
generator() {...}
where ... is actual code (and the {...} style may use multiple lines and indentation). This could be valid code in the field today, relying on ASI to insert the semicolon after "generator()".
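The ambiguity can be demonstrated with code that is already legal today. In this contrived sketch, generator is just an ordinary function name:

```javascript
// Legal ES5 today: "generator()" is a call expression, ASI inserts a
// semicolon after it, and the braces on the next line parse as a plain
// block statement that runs immediately -- not a function body.
var hits = [];
function generator() { hits.push("called"); }

generator()
{
  hits.push("block ran"); // just a block, executed right after the call
}
```

Reserving generator as a keyword would silently change what this program means, which is exactly the newline-sensitivity hazard described above.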
What's more, the precedent in Python is not a negative or a trivial benefit. Besides reusing brain-print, we stand on field-tested design in a similar language. Python uses def for both functions and generators.
But isn't this just a design flaw in python (and js 1.7) -- remedied, no less, by your disambiguating asterisk? Personally I'm fond of the asterisk, and the symmetry of function* and yield*, but this is a distinct grammatical construct from function, right? This argument doesn't strike me as very winful against any other (syntactically safe) spelling.
Next I have to read through the source of spawn() and track the lifetime-of
and references-to the object it received, seeking calls to send().
No, you can use spawn as a black box. It's an API entry point to the taskjs library. You do not need to read the source to use an API.
Hmm., the doc dherman.github.com/taskjs/doc/api.html says: Task spawn(generator function thunk) But as I understand it, the send() plays a role in the API to spawn correct?
The value sent to yield expressions in your generator, yes. So? Again that does not mean you must read the inside of the black box. The API docs may be lacking, of course :-|.
What it's really lacking is a version with * syntax of harmony generators, which fixes the locality of reference problems and makes it more clear when you're working with a task. Any word on when they're intended to land in spidermonkey? I see Dave Herman filed a bug back in June but there's no status on it.
[snipped the rest]
On Tue, Sep 6, 2011 at 12:37 PM, John J Barton <johnjbarton at johnjbarton.com>wrote:
On Mon, Sep 5, 2011 at 11:01 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 5, 2011, at 9:36 PM, John J Barton wrote: ...
Assuming I am understanding the idea, then my description above is also my criticism: control and data flow is jumping around in unusual ways and functions morph into multi-entry point things with new API.
There's no free lunch. Either we renounce adding new syntax by which to capture a shallow continuation, which leaves us with the nested callback function world of today, which people object to on several grounds:
Nesting makes hard-to-read, rightward-drifting function spaghetti.
Nesting increases the risk of capturing too much environment, potentially entraining memory for a long time -- bloat and leak bugs.
These are not unmanageable according to some JS hackers, but they are not trivial problems either. They keep coming up, in the Node community and among browser JS developers. They're real.
So it is known that no alternative exists beyond these two choices?
On this list, we have been over the design space many times. Please search for call/cc and "shallow continuations" using site:mail.mozilla.org. I'm short on time now.
The upshot is that you're suggesting threads, or deep continuations. Threads suck, we are not exposing JS developers to preemptively scheduled threads in a language with shared-object mutation and therefore significant data races.
Thanks, but as you told me: 'I did not say that. Remember, threads suck (brendaneich.com/2007/02/threads-suck)'. I was more thinking along the lines of better support for async programming that does not attempt to look sync (I have no idea what that means).
I'm also curious what "better support for async programming" looks like -- that always seems to boil down to a wait construct -- which it's already been established is not enabled by harmony generators. So AFAICT they do exactly what you're asking -- provide language level support for libraries to take async control flow in new directions, all without shared state, spooky action at a distance, or attempting to "look sync" :)
Deep continuations have two problems:
They break the ability to reason about loss of invariants due to an event loop turn being taken under what looks like a function call.
The second problem is that different implementations will not agree on capturing deep continuations across native frames, but interoperation demands one standard. Forbidding capture across native frames breaks abstraction over self-hosted vs. native code (e.g. Array map calling its downward funarg). Requiring it will mean that some VMs won't be able to conform to such a spec -- Java, .NET, any without hand-coded magic to deal with compiler magic involving native frame representation.
This leaves shallow continuations, and generators won out in a fair fight with a more minimal "shift" proposal for shallow continuations (too minimal).
Generators share with private name objects a right-sized (minimal but not too small) gap-filling quality that supports all of:
- prompt standardization,
- interoperable implementation, and
- library ecosystem builder upside far beyond what TC39 could ever do on any schedule.
They're in ES6 for good reason.
Unfortunately we are comparing a tiny number of generator programs written by experts to an unknown fraction of callback programs written by right-marching developers. I think developers will be slow to take up generators, but on the other hand it's an important problem and one worth taking risks to explore.
Indeed. Uptake could be aided by harmony transpilers -- Traceur has generator support already (no harmony generators as yet, but in due time). Time will tell how the market moves, but like with anything else, much of the heavy lifting will happen at the library level, and Dave's Tasks are just a whiff of what's possible. The yield* construct opens the door to async iteration idioms that will put old ES5 alternatives to shame.
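The yield* delegation mentioned here can be shown in isolation, using today's ES6 syntax. This is a standalone sketch, not tied to any library:

```javascript
// yield* delegates to another generator: the outer generator's consumer
// sees the inner generator's values spliced in seamlessly, with no
// manual re-yield loop at the delegation site.
function* inner() {
  yield 2;
  yield 3;
}
function* outer() {
  yield 1;
  yield* inner(); // values 2 and 3 flow through here
  yield 4;
}

var values = [];
for (var v of outer()) values.push(v);
```

This composition is what makes layered iteration (and, by extension, layered async control-flow libraries) practical to build on generators.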
Who knows, perhaps their perceived complexity will prove too great for generators to ever go mainstream. Even if that were the case they're still a useful and practical addition to our syntax toolkit.
On Sep 6, 2011, at 9:47 AM, Dean Landolt wrote:
We considered that, but generator is not reserved, and reserving it in ES6 requires newline sensitivity at least. Consider an anonymous generator similar to the one you typed:
generator() {...}
where ... is actual code (and the {...} style may use multiple lines and indentation). This could be valid code in the field today, relying on ASI to insert the semicolon after "generator()".
Hold this thought.
What's more, the precedent in Python is not a negative or a trivial benefit. Besides reusing brain-print, we stand on field-tested design in a similar language. Python uses def for both functions and generators.
But isn't this just a design flaw in python (and js 1.7) -- remedied, no less, by your disambiguating asterisk? Personally I'm fond of the asterisk, and the symmetry of function* and yield*, but this is a distinct grammatical construct from function, right?
No. That's another point: we do not massively duplicate the grammar so as to rule out yield in function syntactically, and require it in function* (indeed it is not required -- 0 iteration basis case).
Likewise, ES1-5 do not try to restrict return, break, and continue to contexts where they are legal using grammar only. Instead, the poor-man's operational semantic steps and model, and some prose, combine to do the deeds.
This argument doesn't strike me as very winful against any other (syntactically safe) spelling.
It's the "What's more" argument, on top of the "Hold this thought" main point. Both matter, though. Generator functions are still functions of a kind. So are constructors (which in the current classes proposal are grammatically distinct).
As in Python, which is valuable precedent for those who know it and those who've studied its rationale docs (PEPs), the commonality among function variants is greater than the differences that might lead us on a snipe-hunt for an unreserved keyword to use.
What it's really lacking is a version with * syntax of harmony generators, which fixes the locality of reference problems
"locality of reference" sounds cool, but what do you mean? It suggests some "spooky action at a distance" but there's no such runtime QM-like problem.
If you mean that decorating the head of the generator syntax to telegraph that yield may (not must) occur later is valuable, I agree. But there's no "physics" at work here. A human reader may miss the yield (if there is a yield -- if not, then the * is valuable to make an empty generator, something not possible in Python AFAIK). Likewise, a human reader may miss the * in the head and miss the one or more yields in an interesting generator function.
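The empty-generator point can be shown directly; the * alone marks the function as a generator even with no yield in the body:

```javascript
// An empty generator: no yield anywhere, yet calling it produces an
// iterator that is immediately done. Without the * this would be a
// plain function returning undefined, not an iterator at all.
function* empty() {}

var it = empty();
var first = it.next(); // { value: undefined, done: true }
```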
Readers may miss all sorts of key static dependencies among parts of a larger function or unit of code. There are no silver bullets here.
and makes it more clear when you're working with a task. Any word on when they're intended to land in spidermonkey? I see Dave Herman filed a bug back in June but there's no status on it.
I will stir the pot!
On Sep 6, 2011, at 10:10 AM, Dean Landolt wrote:
On Tue, Sep 6, 2011 at 12:37 PM, John J Barton <johnjbarton at johnjbarton.com> wrote:
I was more thinking along the lines of better support for async programming that does not attempt to look sync (I have no idea what that means).
I'm also curious what "better support for async programming" looks like -- that always seems to boil down to a wait construct -- which it's already been established is not enabled by harmony generators.
Hi Dean, I hope you don't mind if I quibble here: generators do enable (as in make possible, but not fully implement) async or deferred functions. You need a scheduler and an event loop concurrency model in addition to generators for the full monte, but generators are co-expressive with the control-effect part of async or deferred functions.
Here is Dave Herman's desugaring write-up:
======================================================================
Library for creating intrinsic deferred objects
function IntrinsicDeferred(generator) {
  this.state = "newborn";
  this.generator = generator;
  this.callbacks = [];
  this.errbacks = [];
  this.completion = null;
  this.continue(void 0, true);
}

IntrinsicDeferred.prototype = {
  continue: function(value, normal) {
    if (this.state === "running" || this.state === "finished")
      throw new Error("illegal state");
    this.state = "running";
    let received;
    try {
      received = normal ? this.generator.send(value)
                        : this.generator.throw(value);
    } catch (e) {
      if (isStopIteration(e))
        this.callback(e.value);
      else
        this.errback(e);
      return;
    }
    let { awaited, callback, errback } = received;
    awaited.then(callback, errback);
    return;
  },
  then: function(cb, eb) {
    if (this.state === "finished") {
      if (this.completion.type === "return" && cb)
        cb(this.completion.value);
      if (this.completion.type === "error" && eb)
        eb(this.completion.value);
      return;
    }
    if (cb) this.callbacks.push(cb);
    if (eb) this.errbacks.push(eb);
  },
  createCallback: function(used) {
    let D = this;
    return function(value) {
      if (used.value)
        throw new Error("cannot reuse continuation");
      used.value = true;
      D.continue(value, true);
    };
  },
  createErrback: function(used) {
    let D = this;
    return function(value) {
      if (used.value)
        throw new Error("cannot reuse continuation");
      used.value = true;
      D.continue(value, false);
    };
  },
  await: function(awaited) {
    this.state = "suspended";
    let used = { value: false };
    return {
      awaited: awaited,
      callback: this.createCallback(used),
      errback: this.createErrback(used)
    };
  },
  cancel: function(value) {
    if (this.state === "running" || this.state === "finished")
      throw new Error("illegal state");
    this.state = "running";
    try {
      this.generator.close();
    } finally {
      this.errback(value);
    }
  },
  callback: function(value) {
    this.state = "finished";
    this.completion = { type: "return", value: value };
    let a = this.callbacks, n = a.length;
    for (let i = 0; i < n; i++) {
      try {
        let cb = a[i];
        cb(value);
      } catch (ignored) { }
    }
    this.callbacks = this.errbacks = null;
  },
  errback: function(value) {
    this.state = "finished";
    this.completion = { type: "error", value: value };
    let a = this.errbacks, n = a.length;
    for (let i = 0; i < n; i++) {
      try {
        let eb = a[i];
        eb(value);
      } catch (ignored) { }
    }
    this.callbacks = this.errbacks = null;
  }
};
Translation of deferred function <D>:
deferred function <name>(<params>) { <body> } ~~>
function <name>(<params>) {
  let <D> = new IntrinsicDeferred((function* <name>() { <body> }).call(this, arguments));
  return {
    then: <D>.then.bind(<D>),
    cancel: <D>.cancel.bind(<D>)
  };
}
Translation of await expression within deferred function <D>:
await <expr> ~~>
yield <D>.await(<expr>)
Translation of return statement within deferred function <D>:
return <expr>; ~~>
return <expr>;
return; ~~>
return;
======================================================================
Note how the ability to return <expr>; from a generator, the PEP 380 extension written up for harmony:generators, is used by the next-to-last translation rule.
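The return-value plumbing those translation rules rely on can be seen in isolation. Note this sketch uses final ES6 semantics, where the returned value surfaces in the { value, done: true } iterator result and as the value of a delegating yield* expression, rather than on a StopIteration exception as in the 2011-era drafts:

```javascript
// PEP 380-style "return <expr>;" inside a generator.
function* answer() {
  yield "working";
  return 42; // surfaces as { value: 42, done: true }
}

// The return value is also the value of a yield* expression in a caller.
function* caller() {
  var result = yield* answer(); // receives 42 when answer() returns
  yield result + 1;
}

var g = answer();
g.next();           // { value: "working", done: false }
var fin = g.next(); // { value: 42, done: true }
```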
So AFAICT they do exactly what you're asking -- provide language level support for libraries to take async control flow in new directions, all without shared state, spooky action at a distance, or attempting to "look sync" :)
Right!
John: I know of no way to make async code "look sync" without raising the risk of code writers and reviewers missing the preemption points, resulting in lost invariants (data races). This is the main objection to deep continuations that I gave. Explicit syntax -- yield, await, wait, etc. -- is best. What people most object to in function nests are the rightward indentation and the closure entrainment (leak/bloat) hazard.
Generators and libraries built
On Sep 6, 2011, at 9:37 AM, John J Barton wrote:
On Mon, Sep 5, 2011 at 11:01 PM, Brendan Eich <brendan at mozilla.com> wrote: Generators share with private name objects a right-sized (minimal but not too small) gap-filling quality that supports all of:
- prompt standardization,
- interoperable implementation, and
- library ecosystem builder upside far beyond what TC39 could ever do on any schedule.
They're in ES6 for good reason.
Unfortunately we are comparing a tiny number of generator programs written by experts to an unknown fraction of callback programs written by right-marching developers.
The question is whether an explicit keyword prefix operator, yield in the case of harmony:generators, is more winning than the rightward march with its readability and entrainment issues, once generator support is in the field enough. I'm betting yes, but we shall have to see.
I think developers will be slow to take up generators, but on the other hand it's an important problem and one worth taking risks to explore.
The slow uptake is a given until generator support is widespread in the user agents banging on one's site. Some l33t sites may see heavy Chrome and Firefox share already, so in a year may dip their toes in the generator waters. Games already do this, to a fault ("works only in Chrome").
Thanks for the thoughtful comments.
On Tue, Sep 6, 2011 at 2:37 PM, Brendan Eich <brendan at mozilla.com> wrote:
On Sep 6, 2011, at 9:47 AM, Dean Landolt wrote:
We considered that, but generator is not reserved, and reserving it in ES6 requires newline sensitivity at least. Consider an anonymous generator similar to the one you typed:
generator() {...}
where ... is actual code (and the {...} style may use multiple lines and indentation). This could be valid code in the field today, relying on ASI to insert the semicolon after "generator()".
Hold this thought.
What's more, the precedent in Python is not a negative or a trivial benefit. Besides reusing brain-print, we stand on field-tested design in a similar language. Python uses def for both functions and generators.
But isn't this just a design flaw in python (and js 1.7) -- remedied, no less, by your disambiguating asterisk? Personally I'm fond of the asterisk, and the symmetry of function* and yield*, but this is a distinct grammatical construct from function, right?
No. That's another point: we do not massively duplicate the grammar so as to rule out yield in function syntactically, and require it in function* (indeed it is not required -- 0 iteration basis case).
Likewise, ES1-5 do not try to restrict return, break, and continue to contexts where they are legal using grammar only. Instead, the poor-man's operational semantic steps and model, and some prose, combine to do the deeds.
This argument doesn't strike me as very winful against any other (syntactically safe) spelling.
It's the "What's more" argument, on top of the "Hold this thought" main point.
I didn't see this line of reasoning -- I thought your main point was that generator was grammatically ambiguous. My apologies if this was so obvious as to be implied, but in any event, it was a trivial little nit begging for a stronger rationale, which was delivered and then some.
Both matter, though. Generator functions are still functions of a kind. So are constructors (which in the current classes proposal are grammatically distinct).
As in Python, which is valuable precedent for those who know it and those who've studied its rationale docs (PEPs), the commonality among function variants is greater than the differences that might lead us on a snipe-hunt for an unreserved keyword to use.
What it's really lacking is a version with * syntax of harmony generators, which fixes the locality of reference problems
"locality of reference" sounds cool, but what do you mean? It suggests some "spooky action at a distance" but there's no such runtime QM-like problem.
If you mean that decorating the head of the generator syntax to telegraph that yield may (not must) occur later is valuable, I agree.
Yes, I was referring to those sneaky, hidden preemption points where variables change out from under you. This is helped by the * head, but IMHO the deeper problem with libraries like node-fibers that Mikeal was griping about is rooted in a dynamic wait call -- preemption that could be aliased or buried deep in a call stack. AFAICT generators completely prevent this -- preemption cannot happen without an explicit, static yield prefix right at the call site. This is really where "spooky action at a distance" shows up, and to be fair, python and js 1.6 generators are already immune to it.
But there's no "physics" at work here. A human reader may miss the yield (if
there is a yield -- if not, then the * is valuable to make an empty generator, something not possible in Python AFAIK).
Yes, a human reader could miss yield -- easier than one could miss a nested callback function (barring the acceptance of block lambda) -- but it cannot, under any circumstances, be hidden from view. Unlike a wait function.
Likewise, a human reader may miss the * in the head and miss the one or more yields in an interesting generator function.
Readers may miss all sorts of key static dependencies among parts of a larger function or unit of code. There are no silver bullets here.
and makes it more clear when you're working with a task. Any word on when
On Sep 6, 2011, at 1:07 PM, Dean Landolt wrote:
I didn't see this line of reasoning -- I thought your main point was that generator was grammatically ambiguous. My apologies if this was so obvious as to be implied, but in any event, it was a trivial little nit begging for a stronger rationale, which was delivered and then some.
No problem. I should number my independent arguments, anyway.
Yes, I was referring to those sneaky, hidden preemption points where variables change out from under you. This is helped by the * head, but IMHO the deeper problem with libraries like node-fibers that Mikeal was griping about is rooted in a dynamic wait call -- preemption that could be aliased or buried deep in a call stack. AFAICT generators completely prevent this -- preemption cannot happen without an explicit, static yield prefix right at the call site. This is really where "spooky action at a distance" shows up, and to be fair, python and js 1.6 generators are already immune to it.
100% agreed -- the aliasing possibility is particularly bad. We tried to tame eval as a first-class function by making calls via certain kinds of aliases either errors (ES5 strict and up) or indirect evals, but even indirect eval is narsty. No more like this, especially not for a preemption primitive.
I'm pretty puzzled by this discussion and I'm guessing other folks might be puzzled as well. Since I understood node fibers as "thread for Node", the discussion I read is:
/be: You can have threads!
Mikeal: We don't want threads!
If I'm on the right track, then I should understand how this relates to proxies. But I don't. Any hints?
jjb