JS control-structure abstractions, using tailnest flattening and tailcall optimization
On Jul 8, 2011, at 7:34 AM, Claus Reinke wrote:
Tailcall optimization guarantees that the callbacks will not overflow the runtime stack, and tailnest flattening keeps the level of syntactic nesting independent of the chain length.
ES.next has generators and proper tail calls already. I'm hard-pressed to call the syntax extensions you propose 'more readable" than what can be done with generators and libraries such as taskjs (dherman/taskjs). I have to say @< and expression-forms seem less readable in general.
Tailcall optimization guarantees that the callbacks will not overflow the runtime stack, and tailnest flattening keeps the level of syntactic nesting independent of the chain length.
ES.next has generators and proper tail calls already. I'm hard-pressed to call the syntax extensions you propose 'more readable" than what can be done with generators and libraries such as taskjs (dherman/taskjs). I have to say @< and expression-forms seem less readable in general.
'more readable' wrt current function definition and call syntax, whose problems you know very well from your own proposals.
The tailnests.html page in the repo allows you to see sugared and desugared source side-by-side, making the readability improvement obvious - the plain ES version only adds noise.
The current ES syntax is so bad that we are looking at extending the semantics of ES (with generators, with block lambdas) just to give programmers features that they should be able to define in ES already (the only necessary semantic change are proper tail calls).
Generators offer a good cost-benefit ratio, and I'm not even against block-lambdas - they both provide useable syntax for interesting use cases. At least as long as the language is still awkwardly biased towards imperative code ("awkward" because this goes against its prototypical/functional nature).
But you cannot go on adding special language features for every use case: block-lambdas have new features, but as far as control structure abstraction go, they do no help to implement yield (I think?), so we add yield as well; and if (the form of) yield we add is not sufficient for the next use case, then we add ..?
I'd like to get out of this loop where every interesting feature requires a language extension, and move on to being able to define new language features in the language. So I think it would be a serious mistake to use special features as an argument against improving the usability of the core language.
By core language improvements, I mean function definition and call improvements to match the tail call optimization semantic improvement. Function definition improvements are what I thought arrow syntax was about, but that seems stuck, so I suggested a less ambitious alternative. Function call improvements are addressed by a single infix application operation (you know what I think about having to discuss every new infix operation in the language committee).
Also, since you mention task.js: task.js uses generators, my example showed how to implement generators - one coin, different sides. However, while the idea of using generators to emulate non-preemptive threads is good, using the same idea to handle callback chains is overkill. Writing
spawn(function() { var result = yield operation(..); ..operations.. })
uses built-in generators as delimited continuations, just to get a handle on the variable binding context, when the same can be achieved more directly, in current ES syntax:
operation(..).then( function(result) { .. operations.. } )
In the former variant, 'spawn' can stitch the continuations together asynchronously, if 'operation' returns something like a promise, and in the latter variant, '.then' can do the same, without using the built-in generator machinery (also, generators are objects with internal state, not easily copied; this could be a disadvantage when trying to extend the control structure pattern to similar use cases).
Yes, agreed, programmers have been trained to read their variable definitions against the left-to-right text flow, so we'll want some additional 'let-then' sugar for the above, and then
let result <- operation(..); ..operations..
would desugar (preferably without 'this'/'arguments') to
operation(..).then( function(result) { .. operations.. } )
(this is also called for by Tennent's correspondence principle: equal rights for declarative and parametric variable bindings).
But that is still just sugar, plus an interface against which the code is desugared. By providing different implementations of that interface (then-ables), we can reuse and augment the code, for instance in asynchronous operations callback chains, or with user-defined generators. The beauty of coding against an interface means that we could hook in different implementations.
In my example, I just wanted to make sure that this interface is sufficient for the kind of control structure abstractions under discussion. This highlighted three issues: abstraction over References is difficult in ES, abstraction just to delay evaluation wants very minimal syntax, and without types, implementations will have to be selected explicitly. But the then-able interface itself seems to be sufficient.
Btw, I wonder what current JS engines can do with the generator- based approach to async code: do you think it is going to be as efficient as direct-style async code? That is, are tracing JITs or other current optimizations good enough to eliminate the built-in-generator overhead?
Claus
Dear all,
we have seen examples of how to define control-structure abstractions via block-lambdas (and Smalltalk blocks), including non-local returns out of user-defined loops.
I'd like to provide an example of how to do something like this with JS, using only the proposed syntactic sugar for flat syntactic tailnests and the upcoming support for flat runtime tailcall stacks (for callback chains, the two complement each other).
Building up from small structures, the example aims to implement a variant of "yield". With conventional blocks, this requires the upcoming generators language extension. Block-lambdas do not help here, I think, because yield can suspend and resume at statement level, and even the slightly unusual sequel feature does only support non-local return, no resume (calling a sequel returns to a lexically scoped elsewhere, not to the caller).
If JS had call/cc or delimited continuations, we could implement yield on top of that, but that isn't likely. If we were to rewrite our code in continuation passing style (cps), we could implement call/cc, but even in languages with lightweight function syntax, that is not considered very readable. There are modular variants of continuation passing style that can be made to look very similar to "normal" code, but these become unreadable in JS syntax. Cps without tailcall optimization also quickly overflows the stack.
Which is where tailnests (readability) and tailcalls (efficiency/ useability) come in, addressing the issues that cps in current JS leads to overflows in syntactic and runtime stack nesting.
First, the programming pattern: in plain cps, we'd have nested callback chains feeding intermediate results into the rest of the callback chain (assuming currying for the callback parameter):
With paren-free right-associative calls (f @< arg) and brace-free definitions (function(arg) => expr) for expression functions, this
becomes:
Tailcall optimization guarantees that the callbacks will not overflow the runtime stack, and tailnest flattening keeps the level of syntactic nesting independent of the chain length.
In modular cps, instead of every operation taking a callback parameter, every operation returns its result in an object implementing a single method 'then' (the ideas are standard, the translation to JS is not). Calling 'then' with a callback feeds a value/intermediate result to the callback.
(which can be read as a clumsy way to write "let result <- operation in operations", but the ".then" allows us to give different meanings to the variable binding, eg., it could mean "foreach result from operation, do operations", or it could store the continuation and jump elsewhere, as yield)
The most basic operation is 'value', which just wraps a value into a "then-able":
Since statements represented as then-ables are objects, conditional statements are just conditional expressions:
or, if the condition involves then-able statements, too:
which we can either use directly, or wrap in a function (the then/else-branch arguments are themselves wrapped to delay their evaluation):
Hence, a while-loop with then-able condition and delayed body:
You can find these examples online [1], so to limit this email, I'll jump right to the interesting bit, which is implementing yield:
"yield_(val)" produces a then-able that behaves a bit unusually: instead of passing a value to continuations passed via ".then()", it swallows all such continuations, storing them for later use (in plain cps, we'd have the whole continuation at hand; in this modular cps variation, we have to collect nested continuations).
To get at the yielded value, we can call "yield_(val).value(cont)", and to restart the "generator", we can call "yield_(val).next()", which applies the internally stored continuations to "val". In code:
Again, the commented code and a couple of example generators can be found online [1]. If you use the desugaring page, you can see why this style isn't popular without syntactic sugar. But the functionality is plain JS, no semantic changes!
The code could be made more readable by providing "let <-"-desugaring to then-ables, but JS is different enough from other languages providing similar features that I want to make sure that this pattern is the one to aim for before proposing that. The code could be shortened further by combining arrow-syntax with "@<", instead of the less ambitious expression functions of the tailnests proposal. And extended object-literals would shorten inline then-ables.
Claus
[1] clausreinke/jstr/tree/master/es-discuss