performance of OO dispatch in inner loops
I'm not sure about "necessary" but for "useful", see for instance Practical Soft Typing, by Andrew Wright. Performance analysis for softly typed Scheme show that, on the benchmark, 20% to 50% of the time of the execution was spent performing run-time checks which may be automatically removed.
Cheers, David
Quoting ToolmakerSteve98 <toolmakersteve98 at shawstudio.com>:
ToolmakerSteve98" wrote:
"Brendan Eich" wrote:
Why do you believe static typing is necessary for performance? Just curious.
I woke up in the middle of night, remembering one detail we hit in 1992. Advanced OO programming styles lead to inner loops that contain (potentially subclassed) method dispatches. THAT'S the bottleneck, not the speed of arithmetic computations. Doesn't JIT well, because each "inner loop" (code bottleneck involving repetition) is a cooperative effort between multiple objects, any one of which might show up as a different subclass in the next loop iteration.
Not awake enough right now to think through whether more modern JITs (tracing) successfully unravel this case. If so, then it is a matter of careful cache design. If not, then I think this is where static typing (or more precisely, constructing an object knowing that it has given interfaces; in theory this could be done for dynamically typed objects if you can determine at construction that it satisfies the interface) makes possible higher performance execution than dynamic typing.
More bad news for dynamic typing as I remember the details more clearly: turns out the quentessential activity in the complex control logic we were seeing for interactive multimedia, is that one object queries another object for one or two properties, does a tiny bit of decision making using that info and its own properties, modifies its state a bit, then fires a message off elsewhere. Performance was dominated by how fast you could get at properties on yourself and on the objects you were cooperating with. The better you could pin down ahead of time how to get hold of those properties, the better your performance.
~TMSteve
On Mar 21, 2008, at 5:32 AM, ToolmakerSteve98 wrote:
More bad news for dynamic typing as I remember the details more
clearly: turns out the quentessential activity in the complex control logic
we were seeing for interactive multimedia, is that one object queries
another object for one or two properties, does a tiny bit of decision making using
that info and its own properties, modifies its state a bit, then fires a
message off elsewhere. Performance was dominated by how fast you could get at properties on yourself and on the objects you were cooperating
with. The better you could pin down ahead of time how to get hold of those
properties, the better your performance.
This is a valid concern, about the classic OOP folly where lots of
little methods suffer high dynamic dispatch overhead. But there is at
least 15 years of research on how to optimize better, going back to
the Self and Strongtalk [1] work. More recent work we are supporting
at UC Irvine [2] shows a lot of promise, and it's feeding into
Tamarin Tracing.
Again, I'm not writing checks I can't yet cash (but we are investing,
making bets in code under development). On the other hand, IMHO you
are pessimizing for the two-decade-old trailing edge. It's easy to go
from "want speed" to "require static types" and bypass the "optimize
dynamic types enough not to need more speed" state, but taking this
shortcut risks leaving lots of developers and content behind. It's a
huge rewrite tax at the limit, unpayable on the web.
What's more, if the static type system is weak, this course may just
amount to requiring programmers to overspecify their code for the
sake of VM implementors, without gaining much in the way of safety
properties or other programming in the large benefits. Such
overspecification can turn code into "hardware" that is hard to
evolve quickly, mash up, merge and diverge, reuse at Web scale, etc.
I'm certainly on the edge of over-generalizing here, but I don't
believe this is just hand-waving. You can see the problem, highly
relevant to ES4, in the swerve toward classical OOP a la Java 1 that
AS3 and Flex took targeting the Flash Player. It's not "bad", and for
some programmers and use cases it works great. On the other hand, the
community did grumble about "turning the language into Java", did
over-annotate types without winning too often, and still ended up
wanting dynamic language features that were underserved or left out.
/be
[1] www.strongtalk.org [2] hotpath.org
On Fri, 2008-03-21 at 12:42 -0700, Brendan Eich wrote:
Again, I'm not writing checks I can't yet cash (but we are investing,
making bets in code under development). On the other hand, IMHO you
are pessimizing for the two-decade-old trailing edge. It's easy to go
from "want speed" to "require static types" and bypass the "optimize
dynamic types enough not to need more speed" state, but taking this
shortcut risks leaving lots of developers and content behind. It's a
huge rewrite tax at the limit, unpayable on the web.
I may be wrong but I have the feeling that the works you quote use type-feedback and that type-feedback only optimizes [single] method dispatch. ES4 has multi-methods, which are harder to optimize, plus there are plenty of other checks that may be optimized away which are related to typing but not to method dispatch, say pattern-matching, "wrap", etc. In addition, iirc, type-feedback only works after the first execution of code, so it is not necessarily useful when objects are re-created at each execution of the body of a loop, etc.
Now, of course, I might be wrong. I'm not really a specialist in OOP implementation.
What's more, if the static type system is weak, this course may just
amount to requiring programmers to overspecify their code for the
sake of VM implementors, without gaining much in the way of safety
properties or other programming in the large benefits. Such
overspecification can turn code into "hardware" that is hard to
evolve quickly, mash up, merge and diverge, reuse at Web scale, etc.
I concur with that. In order to maintain readability, most optimizations should happen behind-the-scene i.e. by using either type inference (or something more complicated but of the same kind) or type-feedback.
On Mar 21, 2008, at 1:31 PM, David Teller wrote:
I may be wrong but I have the feeling that the works you quote use type-feedback and that type-feedback only optimizes [single] method dispatch. ES4 has multi-methods, which are harder to optimize, plus there are plenty of other checks that may be optimized away which are related to typing but not to method dispatch, say pattern-matching, "wrap", etc. In addition, iirc, type-feedback only works after the
first execution of code, so it is not necessarily useful when objects are re-created at each execution of the body of a loop, etc.
You really should read the papers, and Andreas Gal's blog. Runtime
types are only one of many kinds of information available to runtime
optimizers. Tracing loops allows hoisting and even allocation
elimination, using escape analysis. Common sub-expression elimination
can consider trace-invariant expressions. One can guard all sorts of
on-trace assumptions and compensate to the interpretr, re-tracing
other hot paths. It's not just about types.
Having written this, there are open issues with tree folding to avoid
proliferation of paths in the trace tree. Again, see http://
andreasgal.com/ for more.
From: "Brendan Eich" <brendan at mozilla.org> [responding to David Teller's post in response to some questions I raised]
You really should read the papers, and Andreas Gal's blog. Runtime types are only one of many kinds of information available to runtime optimizers. Tracing loops allows hoisting and even allocation elimination, using escape analysis. Common sub-expression elimination can consider trace-invariant expressions. One can guard all sorts of on-trace assumptions and compensate to the interpretr, re-tracing other hot paths. It's not just about types.
Thanks Brendan -- useful comments and pointers to further info.
What kind of processing power / memory do these analytic techniques use?
The fundamental question I want to ask (in my ignorance) is to what degree ES4 "raises the bar" on what a minimal web client can be.
I am partially reassured in hearing from those who are working towards mobile implementations, but I still have a gut sense that this is envelope-pushing stuff.
I'm really glad this work is done, but until it is proven, I hesitate to promote the notion that web citizenship [for client devices] should be dependent on all this. Any other comparable project I've seen, this would be considered "R & D" phase.
Don't misunderstand me; I believe ES4 will come to fruition, and that it, as the next version of the language of the web, will have massive positive impact. I want to do my part to help this be a success, and in particular to help it fit well with stuff that Microsoft and Adobe are doing, as everything I do is dependent on those companies' tools and technologies.
~TMSteve
From: "David Teller" wrote:
I concur with that. In order to maintain readability, most optimizations should happen behind-the-scene i.e. by using either type inference (or something more complicated but of the same kind) or type-feedback.
I totally agree with this; I wouldn't even have brought up the topic of static-typing, if it weren't for current state-of-the-art in type inference.
And type-inference is a key part in what has me realize that what I want can be achieved in the authoring tools. If I want to insure that what is sent to a browser is guaranteed to run within some spec [on devices that are in cooperation on this goal], then all I need [with to the topic of dynamic/static typing] is that the language does have an adequate type system; tools can add type annotations (and alert authors to other issues), and the devices can take advantage of that information. Other authors and other devices can do whatever they do; I don't need to care any more.
I am much more comfortable knowing that these results are not dependent on the cutting-edge of dynamic language performance. The whole topic has become a non-issue in my mind now -- I will examine what has been said so far about the type system proposed for ES4.
~TMSteve
On Fri, 2008-03-21 at 13:45 -0700, Brendan Eich wrote:
You really should read the papers, and Andreas Gal's blog.
Probably. I'm not an expert in [dynamic] optimisations, my line of work is mostly related to static analysis. Do you suggest any specific papers other than the blog ?
Runtime
types are only one of many kinds of information available to runtime
optimizers. Tracing loops allows hoisting and even allocation
elimination, using escape analysis. Common sub-expression elimination
can consider trace-invariant expressions. One can guard all sorts of
on-trace assumptions and compensate to the interpretr, re-tracing
other hot paths. It's not just about types.
I'm quite aware that there's more to optimisation than removing dynamic type checks & consolidating dynamic method dispatch. I was just answering that particular point.
Having written this, there are open issues with tree folding to avoid
proliferation of paths in the trace tree. Again, see http:// andreasgal.com/ for more.
Thanks for the reference.
David Teller wrote:
ES4 has multi-methods, which are harder to optimize,
Multi-methods are gone from ES4.
Waldemar
On Mar 26, 2008, at 11:13 AM, Waldemar Horwat wrote:
David Teller wrote:
ES4 has multi-methods, which are harder to optimize,
Multi-methods are gone from ES4.
This is based on the row (numbered 109) of red interrupted so far
only by Mozilla's green at
spreadsheets.google.com/ccc?key=pFIHldY_CkszsFxMkQOReAQ
It's unrealistic to spend time on multimethods in the face of this
red-colored rejection, but I wanted to point out that multimethod
dispatch has been researched quite a bit in the '90s. See
doku.php? id=proposals:generic_functions#background_material
was [Re: Any discussion of "compact" subset for mobile devices?]
"Brendan Eich" wrote:
I woke up in the middle of night, remembering one detail we hit in 1992. Advanced OO programming styles lead to inner loops that contain (potentially subclassed) method dispatches. THAT'S the bottleneck, not the speed of arithmetic computations. Doesn't JIT well, because each "inner loop" (code bottleneck involving repetition) is a cooperative effort between multiple objects, any one of which might show up as a different subclass in the next loop iteration.
Not awake enough right now to think through whether more modern JITs (tracing) successfully unravel this case. If so, then it is a matter of careful cache design. If not, then I think this is where static typing (or more precisely, constructing an object knowing that it has given interfaces; in theory this could be done for dynamically typed objects if you can determine at construction that it satisfies the interface) makes possible higher performance execution than dynamic typing.
~TMSteve