extension modules
On 13 Jun 2009, at 16:10, kevin curtis wrote:
Python has a concept of 'extension modules', where a module can be implemented in c/c++. Also, the idea of running native code in the browser has been put forward by Google's Native Client for running x86 in the client. MS - i think - are researching something similar.
c/c++ isn't going anywhere and the relationship between ecmascript and c/c++ is interesting. Are there any proposals for something like 'extension modules' for ES6 or do the variations in the engine implementations preclude such a thing?
Not yet part of the ES harmony/6 proposal, but a number of the
'members' of the ServerJS movement have native modules:
flusspferd.org, kenai.com/projects/gpsee, wiki.mozilla.org/ServerJS/Modules/SecurableModules
kevin curtis wrote:
Python has a concept of 'extension modules', where a module can be implemented in c/c++. Also, the idea of running native code in the browser has been put forward by Google's Native Client for running x86 in the client. MS - i think - are researching something similar.
The idea of running native code securely in the browser is speculative and unproven. Nothing should be standardized in this area unless and until such approaches can be demonstrated to have a reasonable chance of resisting attack. To do so would be to repeat previous mistakes that have led to the insecure web we currently have.
c/c++ isn't going anywhere and the relationship between ecmascript and c/c++ is interesting. Are there any proposals for something like 'extension modules' for ES6 or do the variations in the engine implementations preclude such a thing?
As far as a foreign function interface for non-web uses of JavaScript is concerned, that is something that might in principle be worth standardizing (probably separately from ES6).
However, the internal C/C++ interfaces typically used by current JS implementations are highly error-prone, make too many assumptions about implementation details (particularly memory management), and are not suitable for wider use.
On Jun 14, 2009, at 6:45 AM, David-Sarah Hopwood wrote:
kevin curtis wrote:
Python has a concept 'extension modules' where module can be implemented in c/c++. Also the idea of running native code in the browser has been put forward by Google's native client for running x86 in the client. MS - i think - are researching something similar.
The idea of running native code securely in the browser is speculative and unproven. Nothing should be standardized in this area unless and until such approaches can be demonstrated to have a reasonable chance of resisting attack. To do so would be to repeat previous mistakes that have led to the insecure web we currently have.
Right, but that doesn't mean folks authoring C/C++ extensions might
not be able to work w/ a single IDL and set of object lifecycle
guarantees. Every browser on the planet does something similar here in
order to hook up C/C++ DOM objects with the JS VM, and witnessing how
V8 and JSC have had to approach this, it seems natural that JS (say,
on the server) would be well served by some sort of unified-ish way of
saying "here's how my C++ class can be hooked into a JS object
lifecycle". The NPAPI example shows that it can be done at runtime,
and the stuff in good interpreters today (getters, setters, etc.) is
sufficient to facade enough of the API surface area to make these
things happen. What's missing is something that's slightly more
efficient than NPAPI, which implies a browser runtime, a separate
process (for reliability), and a chatty protocol that thunks through
twice for every method call.
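As a rough illustration of the facade idea above, the sketch below hides a stand-in for a native object behind getters and setters, with an explicit release step standing in for lifecycle management. Everything here is hypothetical: `makeNativeHandle`, `wrapNative`, and `release` are made-up names, and the "native" side is simulated with a plain record rather than real C++ bindings.

```javascript
// Stand-in for a C++ object owned by the host. A real binding would
// hold a pointer here instead of a plain record (assumption).
function makeNativeHandle(x, y) {
  return { alive: true, fields: { x: x, y: y } };
}

// Build a JS facade whose properties forward to the native handle.
function wrapNative(handle) {
  var facade = {};
  Object.keys(handle.fields).forEach(function (name) {
    Object.defineProperty(facade, name, {
      enumerable: true,
      get: function () {
        if (!handle.alive) throw new Error("native object released");
        return handle.fields[name];
      },
      set: function (value) {
        if (!handle.alive) throw new Error("native object released");
        handle.fields[name] = Number(value);
      }
    });
  });
  // Explicit lifecycle hook: the host decides when the native side dies.
  facade.release = function () { handle.alive = false; };
  return facade;
}

var point = wrapNative(makeNativeHandle(1, 2));
point.x = 10;
console.log(point.x + point.y); // 12
```

Script code only ever sees ordinary property access; whether the lifecycle is driven by an explicit call like this or by the collector is exactly the kind of guarantee a shared interface would have to pin down.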
True - native client is currently research. Native Client - quoting the Native Client blog - "aims to give web developers access to the full power of the client's CPU while maintaining the browser neutrality, OS portability and safety". Efficient access by developers to multi-core and graphics chips for algorithm heavy code :)
So, an issue is how to make the functionality of these multi-core and graphics chips available to ecmascript. One way is for ecmascript 'extension modules' to use the Native Client functionality.
Alternatively, ecmascript in the tracemonkey, v8 and nitro engines generates x86/ARM code. Maybe there could be an ecmascript subset which generates fast machine code for algorithm-intensive code, i.e. one that ignores the global object and prototypes and is maybe typed. A subset similar to - or even the same as - what Douglas Crockford is proposing for secure ecmascript here: ses:ses
Even if 'extension modules' in the browser aren't available in ES6 to third party developers it could be useful for modules that need to be implemented in c/c++ to conform to some set of standards. These standards could be leveraged by users of ecmascript on the server and elsewhere outside of the browser where security is not such an issue. A new math module with a full set of math functions would be a good example - perhaps the code could be shared across the nitro/v8/tracemonkey engines as a proof of concept.
On Jun 22, 2009, at 3:46 PM, kevin curtis wrote:
Alternatively, ecmascript in the tracemonkey, v8 and nitro engines generates x86/ARM code. Maybe there could be an ecmascript subset which generates fast machine code for algorithm-intensive code, i.e. one that ignores the global object and prototypes and is maybe typed.
The engines you mention all accept the full language, no subsetting
required. Using a subset may help performance, but it's not required.
I'm not sure how this is relevant to using multiple cores, though,
since that would seem to require, at the least, a way to spawn worker
code to run on each core. See Web Workers.
A subset similar to - or even the same as - what Douglas Crockford is proposing for secure ecmascript here: ses:ses
No types there.
If JS is fast enough (with advisory subset usage at the programmer's
discretion) that one does not need to write native code for
performance, then you're right, there might be API or data flow
reasons to want machine int and other low-level types.
But C-like ints that wrap around are not safe for broad usage on the
web. This suggests an "unsafe" or "unmanaged" dialect, which is
something we've thought about at Mozilla when self-hosting built-in
methods. This is a far cry from SES!
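To make the wraparound concern concrete, here is a small sketch in today's JavaScript. It assumes only that `| 0` truncates to a signed 32-bit integer, which the language already specifies; the midpoint helpers are hypothetical names chosen for illustration.

```javascript
// Ordinary JS numbers are doubles: adding past 2^31 - 1 just widens.
var INT32_MAX = 2147483647;
var widened = INT32_MAX + 1;        // 2147483648

// Forcing 32-bit integer semantics with |0 wraps around, as a 32-bit
// machine int typically would on two's-complement hardware.
var wrapped = (INT32_MAX + 1) | 0;  // -2147483648

// A naive midpoint computed with wrapping ints goes negative.
function midpointInt32(lo, hi) {
  return ((lo + hi) | 0) >> 1;
}

// The same computation on doubles stays correct for these inputs.
function midpointDouble(lo, hi) {
  return Math.floor((lo + hi) / 2);
}

console.log(midpointInt32(2000000000, 2100000000));  // -97483648
console.log(midpointDouble(2000000000, 2100000000)); // 2050000000
```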
Even if 'extension modules' in the browser aren't available in ES6 to third party developers it could be useful for modules that need to be implemented in c/c++ to conform to some set of standards. These standards could be leveraged by users of ecmascript on the server and elsewhere outside of the browser where security is not such an issue. A new math module with a full set of math functions would be a good example - perhaps the code could be shared across the nitro/v8/tracemonkey engines as a proof of concept.
Perhaps. But the details of how the engine calls the native code, and
how the native code might call back into the engine, differ
significantly among those implementations. What you are describing
sounds very much like the JNI.
The requirements of security and speed don't always coincide! The subset/dialect idea is interesting.
The tracemonkey, v8 and nitro engines have the ability to compile to machine code. How about:
'performance modules', which would be written in and parsed as a subset of ES that is extremely sympathetic to the machine-code-generating infrastructure of the ecmascript engines and thus generates efficient machine code, without being too closely tied to (specific) implementations.
For example:
module myfastmod;
use performance; // pragma that indicates this is perf module
let x:int = 0 // ES6 type annotation to indicate this will be turned into a c int at some point
... etc
Code in a performance module that does not interact with the ES VM via module public APIs should be able to be compiled particularly efficiently.
Also, performant c++ code seems to use templates rather than traditional OO with virtual methods. Nitro/v8/tm seem to be doing a form of dynamic templating when, at runtime, they try to figure out the types that are being passed as parameters to functions and generate machine code (or, in tm's case, in hot loops). Maybe the engines could be given a helping hand via the ES subset in perf modules - and type annotations.
But too many subsets could add confusion. Type annotations are in ES6 (i think). Could there be a subset that meets both security and performance needs? Or even perf as a subset of secure, e.g. use secure, perf? Would a performance subset need to be unsafe/unmanaged?
On Jun 23, 2009, at 10:03 AM, kevin curtis wrote:
The requirement of security and speed don't always coincide! The
subset/dialect idea is interesting.
It's two-edged.
Adding standard subsets leads to case-analysis explosion, a recipe for
bugs and reduction of interoperability.
How subsets might evolve in future editions is another axis for
explosion of cases. Are subsets partially ordered in the same way
across future editions? Or should there be a total order? Can strict
mode get stricter over time?
ES5 ---------------> ES6 ...
 ^                    ^
 |                    |
ES5 strict <---?---> ES6 strict ...
The arrowhead points to the superset. At some point we might have ESn
remove ancient cruft so the horizontal arrows might stop pointing
rightward.
These are good questions and there may be compelling and simple
answers, but only for very few standard subsets. If you can't draw the
lattice, it's too complicated.
Ideally, IMHO, the only standard subset will be strict mode.
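For readers who have not used it, a minimal sketch of what opting into the ES5 strict subset looks like. The function names are hypothetical; the two behaviors shown (undeclared assignment and writes to frozen objects both throwing) are specified by ES5 strict mode.

```javascript
// Strict mode is opt-in per function (or per script) via a directive.
function assignUndeclaredStrict() {
  "use strict";
  try {
    someUndeclaredName = 1; // non-strict code would create a global here
    return "assigned";
  } catch (e) {
    return e.name;
  }
}

function writeFrozenStrict() {
  "use strict";
  var frozen = Object.freeze({ n: 1 });
  try {
    frozen.n = 2; // silently ignored in non-strict code
    return "assigned";
  } catch (e) {
    return e.name;
  }
}

console.log(assignUndeclaredStrict()); // ReferenceError
console.log(writeFrozenStrict());      // TypeError
```

Both functions are valid non-strict code as well; strict mode only narrows what is accepted, which is what makes it a subset in the sense of the diagram.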
let x:int = 0 // ES6 type annotation to indicate this will be turned
into a c int at some point ... etc
We do not want int "for performance" if it auto-widens to double.
Adobe has experience here, and int as an annotation on local variables
(e.g. loop controls) is often a de-optimizer, yet users over-use it
"for speed".
If as you propose ("C int", capital C meaning the C language, I take
it) we enable 32-bit machine int under a pragma, we'll have wraparound
bugs on the web. (People will copy and paste the pragma to excess.)
Because of the web-as-it-is, implementors have had to optimize JS as
it is used.
But this has demonstrated, to me at least, that the important language
optimizations can be done well under the hood, without hinting. IMHO
this is a good use of human capital, compared to the alternative of
unleashing pragmas and machine types on the web developer masses,
where the pragmas and types add complexity and often bite back.
The issue with "self-hosting" or "systems programming" goes beyond a
machine int type, however. One would want packed structs that can be
stack allocated and embedded in arrays (no references to heap-
allocated objects). One would surely want flat vectors, not prototype-
delegating hashmap-happy Arrays.
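One way to picture "packed structs" and "flat vectors" is with `ArrayBuffer` and `DataView`, which postdate this thread (they were standardized in ES2015). The sketch below is only an approximation of the idea, not the dialect being discussed; the record layout and the helper names are assumptions made up for the example.

```javascript
// Layout of one record (hypothetical): x: float32, y: float32, id: uint32.
var RECORD_SIZE = 12;

// One contiguous allocation holds every record; no per-record objects.
function makeRecords(count) {
  return new DataView(new ArrayBuffer(count * RECORD_SIZE));
}

function writeRecord(view, index, x, y, id) {
  var base = index * RECORD_SIZE;
  view.setFloat32(base, x, true);      // true = little-endian
  view.setFloat32(base + 4, y, true);
  view.setUint32(base + 8, id, true);
}

function readRecord(view, index) {
  var base = index * RECORD_SIZE;
  return {
    x: view.getFloat32(base, true),
    y: view.getFloat32(base + 4, true),
    id: view.getUint32(base + 8, true)
  };
}

var records = makeRecords(3);
writeRecord(records, 1, 1.5, -2.25, 42);
console.log(readRecord(records, 1).id); // 42
console.log(records.byteLength);        // 36
```

Note that `readRecord` still allocates an ordinary object for convenience; a true systems dialect would presumably let the struct itself live on the stack or inline in the array.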
You may be right that this sauce for the goose would be wanted by the
web-dev gander soon enough, but not all at once and prematurely
standardized.
Such a variant of JS could be made memory safe, but it is overkill at
this stage for the Web. The best way to proceed is for Mozilla, e.g.,
to prototype such a language. We're thinking seriously about it right
now.
It's not an Ecma TC39 agenda item until years from now, when such
prototypes have been deployed and used at scale, but in domain-
specific silos.
We'll do our development, if we do it, of such a systems-programming
dialect of JS, in the open and in open source, so everyone on this
list who is interested can watch and participate. But it would be a
distraction to overuse this list for discussions about such a dialect.
Also, performant c++ code seems to use templates rather than
traditional OO with virtual methods. Nitro/v8/tm seem to doing a
form of dynamic templating
Goes back to the Self work in the 90s, of course. No type annotations.
when at runtime they try to figure out the types that are being
passed as parameters to functions and generate machine code. (Or in
hot loops in tm's case).
TM infers static types on trace; this differs from the method-
based speculative approaches. That is, we inline aggressively, so type
annotations on parameters to otherwise generic methods could frustrate
it.
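A toy model may help picture the method-based speculation mentioned here: observe argument types at call time, pick an implementation for that type signature, and reuse it on later calls. This is only an analogy written in plain JavaScript. Real engines emit machine code and guard it very differently, tracing works on loops rather than methods, and `specialize` and its `stats` field are hypothetical names.

```javascript
// Toy "specialize on observed types" dispatcher (illustrative only).
function specialize(generic, specializations) {
  var cache = {};                       // type signature -> implementation
  var stats = { hits: 0, misses: 0 };

  function dispatcher() {
    // Build a signature such as "number,number" from the actual arguments.
    var key = Array.prototype.map.call(arguments, function (a) {
      return typeof a;
    }).join(",");
    var impl = cache[key];
    if (impl) {
      stats.hits++;
    } else {
      stats.misses++;
      impl = cache[key] = specializations[key] || generic;
    }
    return impl.apply(this, arguments);
  }

  dispatcher.stats = stats;
  return dispatcher;
}

var add = specialize(
  function (a, b) { return a + b; },                      // generic fallback
  { "number,number": function (a, b) { return a + b; } }  // "fast path"
);

add(1, 2);
add(3, 4);
add("a", "b");
console.log(add.stats); // { hits: 1, misses: 2 }
```

No annotations are involved: the signature is discovered from the values that actually flow through, which is the point being made about the Self lineage.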
Maybe the engines could be given a helping hand via the ES subset in
perf modules - and type annotations.
This is a malinvestment for the masses, with too much blowback
potential.
But too many subsets could add confusion.
I agree -- sorry, I should have read ahead, but you seemed not to see
this earlier.
Type annotations are in ES6 (i think).
Not yet, and not as anything that resembles static machine types.
Could there be a subset that meets both security and performance
issues.
Why are you mixing the two still? As you note they conflict sometimes.
SES is an experiment, not ready for standardization. I don't think a
"PES" can be supported by Ecma TC39 at this point, even if that were
the place. But again, it's not: implementors need to experiment, in
the open but ideally in more than one lab, with more than one approach.
Or even perf as a subset of secure. e.g: use secure, perf. Would a
performance subset need to be unsafe/unmanaged.
On the web, definitely safe/managed (memory-safe at the least!).
In systems programming domains where C++ is used, possibly not. It
depends on the domain.
For Mozilla's systems-ish domains, we would want memory safety,
control flow integrity, and other properties to be enforced. But we
would be willing to take advantage of static/dynamic analysis duality,
and spend more time on static analysis to achieve memory safety and
other properties, with the benefit of lower runtime overhead.
On Jun 17, 2009, at 10:28 AM, Alex Russell wrote:
On Jun 14, 2009, at 6:45 AM, David-Sarah Hopwood wrote:
kevin curtis wrote:
Python has a concept 'extension modules' where module can be implemented in c/c++. Also the idea of running native code in the browser has been put forward by Google's native client for running x86 in the client. MS - i think - are researching something similar.
The idea of running native code securely in the browser is speculative and unproven. Nothing should be standardized in this area unless and until such approaches can be demonstrated to have a reasonable chance of resisting attack. To do so would be to repeat previous mistakes that have led to the insecure web we currently have.
Right, but that doesn't mean folks authoring C/C++ extensions might
not be able to work w/ a single IDL and set of object lifecycle
guarantees. Every browser in the planet does something similar here
in order to hook up C/C++ DOM objects with the JS VM, and witnessing
how V8 and JSC have had to approach this, it seems natural that JS
(say, on the server) would be well served by some sort of unified-ish
way of saying "here's how my C++ class can be hooked into a JS
object lifecycle". The NPAPI example shows that it can be done at
runtime, and the stuff in good interpreters today (getters, setters,
etc.) are sufficient to facade enough of the API surface area to
make these things happen. What's missing is something that's
slightly more efficient than NPAPI, which both implies a browser
runtime, a separate process (for reliability), and a chatty protocol
that thunks through twice for every method call.
For what it's worth, we probably wouldn't use a standardized interface
to hook up DOM objects to the JS VM in WebKit, even if we provided it
for third-party clients. We have a few public interfaces for this
(such as NPRuntime) but for objects implemented by the browser, we
want the freedom to change the interface at any time, in case we come
up with a higher-performance way of doing it. Our internal interfaces
also make use of inlining in a way that would not be appropriate for a
public API.
I don't know if this significantly affects the value of your idea or
if you were even proposing this sort of thing.
Maciej
We'll do our development, if we do it, of such a systems-programming dialect of JS, in the open and in open source, so everyone on this list who is interested can watch and participate. But it would be a distraction to overuse this list for discussions about such a dialect.
Sounds very cool. Server js people would like it!
At some point we might have ESn remove ancient cruft so the horizontal arrows might stop pointing rightward.
A strong +1 for doing this in ES6. IMO ES5 gives developers a strong backward compatible cross browser foundation for the short/medium term. Thus, ES6 can risk a bit of backward compatibility - not sure how the wider ES community feels.
Why are you mixing the two still? As you note they conflict sometimes.
Security is non-negotiable. But the notes in the SES wiki do seem to indicate scope for performance optimizations - and maybe efficient machine code generation, e.g. no/limited access to the global object, immutable functions. Though I don't know if that's a consideration for the SES project.
On Jun 23, 2009, at 1:21 PM, kevin curtis wrote:
At some point we might have ESn remove ancient cruft so the horizontal arrows might stop pointing rightward.
A strong +1 for doing this in ES6. IMO ES5 gives developers a strong backward compatible cross browser foundation for the short/medium term. Thus, ES6 can risk a bit of backward compatibility - not sure how the wider ES community feels.
That's not how compatibility works. We don't have an "ES5" engine and
a separate "ES6" engine, nor do many (most I think) other browser
vendors. With a single engine, we do not want too many runtime version
tests (ideally, zero) apart from strict mode.
So the way a future ESn might break compatibility would involve the
web moving away from the deprecated feature first. Examples we've
discussed recently include the never-standardized foo.arguments and
foo.caller (also the now IE-only arguments.caller) properties.
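As a concrete reference point for the properties named above, the sketch below shows how ES5 strict mode already refuses them for functions that opt in. The helper names are hypothetical; the sketch assumes an engine that implements the specified restriction on `caller` for strict functions, as current engines do.

```javascript
// arguments.callee is poisoned inside strict-mode functions.
function strictCallee() {
  "use strict";
  try {
    return arguments.callee;
  } catch (e) {
    return e.name;
  }
}

// Reading fn.caller on a strict-mode function throws as well.
function strictFn() {
  "use strict";
}

function readCaller(fn) {
  try {
    return String(fn.caller);
  } catch (e) {
    return e.name;
  }
}

console.log(strictCallee());       // TypeError
console.log(readCaller(strictFn)); // TypeError
```

Non-strict functions are left alone, so existing pages keep working; only code that has opted in loses the feature, which matches the "web moves away first" path described above.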
On Jun 23, 2009, at 11:11 AM, Brendan Eich wrote:
On Jun 23, 2009, at 10:03 AM, kevin curtis wrote:
The requirement of security and speed don't always coincide! The
subset/dialect idea is interesting.
It's two-edged.
Adding standard subsets leads to case-analysis explosion, a recipe
for bugs and reduction of interoperability.
How subsets might evolve in future editions is another axis for
explosion of cases. Are subsets partially ordered in the same way
across future editions? Or should there be a total order? Can strict
mode get stricter over time?
ES5 ---------------> ES6 ...
 ^                    ^
 |                    |
ES5 strict <---?---> ES6 strict ...
The arrowhead points to the superset. At some point we might have
ESn remove ancient cruft so the horizontal arrows might stop
pointing rightward.
These are good questions and there may be compelling and simple
answers, but only for very few standard subsets. If you can't draw
the lattice, it's too complicated.
Ideally, IMHO, the only standard subset will be strict mode.
let x:int = 0 // ES6 type annotation to indicate this will be
turned into a c int at some point ... etc
We do not want int "for performance" if it auto-widens to double.
Adobe has experience here, and int as an annotation on local
variables (e.g. loop controls) is often a de-optimizer, yet users
over-use it "for speed".
If as you propose ("C int", capital C meaning the C language, I take
it) we enable 32-bit machine int under a pragma, we'll have
wraparound bugs on the web. (People will copy and paste the pragma
to excess.)
Because of the web-as-it-is, implementors have had to optimize JS as
it is used.
But this has demonstrated, to me at least, that the important
language optimizations can be done well under the hood, without
hinting. IMHO this is a good use of human capital, compared to the
alternative of unleashing pragmas and machine types on the web
developer masses, where the pragmas and types add complexity and
often bite back.
I'd just like to double +1 this point. Getting things to be fast on
the web came out of a group realization that web developers were
leaning on JS since HTML wasn't evolving nearly fast enough to meet
their needs. The performance of modern JS VM's is something us lowly
DHTML hackers only dreamed of even a couple of years ago. JS got
exercised for these tasks because it was both terse and flexible.
Taking either of those properties away without a demonstrated need
feels like a much larger loss to a JS programmer than I suspect it
will to folks who are used to C++ or Java.
Getting some form of acceleration out of commonly-run scripts seems
like a great goal, but I'm dubious that it should happen at the
language level. I can more easily imagine something like HTML 5
AppCache installation kicking off a pre-compile step that allows
mostly-static script resources to elide away first-run compilation or
in other ways apply more processor power to generating better/faster-
starting code.
Alex Russell
slightlyoff at google.com
alex at dojotoolkit.org
On 2009-06-27, at 22:12EDT, Alex Russell wrote:
But this has demonstrated, to me at least, that the important
language optimizations can be done well under the hood, without
hinting. IMHO this is a good use of human capital, compared to the
alternative of unleashing pragmas and machine types on the web
developer masses, where the pragmas and types add complexity and
often bite back.
I'd just like to double +1 this point.
Me 3.
Compiler cycles we have plenty of, it's programmer cycles we need to
optimize.