debugging interfaces
It seems to me that the topics you mention here can easily fall within the charter of ECMA TC39. Personally, I think it will be better for the community if all standards work related to ECMAScript is done within one umbrella organization rather than having multiple organizations trying to divide up the ECMAScript standards turf. Ultimately, it's the same set of browser providers and often the same individuals that have to be involved with all of these activities, so it's just more efficient if we do it in a single venue.
I agree that a standards meeting is not the ideal place to do design work. No committee is. However, the way it really works is that most of the serious design work usually gets done by a few individuals outside of the "committee" process and then the design proposals are presented to be reviewed and tweaked by the committee. That isn't much different from how real design occurs in any venue.
The one possible downside for some people may be the ECMA membership requirement. However, any organization that has any actual standing to serve as a venue for creating standards is going to have some sort of criteria for participation.
This mailing list is actually not an official ECMA list. Instead it is essentially an open communications zone that is frequented both by individuals who are actively involved in ECMA TC-39 and by members of the community who are interested in the evolution of ECMAScript but are not ECMA members. I think it has been working pretty well in that regard. The topic you are interested in could also be discussed on this list or, if it becomes too confusing intermingling these topics with language design, we could set up a parallel list for debugging-related discussions.
Informally, this could get started simply by starting to have discussions on this list. To turn this work into an official TC39 activity the next step would probably be for the interested individuals who work for ECMA member companies (Google, Yahoo, Microsoft, Mozilla, Apple, Opera, etc.) to talk to their organizations' TC39 representatives. If you aren't affiliated with an ECMA member you might want to talk to the ECMA secretariat about what it takes to become a member. The next TC39 meeting is the last week of July in Redmond WA and this topic could easily be put on the agenda if there is member interest.
Thanks for bringing this up, Allen
From: es-discuss-bounces at mozilla.org [mailto:es-discuss-bounces at mozilla.org] On Behalf Of Patrick Mueller Sent: Tuesday, June 09, 2009 8:02 PM To: es-discuss Subject: debugging interfaces
Today on the serverjs mailing list, the subject of standardizing debugging interfaces came up.
http://is.gd/WmuI
As Mark notes in his post on the thread, something on the level of JPDA (JDI et al), or the more "modern" JVMTI (java.sun.com/javase/6/docs/technotes/guides/jvmti) might be the sort of things to target, in terms of functionality.
I also noticed Charles McCathieNevile mentioned a standardized debugging API in a movie recently placed up on Yahoo Theatre (at about 14:50):
http://developer.yahoo.com/yui/theater/mccathienevile-dragonfly.html
On a somewhat related note, there has been some work on making the debugging experience better for developers by making use of some non-standard conventions in source code. Two of these I'm familiar with are Firefox's "//@ sourceURL" annotation to 'name' eval() and Function() code blocks, and WebKit's displayName property to name otherwise anonymous functions. In both of these cases, the functionality is provided purely for the use of developer tooling - debugging and profiling. Links to more info on these here:
http://pmuellr.blogspot.com/2009/06/debugger-friendly.html
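As a rough illustration of the two conventions mentioned above, here is a minimal sketch. The annotation is spelled the way it is in this message; the file name "generated/adder.js" and the display name "toolbar.onClick" are made up for the example. An engine that does not recognise either convention simply sees a comment and an ordinary property, so the code behaves the same everywhere.

```javascript
// Name an eval'd block with the "//@ sourceURL" comment convention.
// "generated/adder.js" is a hypothetical name chosen for this sketch.
const source = [
  '(function (a, b) { return a + b; })',
  '//@ sourceURL=generated/adder.js'
].join('\n');

// Indirect eval evaluates the text in the global scope and returns the function.
const adder = (0, eval)(source);

// Give an otherwise anonymous function a tooling-only name via displayName.
// Engines without displayName support treat this as a plain property.
const handler = function () { return 'clicked'; };
handler.displayName = 'toolbar.onClick';
```

Neither annotation changes what the code computes; both exist only so that a debugger or profiler has something readable to show.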
I've run into a few people interested in looking into this, but it's not quite clear to me where work relating to this should happen. I tend to view standards groups as not the places to do design work, so didn't really think ECMA would be the right place to talk about this, but Mark Miller indicated it would be good to at least post the thought up here.
So, the question is, where might folks interested in this stuff work on this? Here? I was also thinking the nascent Open Web Advocacy group might be another place:
http://groups.google.com/group/openweb-group
Patrick Mueller - muellerware.org
I've wondered for some time if it weren't possible to harmonize stack traces across browsers. I have submitted a (simple) proposal on the WHATWG mailing list (lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020359.html), and got somehow "redirected" here ;-)
So here is the proposal, as I first submitted it. Please tell me if it is not exactly the best place for this. Part of the functionality proposed here also assumes a browser environment (html / js files) which probably cannot be implemented in some other environments (like JS interpreters). Also I do not care that much about the proposed property names. They are just here as examples.
Note: unfortunately, I do not "speak" Java at all, and thus I don't know JVM TI. I only skimmed through "JavaTM Virtual Machine Tool Interface (JVM TI)" (java.sun.com/javase/6/docs/technotes/guides/jvmti), but it seems it goes far beyond what I describe here.
What is the problem you are trying to solve?
If you want a traceback, you need to cater for multiple browsers' behaviours, and for different and incomplete information.
What is the feature you are suggesting to help solve it?
Harmonize JavaScript stack traces across browsers. There could be a global getStackTrace() method, which would return an array of stack frames, closest one first (or last, I don't care). Every stack frame object could have the following properties:
To identify what script we are in:
- the script filename, if external script (JSFileName)
- or HTML filename, in case of an inline script (HTMLFileName)
- or, in case of an "eval()" call, a special evalScope property must be set to true
- a reference to the script tag DOM object, if the script was not eval'd (scriptTag)
To identify where in the script we are:
- a simple way to get the script's full source code, whatever way it is eval'd or included. For example, a fullJSSourceCode property
- the line number, relative to the start of the script (sourceLine)
- the line number, relative to the start of the file containing the script. It is equal to sourceLine, except if it is an inline script in an HTML file (fileLine)
- the position of the substring delimiting the instruction in the source code, relative to the start of the line. This is especially useful if the JS source is "packed" or minified, and thus newlines have been suppressed (instructionOffsetStart and instructionOffsetEnd)
And about the environment:
- a reference to the "this" object in the given stack frame (thisObj)
- a reference to the function called (func). This is a function object.
- the arguments the function was called with, just like the arguments pseudo-array in a function (arguments)
- a reference to the variable object, that carries all of the variables that have been defined with the "var" statement, and that can be accessed in this stack frame ? (variables)
Also, exception objects that are thrown by the browser could by default have a stackTrace property. Please note that if the stack trace is obtained with getStackTrace, the closest stack frame object would always be the call to the getStackTrace() function. Thus it could easily be discarded if needed ( if( stackTrace[ 0 ].func == window.getStackTrace ) stackTrace.shift(); ).
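To make the shape of the proposal concrete, here is a minimal sketch of what a caller might see. Nothing here is a real API: getStackTrace() is replaced by a stand-in that returns a canned array, the helper makeFrame and the file name "app.js" are invented for the example, and only a few of the proposed properties are filled in.

```javascript
// Hypothetical frame object using some of the property names proposed above.
function makeFrame(func, sourceLine) {
  return {
    JSFileName: 'app.js',   // made-up file name
    HTMLFileName: null,     // not an inline script
    evalScope: false,       // not inside an eval() call
    sourceLine: sourceLine,
    fileLine: sourceLine,   // equal to sourceLine for an external script
    func: func
  };
}

// Stand-in for the proposed global; a real one would be supplied by the engine.
// Frames are ordered closest first, so the call to getStackTrace itself is frame 0.
function getStackTrace() {
  return [makeFrame(getStackTrace, 0), makeFrame(inner, 12), makeFrame(outer, 30)];
}

function inner() { return getStackTrace(); }
function outer() { return inner(); }

const stackTrace = outer();

// Discard the frame for getStackTrace() itself, as suggested in the proposal.
if (stackTrace[0].func === getStackTrace) stackTrace.shift();
```

After the shift, the first remaining frame is the one for inner(), which is the place the caller actually cares about.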
Why do you think browsers would implement this feature?
Many browsers already provide a facility for debugging, each giving different information. See for example: developer.mozilla.org/En/Core_JavaScript_1.5_Reference/Global_Objects/Error/Stack
It wouldn't be too hard to standardize that. The only potential problem I can see is keeping the references to the variables, arguments, and this object. It would be nice to have implementors' feedback on this.
Why do you think authors would use this feature?
See for example this page, it's an attempt to provide a stack trace in every browser: eriwen.com/javascript/js-stack-trace
What evidence is there that this feature is desperately needed?
It would help every JS programmer out there in JS debugging, and also libraries for unit testing, etc.
Jordan OSETE
On Mon, Jun 15, 2009 at 5:06 AM, Jordan Osete <jor at joozt.net> wrote:
Hi everybody.
I've wondered for some time if it weren't possible to harmonize stack traces across browsers. I have submitted a (simple) proposal on the WHATWG mailing list ( lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020359.html), and got somehow "redirected" here ;-)
So here is the proposal, as I first submitted it. Please tell me if it is not exactly the best place for this.
This is the best place. Thanks for posting.
Part of the functionality proposed here also assumes a browser environment (html / js files) which probably cannot be implemented in some other environments (like JS interpreters). Also I do not care that much about the proposed property names. They are just here as examples.
Note: unfortunately, I do not "speak" Java at all, and thus I don't know JVM TI. I only skimmed through "JavaTM Virtual Machine Tool Interface (JVM TI)" (java.sun.com/javase/6/docs/technotes/guides/jvmti), but it seems it goes far beyond what I describe here.
It does indeed. I do not suggest that a debugging API be modeled on it in any detail. However, it does have one great virtue: it is stratified in the sense explained by bracha.org/mirrors.pdf and not generally accessible from within the computation being reflected upon. This is essential to reconcile debuggability with strong encapsulation.
In particular, one of the motivations for introducing strict mode into ES5 is so that the encapsulation of strict functions can be safe even from non-strict functions. This is explained starting at slide <google-caja.googlecode.com/svn/trunk/doc/html/es5-talk/img39.html>, which corresponds to minute 41 of <www.youtube.com/watch?v=Kq4FpMe6cRs>. However, the mechanisms explained there were insufficient, resulting in the following agreement at the last EcmaScript meeting:
On Mon, Jun 1, 2009 at 5:11 PM, Waldemar Horwat <waldemar at google.com> wrote:
Rather than describing the evil things that implementations do with F.caller, we agreed to just impose a blanket prohibition of code peeking into the environment records or identity of strict functions on the stack. This way a test suite can ensure that F.caller does not reveal strict functions without us having to introduce the evil things into the standard. I'll write up proposed wording.
What is the problem you are trying to solve? * If you want a traceback, you need to cater for multiple browsers behaviours, and different and incomplete information.
What is the feature you are suggesting to help solve it? * Harmonize JavaScript stack traces across browsers. There could be a global getStackTrace() method, which would return an array of stack frames, closest one first (or last, I don't care). Every stack frame object could have the following properties:
To identify what script we are in:
- the script filename, if external script (JSFileName)
- or HTML filename, in case of an inline script (HTMLFileName)
- or, in case of an "eval()" call, a special evalScope property must be set to true
- a reference to the script tag DOM object, if the script was not eval'd (scriptTag)
To identify where in the script we are:
- a simple way to get the script's full source code, whatever way it is eval'd or included. For example, a fullJSSourceCode property
- the line number, relative to the start of the script (sourceLine)
- the line number, relative to the start of the file containing the script. It is equal to sourceLine, except if it is an inline script in an HTML file (fileLine)
- the position of the substring delimiting the instruction in the source code, relative to the start of the line. This is especially useful if the JS source is "packed" or minified, and thus newlines have been suppressed (instructionOffsetStart and instructionOffsetEnd)
And about the environment:
- a reference to the "this" object in the given stack frame (thisObj)
If the debugging API is unstratified or generally accessible, this violates encapsulation.
- a reference to the function called (func). This is a function object.
If the debugging API is unstratified or generally accessible, this violates encapsulation.
- the arguments the function was called with, just like the arguments pseudo-array in a function (arguments)
If the debugging API is unstratified or generally accessible, this violates encapsulation.
- a reference to the variable object, that carries all of the variables that have been defined with the "var" statement, and that can be accessed in this stack frame ? (variables)
If the debugging API is unstratified or generally accessible, this violates encapsulation.
However, if the debugging API is constructed and made available in a way that doesn't violate security, your list above may be a good start. Thanks.
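One way to picture the "stratified" idea is that reflective access lives in a separate mirror object rather than on the object being inspected, so only code that has been handed the mirror facility can look inside. The following is only a toy sketch under that reading: makeDebugHost, register, and mirrorFor are invented names, and in a real engine the inspection hook would be supplied by the implementation rather than registered by the object itself.

```javascript
// Hypothetical debug host: the only holder of the inspection hooks.
function makeDebugHost() {
  const inspectors = new WeakMap(); // target object -> function returning a snapshot
  return {
    register: function (target, inspect) { inspectors.set(target, inspect); },
    mirrorFor: function (target) {
      const inspect = inspectors.get(target);
      return inspect ? { variables: function () { return inspect(); } } : null;
    }
  };
}

// An object whose state is reachable only through its closure.
function makeCounter(debugHost) {
  let count = 0;
  const counter = {
    increment: function () { count += 1; return count; }
  };
  // Assumption for this sketch: the object cooperates by registering a hook.
  debugHost.register(counter, function () { return { count: count }; });
  return counter;
}

const host = makeDebugHost();
const counter = makeCounter(host);
counter.increment();
counter.increment();

// Code holding only `counter` cannot read `count`;
// code holding `host` can obtain a mirror and inspect it.
const observed = host.mirrorFor(counter).variables();
```

The point of the layering is that handing someone the counter does not hand them the debugger: access to the mirror facility is a separate decision.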
Mark S. Miller wrote:
Note: unfortunately, I do not "speak" Java at all, and thus I don't know JVM TI. I only skimmed through "JavaTM Virtual Machine Tool Interface (JVM TI)" (java.sun.com/javase/6/docs/technotes/guides/jvmti), but it seems it goes far beyond what I describe here. It does indeed. I do not suggest that a debugging API be modeled on it in any detail. However, it does have one great virtue: it is stratified in the sense explained by bracha.org/mirrors.pdf and not generally accessible from within the computation being reflected upon. This is essential to reconcile debuggability with strong encapsulation.
After a quick look into this PDF (very instructive, thanks), I think I understand the advantages of having a mirror API that is clearly distinct from the rest of the ES API. However, ES already has a number of reflection features built in that are clearly not stratified: the prototype and constructor properties, for...in statements that allow listing properties, typeof, instanceof, newer ones like the functions to define / look up a getter / setter, and even the actual stack trace APIs (like the Mozilla Stack object developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/Error/stack).
Also, there are a variety of environments a JS "thread" can run in today. For browsers, it may run in an HTML page, or in a Web worker, but it may also run in a command-line interpreter, and even as a server-side language. Will all those need different ways to get JS threads for examination ?
As for the browser, for example (sorry, it is by far the environment I am most familiar with), I think it would be nice to still allow a page to access this mirror API to "explore" another page, without needing explicit user consent under certain conditions (like a same-origin policy for the inspector / inspected page, for example). Test suites come to mind. Would this kind of things be acceptable, from a security point of view ?
It seems Opera Dragonfly is using some kind of "Scope" thing as its mirror / proxy API (dev.opera.com/articles/view/opera-dragonfly-architecture), and then the rest is pure web (Opera developers' insight would be welcome on that matter). However (still about the browser), even the proxy part can be moved to pure web technologies (like XHR, through some server), in a debugging lib. That would leave only the mirror API to define in the standard.
Jordan OSETE
On Tue, Jun 16, 2009 at 7:18 AM, Jordan Osete <jor at joozt.net> wrote:
After a quick look into this PDF (very instructive, thanks), I think I understand the advantages of having a mirror API that is clearly distinct from the rest of the ES API. However, ES already has a number of reflection features built in, that are clearly not stratified. From the prototype and constructor property, for...in statements that allow to list properties, typeof, instanceof, including new ones like the functions to define / lookup a Getter / Setter,
Yes, JavaScript is already pervasively reflective in a non-stratified way. However, none of these violate the one and only encapsulation mechanism present in EcmaScript -- functions evaluating to lexical closures that capture the variables in their scope. All the new reflective operators introduced by ES5 were careful to respect this boundary as well. De-facto JavaScript does have further reflective operators that do violate encapsulation -- <function-instance>.caller, <function-instance>.arguments, arguments.caller, and arguments.callee. ES5 specifies that these be disabled for strict functions so that the encapsulation of strict functions remains defensible.
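A small sketch may help show both halves of this point: a closure that hides its state, and a strict function refusing one of the encapsulation-breaking operators. The names makePurse and strictProbe are invented for the example.

```javascript
// Closure-based encapsulation: `balance` is reachable only through these two methods.
function makePurse(balance) {
  return {
    getBalance: function () { return balance; },
    deposit: function (amount) { if (amount > 0) balance += amount; }
  };
}

const purse = makePurse(10);
purse.deposit(5);

// In a strict function, reading arguments.callee throws a TypeError.
function strictProbe() {
  'use strict';
  try {
    return arguments.callee;
  } catch (e) {
    return e;
  }
}

const probeResult = strictProbe();
```

Here purse exposes no balance property at all, and probeResult ends up holding the TypeError rather than a reference to the function.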
including the actual stack traces API (like the Mozilla Stack object developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/Error/stack ).
This does violate information encapsulation and so does threaten confidentiality. However, it provides no access and so does not threaten integrity.
Also, there are a variety of environments a JS "thread" can run in today. For browsers, it may run in an HTML page, or in a Web worker, but it may also run in a command-line interpreter, and even as a server-side language. Will all those need different ways to get JS threads for examination ?
As for the browser, for example (sorry, it is by far the environment I am most familiar with), I think it would be nice to still allow a page to access this mirror API to "explore" another page, without needing explicit user consent under certain conditions (like a same-origin policy for the inspector / inspected page, for example). Test suites come to mind. Would this kind of things be acceptable, from a security point of view ?
Not by itself, no.
It seems Opera Dragonfly is using some kind of "Scope" thing as its mirror / proxy API ( dev.opera.com/articles/view/opera-dragonfly-architecture), and then the rest is pure web (Opera developers' insight would be welcome on that matter). However (still about the browser), even the proxy part can be moved to pure web technologies (like XHR, through some server), in a debugging lib. That would leave only the mirror API to define in the standard.
Will take a look. Thanks for the pointer.
(Sorry for the late answer. Other stuff got this thread out of my mind.)
I understand that specifying a complete stratified API is necessary, however as it is still far from being ready yet, would it still be possible to implement something close to my first proposal (minus the offending parts) ? As it stays quite simplistic (and only requires defining a single function), it could be ready much earlier, and would be a convenient replacement for the time being, until a complete stratified API is ready.
Well, just in case, here is what I would change about it today:
Keep some kind of global getStackTrace() function to get an array of stack frames. Each stack frame object has a number of properties to provide information about the given stack frame. I had divided this proposal in sections:
To identify what script we are in ("Script identification" section):
- I would not remove anything from this section.
- Maybe add an evalSourceURL property to identify the source file name for "named evals" (with "hacks" like this one, if something like that is ever standardized: blog.getfirebug.com/2009/08/11/give-your-eval-a-name-with-sourceurl).
To identify where in the script we are ("In-script localization"):
- I would keep this section as it is.
And about the environment:
- This is the offending part that needs rewriting, as it is the one that proposes providing direct references to objects and functions in the running environment (thus violating the JS-closures encapsulation mechanism).
- To avoid the direct references, while still providing most of the available information, those objects could be converted to a different representation of them, a representation which would not expose everything we want to keep safe. For example, the implementation could transform the object to JSON and provide the JSON'ed representation (it would have to watch for circular references though). Or create something like a "structured clone" (www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#structured-clone), or anything of the sort.
- It would still not be perfect, as it would provide no information for things like circular references, functions, or native objects (both JSON and structured-cloning algorithms would have to ignore any of these). It would also not allow testing for strict equality between objects, etc. However, it would still provide a bunch of useful information, hopefully enough for most use cases, while still keeping it simple. Actually, I really like the idea of using JSON. It is something that already exists, is well known, well understood, human readable, and widely supported. As it is a string, it can also easily be sent to a server or stored somewhere for later inspection.
- So for example, if we use JSON representation, 3 properties would remain for this section: thisObj (the JSON-Representation of the this object in the given stack frame), arguments (JSON-R of the arguments object), and variables (key-value pairs containing information about every variable accessible in the stack frame). The function object currently executed can not be represented.
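As a sketch of what such a JSON representation could look like in practice, the helper below (snapshot is an invented name) uses the standard JSON.stringify replacer hook to replace functions and already-visited objects with placeholder strings. It is deliberately crude: an object that is merely referenced twice gets the same placeholder as a truly circular one.

```javascript
// Produce a JSON string for a value without exposing live references.
function snapshot(value) {
  const seen = new WeakSet();
  return JSON.stringify(value, function (key, v) {
    if (typeof v === 'function') return '[Function]';
    if (typeof v === 'object' && v !== null) {
      // Already visited: either a cycle or a second reference to the same object.
      if (seen.has(v)) return '[Repeated or circular reference]';
      seen.add(v);
    }
    return v;
  });
}

// A `this` object as it might appear in a frame, with a method and a cycle.
const thisObj = { a: 1, f: function () {} };
thisObj.self = thisObj;

// Hypothetical frame entry holding only the string, never the object itself.
const frameInfo = { thisObj: snapshot(thisObj) };
```

Because frameInfo.thisObj is a plain string, the holder of the frame can read the state but cannot call methods on the original object or mutate it.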
I hope this makes the proposal more acceptable. It can of course still be improved or extended.
Jordan OSETE
Mark S. Miller wrote:
Having recently implemented stack traces in v8 I can provide some input on this from an implementation perspective.
We have pretty strict performance requirements so I did a lot of experimentation to see how much information could be collected without affecting performance noticeably. The result was that it was feasible to collect the value of this, the function being called, and the current position within the function. Collecting arguments would have been too expensive. Collecting variables would have been much too expensive.
Processing the stack trace, say converting objects to JSON, isn't a big problem from a performance perspective as long as the processing doesn't have to happen when the stack trace is captured. In v8 we capture a "bare metal" trace when an error is created and use an accessor to get the .stack property which formats the stack trace the first time it is read.
Stack traces should not be limited to built-in errors, they should be available and efficient for user-defined errors as well. In v8 we have an Error.captureStackTrace(error, cutoff_opt) function that allows you to attach a stack trace to any object, say:
function MyError(message) {
  this.message = message;
  Error.captureStackTrace(this, MyError);
}
If you pass a function as the cutoff argument, all stack frames above and including the topmost call to that function will be left out of the stack trace. This was intended to solve the getStackTrace problem you describe in the proposal, but it turns out that it can also be useful to hide more of the stack trace, to avoid cluttering it with the internal mechanics of library code.
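The cutoff behaviour described above can be tried out with a sketch like the following. Error.captureStackTrace is V8-specific, so the call is guarded and the example still runs (without a trace) on other engines; the functions libraryInternal and userCode are invented for the demonstration.

```javascript
function MyError(message) {
  this.message = message;
  // V8 only: attach a trace to `this`, omitting MyError and everything above it.
  if (typeof Error.captureStackTrace === 'function') {
    Error.captureStackTrace(this, MyError);
  }
}
MyError.prototype = Object.create(Error.prototype);
MyError.prototype.constructor = MyError;
MyError.prototype.name = 'MyError';

// Hypothetical call chain used only to give the trace some frames.
function libraryInternal() { return new MyError('boom'); }
function userCode() { return libraryInternal(); }

const err = userCode();
```

On an engine that supports it, err.stack should start at libraryInternal rather than at the MyError constructor, since the constructor frame is the cutoff.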
Having a getStackTrace function might be useful but is problematic because, as I read it, it both collects and formats the stack trace, which is expensive. For performance reasons we want to postpone formatting using an accessor. That's the reason for the otherwise odd api of passing the error object into captureStackTrace and having it set up the appropriate accessor.
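The capture-now, format-later split can be imitated in plain JavaScript with an accessor property. This is only a sketch of the idea, not how v8 implements it: attachLazyStack, formatFrames, and the frame fields are invented names, and the raw frames are supplied by hand.

```javascript
let formatCalls = 0; // counter kept only to show when formatting actually runs

// The expensive step: turn raw frame records into a readable string.
function formatFrames(rawFrames) {
  formatCalls += 1;
  return rawFrames
    .map(function (f) { return '    at ' + f.functionName + ' (' + f.file + ':' + f.line + ')'; })
    .join('\n');
}

// Store the cheap raw data now; format on first read of .stack and cache the result.
function attachLazyStack(target, rawFrames) {
  let cached;
  Object.defineProperty(target, 'stack', {
    configurable: true,
    enumerable: false,
    get: function () {
      if (cached === undefined) cached = formatFrames(rawFrames);
      return cached;
    }
  });
  return target;
}

const failure = attachLazyStack({ message: 'boom' }, [
  { functionName: 'inner', file: 'app.js', line: 12 },
  { functionName: 'outer', file: 'app.js', line: 30 }
]);
```

If nobody ever reads failure.stack, formatFrames never runs, which is the property that makes capturing traces on every error affordable.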
Christian Plesner Hansen wrote:
Having recently implemented stack traces in v8 I can provide some input on this from an implementation perspective.
Thank you for your feedback.
We have pretty strict performance requirements so I did a lot of experimentation to see how much information could be collected without affecting performance noticeably. The result was that it was feasible to collect the value of this, the function being called, and the current position within the function. Collecting arguments would have been too expensive. Collecting variables would have been much too expensive.
Still nice to know that the most important information (script identification and in-script localization), can be collected without too much effort.
Unfortunately the current function cannot be converted to JSON yet (as it is a function), so I don't know if it can be used. Maybe just add a functionName property to the stack frame, it would still be useful for named functions. But the state of the this object is definitely useful. It's a shame that we cannot get information about the variables and arguments objects, though. Couldn't it be possible to work around that ?
Processing the stack trace, say converting objects to JSON, isn't a big problem from a performance perspective as long as the processing doesn't have to happen when the stack trace is captured. In v8 we capture a "bare metal" trace when an error is created and use an accessor to get the .stack property which formats the stack trace the first time it is read.
Now, to me the main issue with postponing the information collection seems to be the fact that the objects could be altered between the time the stack trace is created, and the time the accessor is first read. However, if it is the only way to achieve acceptable performance, then it is still better than having no information at all. It should be clearly stated in the API description, so that developers know its limitations.
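The limitation described here is easy to reproduce with a small sketch. The frame object and its thisJSON accessor are invented for the example; the only point is the difference between serializing at capture time and serializing on first read.

```javascript
const thisObj = { state: 'at capture time' };

// Eager: serialize immediately, so later changes are not reflected.
const eager = JSON.stringify(thisObj);

// Deferred: keep only a reference and serialize when the accessor is read.
const frame = {
  thisRef: thisObj,
  get thisJSON() { return JSON.stringify(this.thisRef); }
};

// The program keeps running and changes the object before anyone inspects the trace.
thisObj.state = 'mutated later';

const lazy = frame.thisJSON;
```

The eager string still describes the object as it was when the trace was taken, while the deferred one reports the mutated state, which is exactly what an API description would need to warn about.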
Then, if we accept the idea of keeping only a reference to the objects until information about them is explicitly requested, couldn't we do the same things for the variables and arguments objects ? I don't know the internals of V8, but when the engine enters a function, does it have some kind of reference to an object holding the accessible variables (in case a closure is created inside the given function) ? It is not important whether it is just the variables created in the function, or all of the variables created in "parent" functions. Actually I think it is even better if we can have references to the variables created in parents functions as well:
function A(){
  var a = 32;
  function B( c ){
    var b = 48;
    ...
  };
  ...
  B( 64 );
};
Now if we get a stack trace from inside the call to function B( 64 ), and if we could have some variables object, we could just store it, and the accessor would upon reading return something like { a: 32, b: 48, c: 64 }. The user of the stack trace can sort it out by himself. (In that case, the arguments object would be an array with length=1, [ 64 ] )
Still, since V8 makes a bunch of optimizations (and other modern JS engines as well), I guess it may not keep references to all accessible variables at all times, if it knows there won't be a closure and it won't be needed...
Then, even if that is the case, there are times when performance may not matter that much, like in the development phase of an application. Would it be possible to have some kind of debugAtAnyCost mode to allow "slower but more informative" stack traces ?
Stack traces should not be limited to built-in errors, they should be available and efficient for user-defined errors as well. In v8 we
Indeed.
have an Error.captureStackTrace(error, cutoff_opt) function that allows you to attach a stack trace to any object, say:
function MyError(message) {
  this.message = message;
  Error.captureStackTrace(this, MyError);
}
If you pass a function as the cutoff argument, all stack frames above and including the topmost call to that function will be left out of the stack trace. This was intended to solve the getStackTrace problem you describe in the proposal, but it turns out that it can also be useful to hide more of the stack trace, to avoid cluttering it with the internal mechanics of library code.
If I understand it well, Error.captureStackTrace only works with an object that has been used with throw ? It is OK with me.
However, now that I think about it, if we want to add the "functionName" to the stack frame, it would allow users to implement almost the same functionality with a loop. This may be slower than your approach however, so we can probably have both. By the way, could Error.captureStackTrace accept more than one cutoff optional argument ? This way, the user can for example remove any of the functions it may be called from.
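Here is a sketch of the "do it with a loop" idea, assuming frames are ordered closest first and each carries the suggested functionName property. The helper name dropLeadingFrames and the frame names are invented, and skipping the leading run of matching frames is just one possible reading of what several cutoff functions should mean.

```javascript
// Remove the closest frames for as long as their names are in the cutoff list.
function dropLeadingFrames(frames, cutoffNames) {
  let start = 0;
  while (start < frames.length &&
         cutoffNames.indexOf(frames[start].functionName) !== -1) {
    start += 1;
  }
  return frames.slice(start);
}

// Hypothetical trace from a test helper, closest frame first.
const rawTrace = [
  { functionName: 'captureHelper' },
  { functionName: 'assertEqual' },
  { functionName: 'testLogin' },
  { functionName: 'runSuite' }
];

const trimmed = dropLeadingFrames(rawTrace, ['captureHelper', 'assertEqual']);
```

Matching on names is weaker than matching on function identity, since two different functions can share a name, which is one reason a built-in cutoff argument may still be worth having.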
Having a getStackTrace function might be useful but is problematic because, as I read it, it both collects and formats the stack trace, which is expensive. For performance reasons we want to postpone formatting using an accessor. That's the reason for the otherwise odd api of passing the error object into captureStackTrace and having it set up the appropriate accessor.
It seems OK to me as well, but if someone really needs a stack trace from somewhere, she can just throw an exception, catch it, and use captureStackTrace on it, so it won't really prevent code from getting a stack trace anywhere it wants. So the spec should really warn the users about possible performance degradation when using this feature.
Jordan OSETE
On Aug 13, 2009, at 3:38 AM, Christian Plesner Hansen wrote:
Having recently implemented stack traces in v8 I can provide some input on this from an implementation perspective.
We have pretty strict performance requirements so I did a lot of experimentation to see how much information could be collected without affecting performance noticeably. The result was that it was feasible to collect the value of this, the function being called, and the current position within the function. Collecting arguments would have been too expensive. Collecting variables would have been much too expensive.
I can understand wanting to optimize stack trace generation for the
case of your captureStackTrace() method, but for debugging purposes, I
would think the performance wouldn't matter.
There are a couple of different scenarios I can imagine you might want
stack trace information, with differing expectations of performance
and detail available:
- runtime usage, as in the example of captureStackTrace() - should be highly optimized and may not contain full detail
- debugger usage, where the debugger has actually paused the current thread of execution, and so performance shouldn't be an issue (relative to performance of the current thread, which is paused), and should contain as much detail as is feasible.
- performance tool usage - something between the two - think something like a profiler which might need more detail than captureStackTrace() but less than the debugger, and would also have respective performance requirements between the two as well.
The third seems less important right now.
For the first issue, existing plain old user-land runtime usage, there
was an interesting thread in the webkit mailing list yesterday:
http://thread.gmane.org/gmane.os.opendarwin.webkit.devel/9399/focus=9407
I guess there's no specification for how to get any source location
information from an Error. That would be nice to have, even without a
captureStackTrace(), so that we don't need to have vendor-specific
SPIs to do it. Presumably something like a "source", "line" and
"column" properties, all optional. At least a common place for
implementors to hang this type of information, if they have it.
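Until something like this is specified, user code has to normalize the vendor-specific properties itself. The sketch below maps them onto the "source", "line" and "column" names suggested above. It assumes, to the best of my understanding, that Mozilla exposes fileName / lineNumber / columnNumber and WebKit exposes sourceURL / line; the helper name getErrorLocation is invented, and every field is optional.

```javascript
// Return whatever location information an error-like object carries,
// using the property names suggested above. Missing fields are left out.
function getErrorLocation(err) {
  const source = err.source || err.fileName || err.sourceURL;
  const line = err.line !== undefined ? err.line : err.lineNumber;
  const column = err.column !== undefined ? err.column : err.columnNumber;

  const location = {};
  if (source !== undefined) location.source = source;
  if (line !== undefined) location.line = line;
  if (column !== undefined) location.column = column;
  return location;
}

// Plain objects standing in for errors from two different engines.
const mozillaStyle = { fileName: 'app.js', lineNumber: 12, columnNumber: 4 };
const webkitStyle = { sourceURL: 'app.js', line: 12 };

const fromMozilla = getErrorLocation(mozillaStyle);
const fromWebkit = getErrorLocation(webkitStyle);
```

With a common set of property names in the language itself, a helper like this would not be needed at all.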
Patrick Mueller - muellerware.org
On Thu, Aug 13, 2009 at 6:49 AM, Patrick Mueller<pmuellr at yahoo.com> wrote:
There are a couple of different scenarios I can imagine you might want stack trace information, with differing expectations of performance and detail available:
runtime usage, as the example of captureStackTrace() - should be highly optimized and may not contain full detail
debugger usage, where the debugger has actually paused the current thread of execution, and so performance shouldn't be an issue (relative to performance of the current thread, which is paused), and should contain as much detail as is feasible.
performance tool usage - something between the two - think something like a profiler which might need more detail than captureStackTrace() but less than the debugger, and would also have respective performance requirements between the two as well.
A good list. Another is to accumulate traces for post-mortem debugging, as in www.hpl.hp.com/techreports/2009/HPL-2009-78.html and wiki.erights.org/wiki/Causeway. This becomes especially important for debugging systems consisting of multiple clients and servers. From a performance perspective, this is probably in the same category as performance tools.
To beat Allen to the re-sending punch (I hope), here's a reply-all I
wrote without noticing the message to which I was replying had gone to
es5-discuss not es-discuss (where it belonged).
To add a bit more detail, the closure optimizations in Firefox 3.5 do
indeed cause some debugger stress. Imagine you can break in an
optimized closure that was "flat" -- all the upvars it used were
write-once and written before it was computed, so were copied into
slots of the closure itself. The debugger driver might mutate such an
upvar behind the back of the optimizer. We don't try to regenerate code
to deoptimize on demand here -- instead we throw an exception if the
debugger gets access to such a closure via a path that the compiler
could not statically analyze.
This is sub-optimal, i.e., a bug in our debugger interface. Fixing it
is a challenge. But we don't want to spread the a priori "compile with
-g" meme, so we'll probably fix it soon.
/be
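As an illustration of the distinction described above, here is a minimal sketch of two closures: one whose captured variable is written once before the closure is created (a candidate for the "flat" representation), and one whose captured variable is reassigned afterwards. The function names are hypothetical, and whether a given engine actually applies the optimization is an assumption; it is not observable from script.

```javascript
// Candidate for a "flat" closure: `n` is written exactly once (as a
// parameter) before the inner function is created, so an engine could
// copy its value into the closure instead of keeping a live link.
function makeAdder(n) {
  return function add(x) {
    return x + n;
  };
}

// Not a candidate: `count` is reassigned after the closure is created,
// so the closure must observe later writes to the variable itself.
function makeCounter() {
  var count = 0;
  return function next() {
    count = count + 1;
    return count;
  };
}

var addFive = makeAdder(5);
var counter = makeCounter();
addFive(1);  // 6
counter();   // 1
counter();   // 2
```

Under a flat representation, a debugger that assigned to `n` while paused inside `add` would be writing to a variable the optimizer assumed was never written again, which is the mismatch the message describes.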
On Aug 13, 2009, at 11:33 AM, David-Sarah Hopwood wrote:
Jordan Osete wrote:
Then, if we accept the idea of keeping only a reference to the objects until information about them is explicitly requested, couldn't we do the same things for the variables and arguments objects ?
Keeping a reference to these would interfere with the use of optimized calling conventions where arguments are only stored in machine registers and never reified (when 'arguments' is not referred to). Similarly for variable frames, which need not be reified unless they are captured by a closure.
Right. Or, even moreso, if possible -- there may be no variable object
reified in an implementation that optimizes harder. ES1-5 take pains
to avoid any reference to such an object escaping. Firefox 3.5
optimizes various closure cases including non-escaping Algol-like
procedures (using a display) and closures capturing write-once
variables whose initialisers dominate the closure ("flat closures",
what Chez Scheme calls "display closures"). Assignment conversion (a
la Chez Scheme) is being explored.
If we impose a model where you have to "compile with -g" to get first-
class stack inspection, otherwise you get nothing, developers will
always turn on the deoptimizing debugger-friendly option, after being
burned by missing the chance to diagnose a bug in flagrante because
they were running in optimized mode.
I like Christian's proposal, FWIW. We should see about other VMs
implementing it.
Christian will correct me if I'm wrong, but I would assume the stack trace is captured at the point where Error.captureStackTrace is called, and so there is no requirement for the object to have been thrown.
Indeed. Capturing stack traces on error instantiation rather than throw has a number of nice properties: it avoids mutating the object being thrown, it's a simple way for rethrown errors to retain a full stack trace, and it avoids overhead on throwing.
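A small sketch of the capture-at-instantiation behavior described above. The function names are hypothetical, and the non-standard .stack property is only inspected if the engine provides it as a string (as V8 does); on other engines the sketch still runs but shows nothing about the trace.

```javascript
// The error is created here; an engine that captures on instantiation
// records the stack at this point.
function createErrorHere() {
  return new Error('boom');
}

// The error is thrown here, later and from a different call path.
// Throwing does not need to touch the object.
function rethrowLater(err) {
  throw err;
}

var err = createErrorHere();
var caught = null;
try {
  rethrowLater(err);
} catch (e) {
  caught = e;
}

// The caught value is the very same object that was created earlier.
var sameObject = caught === err;

// Where .stack is available, it reflects the creation site.
var namesCreationSite =
  typeof caught.stack === 'string' &&
  caught.stack.indexOf('createErrorHere') !== -1;
```

Because the trace belongs to the object from the moment it is created, rethrowing the same error from another place keeps the original trace.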
Right. Or, even moreso, if possible -- there may be no variable object reified in an implementation that optimizes harder. ES1-5 take pains to avoid any reference to such an object escaping. Firefox 3.5 optimizes various closure cases including non-escaping Algol-like procedures (using a display) and closures capturing write-once variables whose initialisers dominate the closure ("flat closures", what Chez Scheme calls "display closures"). Assignment conversion (a la Chez Scheme) is being explored.
Exactly. In v8 the variable and arguments objects are usually not reified, the variables are stored on the stack or in registers, and sorting out which source-level variables they correspond to is fairly expensive.
If we impose a model where you have to "compile with -g" to get first-class stack inspection, otherwise you get nothing, developers will always turn on the deoptimizing debugger-friendly option, after being burned by missing the chance to diagnose a bug in flagrante because they were running in optimized mode.
In v8 we have a strict "no modes" philosophy for similar reasons.
I like Christian's proposal, FWIW. We should see about other VMs implementing it.
I considered an alternative to the Error.captureStackTrace API which was an Error.getStackTrace call that returned a StackTrace object, rather than a string, with a toString method that did the formatting. I didn't want to stray too far from the functionality in other browsers so I ended up not doing that but I still like that better than captureStackTrace.
Patrick Mueller wrote:
I can understand wanting to optimize stack trace generation for the case of your captureStackTrace() method, but for debugging purposes, I would think the performance wouldn't matter.
Agreed.
Mark S. Miller wrote:
On Thu, Aug 13, 2009 at 6:49 AM, Patrick Mueller<pmuellr at yahoo.com> wrote:
There are a couple of different scenarios I can imagine you might want stack trace information, with differing expectations of performance and detail available:
runtime usage, as the example of captureStackTrace() - should be highly optimized and may not contain full detail
debugger usage, where the debugger has actually paused the current thread of execution, and so performance shouldn't be an issue (relative to performance of the current thread, which is paused), and should contain as much detail as is feasible.
performance tool usage - something between the two - think something like a profiler which might need more detail than captureStackTrace() but less than the debugger, and would also have respective performance requirements between the two as well.
A good list. Another is to accumulate traces for post-mortem debugging, as in
Agreed, again.
Brendan Eich wrote:
If we impose a model where you have to "compile with -g" to get first-class stack inspection, otherwise you get nothing, developers will always turn on the deoptimizing debugger-friendly option, after being burned by missing the chance to diagnose a bug /in flagrante/ because they were running in optimized mode.
Indeed.
Now if I understood it correctly, what makes all this information collection performance hungry is really about the arguments and variables. Everything else, from the current function name and this object, to the file name and in-code localization could be obtained with "acceptable" performance. (Please correct me if I am wrong. In particular, to which level of detail can we get for "in-code localization" without risking serious slowdown ? File name ? Function ? Line number ? Instruction boundaries ?)
Also storing references to arguments or variables for later use is impractical, as it would slow down execution dramatically. So the main issue about the potential inclusion of variable and arguments information is that when we still have got it, we don't know if it will ever be used. Always including it means wasting performance dramatically (and is a potential nightmare for the engine developers, but I'm sure they could manage it ;) ), but never including it means that we throw away information that could potentially be useful...
Now, how about letting the user ask for that information only at one point - when it is still here ? Or better: before. It may seem foolish, but if we allow some kind of way to tell that we desperately need that information - for example in the try statement - then the engine can enter the try statement knowing that we will need it.
try { ... }
catch( e, fullStackInformation ) // notice the second parameter here
{ ... }
Now, by including a second parameter in the catch clause, the engine knows that, for this particular try statement, we need particularly precise information, even if it may be at the cost of speed. It becomes an opt-in, but a finely grained one - not like a global compile switch. Of course, whether the second parameter is present or not, one can still use Error.captureStackTrace to get whatever information the engine can give him without performance penalties.
Brendan Eich wrote:
I like Christian's proposal, FWIW. We should see about other VMs implementing it.
Christian Plesner Hansen wrote:
I considered an alternative to the Error.captureStackTrace API which was an Error.getStackTrace call that returned a StackTrace object, rather than a string, with a toString method that did the formatting. I didn't want to stray too far from the functionality in other browsers so I ended up not doing that but I still like that better than captureStackTrace.
Although I must admit I was slightly surprised with this syntax at first, I am starting to get used to it now, and I think I like it. However I don't understand the part about returning a string. As I understood it, it returned a StackTrace object, pretty much like an array of stack frames. Could you please give us more details about what Error.getStackTrace and Error.captureStackTrace exactly do ?
Jordan OSETE
Now if I understood it correctly, what makes all this information collection performance hungry is really about the arguments and variables. Everything else, from the current function name and this object, to the file name and in-code localization could be obtained with "acceptable" performance. (Please correct me if I am wrong. In particular, to which level of detail can we get for "in-code localization" without risking serious slowdown ? File name ? Function ? Line number ? Instruction boundaries ?)
You can assume WLOG that capturing arguments and variables will be relatively expensive for any high-performance implementation. On the other hand that doesn't mean that we should disallow it, and having a common interface for implementations that do implement it would be good. For instance, I could imagine extending v8 stack capturing to include the arguments of just the topmost few frames.
In v8 we record the instruction pointer from which you can infer all other location information.
Now, by including a second parameter in the catch clause, the engine knows that, for this particular try statement, we need particularly precise information, even if it may be at the cost of speed. It becomes an opt-in, but a finely grained one - not like a global compile switch. Of course, whether the second parameter is present or not, one can still use Error.captureStackTrace to get whatever information the engine can give him without performance penalties.
I think this is a nice approach[1] but would require language surgery. I would prefer an approach that aligns more closely with existing implementations.
Although I must admit I was slightly surprised with this syntax at first, I am starting to get used to it now, and I think I like it. However I don't understand the part about returning a string. As I understood it, it returned a StackTrace object, pretty much like an array of stack frames. Could you please give us more details about what Error.getStackTrace and Error.captureStackTrace exactly do ?
Error.captureStackTrace(e) creates a .stack property on e that returns the stack trace as a string when accessed. Error.getStackTrace() returns a StackTrace object that describes the stack and whose toString yields a formatted stack trace string.
Error.captureStackTrace is a pretty limited and inflexible interface but consistent with how .stack works in other implementations. If you set the .stack property of your error to the StackTrace object returned by Error.getStackTrace, which was what I had in mind, you get a more clean and flexible model but code that can already handle stack traces may not be able to take advantage of it, for instance if it uses "typeof e.stack == 'string'" to check if a stack trace is present.
-- Christian
[1]: Indeed we used this approach for the neptune language, as described in this blog post: blog.quenta.org/2006/04/exceptions.html.
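To make the compatibility concern above concrete, here is a minimal sketch. The StackTrace constructor is a hypothetical stand-in for whatever Error.getStackTrace() might return (that API is a proposal in this thread, not something engines ship), and the frame strings are invented sample data.

```javascript
// Hypothetical stand-in for the object Error.getStackTrace() might return.
function StackTrace(frames) {
  this.frames = frames;
}
StackTrace.prototype.toString = function () {
  return this.frames.join('\n');
};

// The legacy check quoted above: only recognizes a string-valued .stack.
function hasStringStack(e) {
  return typeof e.stack == 'string';
}

// A more tolerant reader: accepts a string or any object with toString.
function stackText(e) {
  return e.stack == null ? null : String(e.stack);
}

var e = new Error('example');
e.stack = new StackTrace(['at f (a.js:1:1)', 'at g (a.js:2:1)']);
hasStringStack(e); // false
stackText(e);      // 'at f (a.js:1:1)\nat g (a.js:2:1)'
```

Existing code written like hasStringStack would conclude that no trace is present, while code written like stackText keeps working with either representation.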
Jordan Osete wrote:
Also storing references to arguments or variables for later use is impractical, as it would slow down execution dramatically. So the main issue about the potential inclusion of variable and arguments information is that when we still have got it, we don't know if it will ever be used. Always including it means wasting performance dramatically (and is a potential nightmare for the engine developers, but I'm sure they could manage it ;) ), but never including it means that we throw away information that could potentially be useful...
Now, how about letting the user ask for that information only at one point - when it is still here ? Or better: before. It may seem foolish, but if we allow some kind of way to tell that we desperately need that information - for example in the try statement - then the engine can enter the try statement knowing that we will need it.
try { ... }
catch( e, fullStackInformation ) // notice the second parameter here
{ ... }
Since the body of the try statement can call arbitrary other code, this doesn't help to decide which code should be compiled in a way that preserves extra debugging information. Remember that if we compile code to do that, it incurs overhead whether or not an exception actually occurs.
It would be possible to compile both optimized and deoptimized versions of each function, and check in the optimized version whether it is in the dynamic scope of such a 'try' block. (Actually, there's no need to restrict it to 'try' blocks if doing that.) However, that would still add the overhead of the check to the entry code for all optimized (and not inlined) functions. I think it would be an overspecification to require any such feature.
As Christian says, we might define a common interface for implementations that do want to support this, but I don't think it requires changes to language syntax. A 'runWithMoreDebugInfo(someFunction)' API would suffice.
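A userland sketch of how such an API could behave. This only simulates the idea: a real engine would keep the flag internally and use it to choose between optimized and deoptimized code. The helper names (moreDebugInfoRequested, captureFrame, work) are hypothetical and exist only for illustration.

```javascript
// Depth counter standing in for "are we inside the dynamic scope of a
// runWithMoreDebugInfo call?". A real engine would keep this internally.
var debugInfoDepth = 0;

function runWithMoreDebugInfo(fn) {
  debugInfoDepth += 1;
  try {
    return fn();
  } finally {
    debugInfoDepth -= 1; // restored even if fn throws
  }
}

// Stand-in for the entry check an optimized function would perform.
function moreDebugInfoRequested() {
  return debugInfoDepth > 0;
}

// Hypothetical capture routine: records arguments only when requested.
function captureFrame(name, args) {
  var frame = { name: name };
  if (moreDebugInfoRequested()) {
    frame.args = Array.prototype.slice.call(args);
  }
  return frame;
}

function work(a, b) {
  return captureFrame('work', arguments);
}

var plain = work(1, 2);                    // { name: 'work' }
var detailed = runWithMoreDebugInfo(function () {
  return work(1, 2);                       // { name: 'work', args: [1, 2] }
});
```

The check in moreDebugInfoRequested corresponds to the per-call overhead mentioned above: every function that wants to honor the request has to test the flag on entry.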
Christian Plesner Hansen wrote :
I think this is a nice approach[1] but would require language surgery. I would prefer an approach that aligns more closely with existing implementations.
Hm, yes. A backwards compatible approach might be better indeed. I liked the syntax though, but as current implementations treat it as a syntax error, it seems it can't be helped...
David-Sarah Hopwood wrote:
It would be possible to compile both optimized and deoptimized versions of each function, and check in the optimized version whether it is in the dynamic scope of such a 'try' block. (Actually, there's no need to restrict it to 'try' blocks if doing that.) However, that would still add the overhead of the check to the entry code for all optimized (and not inlined) functions. I think it would be an overspecification to require any such feature.
As Christian says, we might define a common interface for implementations that do want to support this, but I don't think it requires changes to language syntax. A 'runWithMoreDebugInfo(someFunction)' API would suffice.
Indeed. Also, on second thought, using syntax for this detailed / not detailed choice may not be the most flexible approach.
Before your mail, I thought of using a specific function to use before the given try statement instead:
Error.detailNextTry( true );
try{ ... }
catch( e ){ ... }
...But it feels like a hack. Your solution is indeed more elegant and flexible. Dunno why I got obsessed with this try statement. ^^;
Christian Plesner Hansen wrote :
Error.captureStackTrace(e) creates a .stack property on e that returns the stack trace as a string when accessed. Error.getStackTrace() returns a StackTrace object that describes the stack and whose toString yields a formatted stack trace string.
Error.captureStackTrace is a pretty limited and inflexible interface but consistent with how .stack works in other implementations. If you set the .stack property of your error to the StackTrace object returned by Error.getStackTrace, which was what I had in mind, you get a more clean and flexible model but code that can already handle stack traces may not be able to take advantage of it, for instance if it uses "typeof e.stack == 'string'" to check if a stack trace is present.
I don't know if there is that much legacy code like that. Currently, AFAIK every implementation has a different way to get a stack trace, and every browser gives different info formatted in different ways in those strings, so you have to parse it differently anyway... So unless we follow exactly what one of those existing browsers does, users will have to adapt their code to use it anyway... I would really prefer the Error.getStackTrace approach; it is, as you say, more flexible, and doesn't alter the error object. And if someone really wants the behavior of captureStackTrace, implementing it herself using getStackTrace should be straightforward.
Jordan OSETE
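A rough sketch of the layering suggested above, with captureStackTrace-like behavior built on a getStackTrace-like call. Both functions here are hypothetical userland approximations: getStackTrace simply wraps whatever string the engine exposes on the non-standard .stack property (possibly nothing), and the shape of the returned object is an assumption.

```javascript
// Hypothetical approximation of Error.getStackTrace(): wraps whatever
// string the engine puts on .stack, if anything.
function getStackTrace() {
  var raw = new Error().stack;
  var lines = typeof raw === 'string' ? raw.split('\n') : [];
  return {
    frames: lines,
    toString: function () {
      return lines.join('\n');
    }
  };
}

// captureStackTrace-like behavior layered on top: .stack yields a
// string, formatted lazily on first access.
function captureStackTraceVia(obj) {
  var trace = getStackTrace();
  var text = null;
  Object.defineProperty(obj, 'stack', {
    configurable: true,
    get: function () {
      if (text === null) {
        text = trace.toString();
      }
      return text;
    }
  });
  return obj;
}

var target = captureStackTraceVia({});
typeof target.stack; // 'string'
```

Code that checks typeof e.stack == 'string' keeps working with this wrapper, while callers who want structure can call getStackTrace directly.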
On Wed, Aug 19, 2009 at 3:31 AM, Jordan Osete<jor at joozt.net> wrote:
Christian Plesner Hansen wrote :
[...] for instance if it uses "typeof e.stack == 'string'" to check if a stack trace is present.
I don't know if there is that much legacy code like that. Currently, AFAIK every implementation has a different way to get a stack trace, and every browser gives different info formatted in different ways in those strings
As another data point, after a bit of testing I wrote:
/**
 * All the extra fields observed in Error objects on any supported
 * browser which seem to carry possibly-useful diagnostic info.
 * <p>
 * By "extra", we mean any fields other than those already
 * accessible to cajoled code, namely <tt>name</tt> and
 * <tt>message</tt>.
 */
var stackInfoFields = [
  'stack', 'fileName', 'lineNumber', // Seen in FF 3.0.3
  'description', // Seen in IE 6.0.2900, but seems identical to "message"
  'stackTrace', // Seen on Opera 9.51 after enabling
                // "opera:config#UserPrefs|Exceptions Have Stacktrace"
  'sourceURL', 'line' // Seen on Safari 3.1.2
];
By "supported browser" above, we mean the Yahoo A-grade[1] + Chrome.
Since the browser makers are all represented on this list, and since the possible non-enumerability of built-in methods prevents any systematic examination, I'll take this opportunity to ask: What else is there? On IE, is "description" indeed always equivalent to "message"? (If so, I'll drop it from the above list.)
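As a sketch of what a common place for this information could look like, the helper below maps the vendor-specific fields listed above onto the optional "source" and "line" names suggested earlier in the thread. The function name sourceLocation is hypothetical. No "column" is produced because none of the listed fields carries one.

```javascript
// Maps vendor-specific Error fields onto common, optional names.
// Works on any error-like object; missing fields are simply omitted.
function sourceLocation(e) {
  var loc = {};
  var source = e.fileName || e.sourceURL;   // Firefox name, then Safari name
  var line = e.lineNumber !== undefined ? e.lineNumber : e.line;
  if (source !== undefined) {
    loc.source = source;
  }
  if (line !== undefined) {
    loc.line = line;
  }
  return loc;
}

sourceLocation({ fileName: 'a.js', lineNumber: 3 }); // { source: 'a.js', line: 3 }
sourceLocation({ sourceURL: 'b.js', line: 7 });      // { source: 'b.js', line: 7 }
sourceLocation({});                                  // {}
```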
Today on the serverjs mailing list, the subject of standardizing
debugging interfaces came up.
As Mark notes in his post on the thread, something on the level of
JPDA (JDI et al), or the more "modern" JVMTI (java.sun.com/javase/6/docs/technotes/guides/jvmti), might be the sort of thing to target, in terms of functionality.
I also noticed Charles McCathieNevile mentioned a standardized debugging
API in a movie recently placed up on Yahoo Theatre (at about 14:50):
On a somewhat related note, there has been some work making the
debugging experience better for developers, by making use of some non-standard conventions in source code. Two of these I'm familiar with
are Firefox's "//@ sourceURL" annotation to 'name' eval() and
Function() code blocks, and WebKit's displayName property to name
otherwise anonymous functions. In both of these cases, the
functionality is provided purely for the use of developer tooling -
debugging and profiling. Links to more info on these here:
I've run into a few people interested in looking into this, but it's
not quite clear to me where work relating to this should happen. I
tend to view standards groups as not the places to do design work, so
didn't really think ECMA would be the right place to talk about this,
but Mark Miller indicated it would be good to at least post the
thought up here.
So, question is, where might folks interested in this stuff work on
this? Here? I was also thinking the nascent Open Web Advocacy group
might be another place:
Patrick Mueller - muellerware.org