Source maps (was: Multiline Strings)

# Nick Fitzgerald (11 years ago)

4. Browsers are still all over the place in how they report Error stack trace information.

We (Firefox Developer Tools) purposefully don't expose source mapped stacks to the web because it would require either adding some kind of API to know when source maps are done being fetched, or blocking(!) on Error.prototype.stack until the source map is fetched. We also avoid fetching source maps unless the debugger is open, so this would expose to the web whether the debugger is open. Furthermore, you wouldn't want to have only the source mapped stack; you would also want the plain JS stack if you think that the source map could be bogus or if you are debugging the source maps you are generating as a tool author. This would further complicate the stack string.

1. There is not yet a standard for sourcemaps. But see Source Map Revision 3 Proposal, Chrome Developer Tools Docs, and mozilla/source-map. Would someone care to champion this for inclusion in ES7?

If a debug format for targeting JavaScript were to be standardized, it should do more than simply file/line/column translation. My thoughts on this subject outgrew an email reply, so I have collected them here: fitzgeraldnick.com/weblog/55

# K. Gadd (11 years ago)

I agree that the current source maps format does a poor job of handling many translation/transpiling/js-as-compilation-target scenarios. I do think it is worthwhile to try to build on the existing format instead of reinventing it or significantly overhauling it, though. The things it does, it does a pretty decent job of [1], and it seems to be composable, so you can update your source map at each stage in your JS build pipeline. It just needs to do a good job of mapping other things like scopes, stack frames, etc.
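To make "composable" concrete, here is a minimal sketch of chaining two stages of position mappings. It assumes each stage's map has already been decoded into a plain dictionary, and it only follows exact position matches; real tools look up the nearest preceding mapping instead. The function name `compose` and the sample positions are hypothetical, not part of any source map library.

```python
def compose(later_stage, earlier_stage):
    """Chain two position maps so the result points at the original source.

    later_stage:   {(gen_line, gen_col): (mid_line, mid_col)}
                   e.g. minifier output -> compiler output
    earlier_stage: {(mid_line, mid_col): (source, orig_line, orig_col)}
                   e.g. compiler output -> hand-written source
    """
    composed = {}
    for generated_pos, intermediate_pos in later_stage.items():
        original = earlier_stage.get(intermediate_pos)
        # Assumption: positions without an exact match are dropped.
        if original is not None:
            composed[generated_pos] = original
    return composed


# Hypothetical two-stage pipeline: source.cljs -> out.js -> out.min.js
compiler_map = {(10, 4): ("source.cljs", 3, 0), (11, 2): ("source.cljs", 4, 2)}
minifier_map = {(0, 120): (10, 4), (0, 180): (11, 2), (0, 200): (99, 0)}

final_map = compose(minifier_map, compiler_map)
```

Here `final_map` maps positions in the minified file straight back to `source.cljs`, and the position with no counterpart in the earlier stage is left out.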

I'll also point out that the current proposal (rev 3, as linked) is very opaque, and despite multiple readings I failed to understand core features until someone explained them to me. Even now, I can't really tell how variable name mapping is accomplished with the current proposal, and VLQ is barely understandable. A clearer description of the spec would probably make it much easier for people to generate usable source maps or write software that uses them.
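For readers stuck on the same point, a minimal sketch of base64 VLQ decoding as I read the rev 3 proposal: each base64 digit carries 5 payload bits plus a continuation bit, and the lowest bit of the assembled value is the sign. The function name `decode_vlq_segment` is made up for illustration. A segment decodes to 1, 4, or 5 fields (generated column, source index, original line, original column, and optionally an index into the `names` array), each stored as a delta from the previous value; that optional fifth field is how a name gets attached to a mapping.

```python
BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
BASE64_VALUE = {ch: i for i, ch in enumerate(BASE64_CHARS)}

CONTINUATION_BIT = 0b100000  # set when more digits of the same field follow
DATA_MASK = 0b011111         # the 5 payload bits of each digit


def decode_vlq_segment(segment):
    """Decode one segment (no ',' or ';') into a list of signed integers."""
    fields = []
    value = 0
    shift = 0
    for ch in segment:
        digit = BASE64_VALUE[ch]
        value |= (digit & DATA_MASK) << shift
        if digit & CONTINUATION_BIT:
            shift += 5  # next digit supplies the next 5 higher bits
            continue
        # Field complete: bit 0 is the sign, the remaining bits the magnitude.
        magnitude = value >> 1
        fields.append(-magnitude if value & 1 else magnitude)
        value = 0
        shift = 0
    if shift:
        raise ValueError("segment ends in the middle of a VLQ value")
    return fields
```

For example, `decode_vlq_segment("AAgBC")` returns `[0, 0, 16, 1]`: the third field needs two digits (`g` carries the continuation bit, `B` supplies the higher bits).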

My particular projects would definitely benefit greatly from a usable version of the source maps spec, so I would love to help contribute to the process of improving them. I have not seen any obvious way to contribute in the past as the format seems to have been the result of private development at Google and/or Mozilla. I do applaud the original developers' attempts to keep the spec narrowly focused in terms of what problems it solves and how it solves them.

-kg

[1] One exception to this: Dear god, why are the mappings a giant string full of oddly-encoded variable-length data represented as base64? I cannot imagine a worse format. This fails on multiple counts:

a) I can't see any way to efficiently stream in this data; you have to parse it all at once (especially since it's JSON).

b) You have to perform multiple passes on this data to make sense of it, unless you create a customized parser: First you have to parse the JSON, then you have to parse the mapping string to find the split points (the semicolons), and then once you've done that you have to undo the weird base64+VLQ encoding to get actual integers that you can use as an internal representation. The third step can be deferred, at least.

c) The fact that the data is all in one big chunk (even if it is condensed) means that you have to generate it all at once and load it all at once. That seems intrinsically awful for large source files and it increases memory demands on each stage of your compilation pipeline (this may be a big issue given that compilation can already be really demanding on memory.)
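The first two passes described in (b) can be sketched as follows, leaving the VLQ decoding deferred. This is an illustration only: the function name `split_mappings` and the tiny sample map are hypothetical. Note that the whole JSON document is in memory before the first segment is visible, which is the complaint in (a) and (c).

```python
import json


def split_mappings(source_map_text):
    """Pass 1: parse the JSON. Pass 2: split 'mappings' into lines and segments.

    Segments are kept as raw base64 VLQ strings, so the third pass
    (decoding them to integers) can be deferred until a line is needed.
    """
    source_map = json.loads(source_map_text)          # pass 1: whole file at once
    lines = []
    for line in source_map["mappings"].split(";"):    # ';' ends a generated line
        lines.append(line.split(",") if line else [])  # ',' separates segments
    return source_map, lines


# Hypothetical three-line generated file; the second line has no mappings.
sample = json.dumps({
    "version": 3,
    "file": "out.js",
    "sources": ["in.js"],
    "names": [],
    "mappings": "AAAA;;AACA,CAAC",
})

source_map, segment_lines = split_mappings(sample)
```

After this, `segment_lines[2]` holds the two undecoded segments for the third generated line, and nothing has been VLQ-decoded yet.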

The alternate 'sections' format seems to address some of this, but it is not clear if it addresses all of it. It is unclear to me why mappings is a single string instead of an array of smaller strings. In all honesty, I think this entire format could be replaced with a considerably simpler protobuf-based format, and all the parsing/generation code could be automatically generated by the tools that handle protobuf formats. It would be a win for everyone (but then we'd have to figure out a way to roll the new format out; yuuuuck.) JSON + base64 is simply not appropriate for a file format that will largely be handled by JS runtimes and debuggers with significant performance and memory concerns.
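For reference, a sketch of how the 'sections' (index map) layout can be consulted, assuming the shape given in the rev 3 proposal: a `sections` array whose entries carry an `offset` of `line` and `column` plus either a `url` or an embedded `map`, sorted by offset. The file names and the helper `find_section` are made up for illustration.

```python
import bisect

# Hypothetical index map: each section covers generated code from its
# offset up to the start of the next section.
index_map = {
    "version": 3,
    "file": "app.js",
    "sections": [
        {"offset": {"line": 0, "column": 0}, "url": "part1.js.map"},
        {"offset": {"line": 100, "column": 10}, "url": "part2.js.map"},
    ],
}


def find_section(index_map, line, column):
    """Return the section covering a generated position, or None."""
    sections = index_map["sections"]
    # Assumes sections are already sorted by offset and do not overlap.
    starts = [(s["offset"]["line"], s["offset"]["column"]) for s in sections]
    i = bisect.bisect_right(starts, (line, column)) - 1
    return sections[i] if i >= 0 else None
```

Only the section that covers the position of interest has to be fetched and decoded, which addresses the load-it-all-at-once concern per section, though each section's own `mappings` is still a single string.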

# joe (11 years ago)

On Wed, Mar 12, 2014 at 2:00 PM, Nick Fitzgerald <fitzgen at gmail.com> wrote:

If a debug format for targeting JavaScript were to be standardized, it should do more than simply file/line/column translation. My thoughts on this subject outgrew an email reply, so I have collected them here: fitzgeraldnick.com/weblog/55

Your AST annotation idea is interesting. How would it work? Would the sourcemaps contain the full AST themselves, or just paths? I assume the AST would be based on ECMAScript's grammar, but that raises the question of how you deal with language extensions.

Or would the sourcemaps contain their own AST definitions?

# David Nolen (11 years ago)

As the maintainer of the ClojureScript compiler this doesn't sound like much of a simplification. The sum total of source map support in ClojureScript is < 400 lines of Clojure. To support what's being proposed would add a significant amount of complexity for something we don't care about at all - the JavaScript AST. We currently rely entirely on the Google Closure Compiler for the final pass as it offers best of class minification, optimization, and dead code elimination.

A source map format based on annotating a JS AST seems to introduce a lot of complexity once you start thinking about how this information will be preserved over multiple stages of minification, optimization, and dead code elimination.

Some of your suggestions also seem to me to be best handled by other means and don't belong in a source map proposal at all - REPL support and printing/presentation of language objects.

FWIW, the ClojureScript community has completely embraced the existing source map technology. Some of the issues raised, like scope, we find to be minor issues that don't really impede effective debugging, as we try to avoid renaming and unclear munging as much as possible.

Honestly, the two things we really want - REPL support and printing of ClojureScript objects - could easily be addressed by providing appropriate simple hooks into the dev tools offered by a browser vendor. To have to bother with generating and annotating JS ASTs to achieve these two things just sounds like pointless work.

# Nick Fitzgerald (11 years ago)

I'm not married to the AST format I proposed.

I do feel very strongly that each language targeting JS shouldn't have to write a browser devtools extension for every browser its users want to debug.

I feel very strongly that users debugging their sources that were compiled to js should be able to set watch expressions and conditional breakpoints in their source language.

I feel very strongly that users should be able to inspect their program's environment and bindings whether or not those bindings have been compiled into JS variables or into an index in a typed array.

I feel very strongly that users should be able to inspect their values and data types directly rather than the implementation of those data types when compiled to JS. (Imagine using GDB and only printing binary blobs instead of a nice printout of a struct and its slot names and corresponding values).

Perhaps not all of that belongs in the debug format, but the functionality should be exposed somehow.

One thing I tried to stress about the importance of an extensible format was that it would be easy for compilers to progressively add more debugging information to the "source maps" they generated. If the compiler should decide that it will only ever do source location mapping, that would be fine as well.

Nick

# Bill Frantz (11 years ago)

On 3/14/14 at 3:02 PM, fitzgen at gmail.com (Nick Fitzgerald) wrote:

I feel very strongly that users debugging their sources that were compiled to js should be able to set watch expressions and conditional breakpoints in their source language.

My experience with debuggers says that while the vast majority of the time you want to be able to debug it in the language you wrote, sometimes you want to debug it in the language it broke in. Being able to look at the lower level language can clear up misconceptions about what a higher level construct means. It can also reveal compiler bugs.

It should also be recognized that all compiled programs break in machine language. :-)

# K. Gadd (11 years ago)

The accuracy of this aside, history shows that most of my users are not satisfied by 'just debug the JS, it's fairly readable'. Maybe emscripten, gwt, etc. users are more fluent in JS and don't mind debugging it, but based on what I've seen, maybe not...

I do think it's important that source maps don't obscure what's happening at the JS level, though - presumably all the modern debuggers let you toggle them back off once they're loaded, so that's satisfied?

# Brendan Eich (11 years ago)

I think so. Bill's point is well taken, but a tangent. The problem people face is debugging in their primary source language, mainly. Any problems with JS-level debugging are lesser and more readily solved.