forward-incompatible Function.prototype.toString requirement
On 16 April 2015 at 11:34, Michael Ficarra <mficarra at shapesecurity.com> wrote:
ES2015 section 19.2.3.5 (Function.prototype.toString) places four restrictions on the output of Function.prototype.toString, one of which is
If the implementation cannot produce a source code string that meets these criteria then it must return a string for which eval will throw a SyntaxError exception.
What is a SyntaxError today may not be a SyntaxError tomorrow. How can an implementation return a string that will satisfy this requirement in the future when the language has added new syntax? Does the committee have a SyntaxError recommendation that can be added as a non-normative note to that section?
In the (probably unlikely) case that the language changed that way, and an affected implementation adopted that change, then it would simply have to change its toString implementation accordingly at that point. I don't see a problem there.
The part that sticks out to me is... toString on functions currently throws a syntax error when eval'd for non-named functions. Tested in chrome:
var f = function(){ return 1 };
eval(f.toString()); // SyntaxError
// because
function(){ return 1 }; // SyntaxError
// but, force an expression:
eval("(" + f.toString() + ")") // essentially clones f
This... is confusing in my opinion.
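The behaviour described above can be reproduced with a short sketch. It assumes an engine that, like the Chrome build mentioned, prints an anonymous function expression without a name; the variable names `statementError` and `clone` are illustrative only.

```javascript
// An anonymous function expression; its printed source has no name.
const f = function () { return 1; };
const source = f.toString();

// Evaluated as a statement, the source starts with `function` and is parsed
// as a declaration, which requires a name, so parsing fails.
let statementError = null;
try {
  eval(source);
} catch (e) {
  statementError = e;
}

// Wrapped in parentheses, the same source is parsed as an expression.
const clone = eval("(" + source + ")");

console.log(statementError instanceof SyntaxError); // whether the bare form failed to parse
console.log(clone());                               // result of calling the re-evaluated copy
```

The parenthesised form yields a new function object, not the original one.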
On 16 April 2015 at 14:34, Frankie Bagnardi <f.bagnardi at gmail.com> wrote:
The part that sticks out to me is... toString on functions currently throws a syntax error when eval'd for non-named functions. Tested in chrome:
var f = function(){ return 1 };
eval(f.toString()); // SyntaxError
// because
function(){ return 1 }; // SyntaxError
// but, force an expression:
eval("(" + f.toString() + ")") // essentially clones f
This... is confusing in my opinion.
Yeah, the spec says:
"The string representation must have the syntax of a FunctionDeclaration FunctionExpression, GeneratorDeclaration, GeneratorExpression, ClassDeclaration, ClassExpression, ArrowFunction, MethodDefinition, or GeneratorMethod depending upon the actual characteristics of the object."
which is weird. First, it doesn't really make sense, because whether something originates from a declaration vs expression isn't a "characteristic of the object". Second, making it return different syntactic classes in different cases is not particularly useful for your case.
But then again, using the result of f.toString as input to eval is A REAL BAD IDEA anyway; toString should only be used for diagnostics, not programmatically, because that is meaningless in general. So I personally don't mind the friction.
On Thu, Apr 16, 2015 at 6:36 AM, Andreas Rossberg <rossberg at google.com> wrote:
Yeah, the spec says:
"The string representation must have the syntax of a FunctionDeclaration FunctionExpression, GeneratorDeclaration, GeneratorExpression, ClassDeclaration, ClassExpression, ArrowFunction, MethodDefinition, or GeneratorMethod depending upon the actual characteristics of the object."
which is weird. First, it doesn't really make sense, because whether something originates from a declaration vs expression isn't a "characteristic of the object". Second, making it return different syntactic classes in different cases is not particularly useful for your case.
But then again, using the result of f.toString as input to eval is A REAL BAD IDEA anyway; toString should only be used for diagnostics, not programmatically, because that is meaningless in general. So I personally don't mind the friction.
Disagree. The purpose of this change in toString spec from ES5 is primarily to support pass-by-copy closed functions. The intent was that it always work to evaluate them as expressions, i.e., with the surrounding parens.
It is this use as pass-by-copy that motivates the specific language about which lexical variable names the printed function source can mention -- it must not require any names that were not clearly required by an author looking at the source of the original function.
Note the security benefit. When executed in an environment without ambient authority (such as SES), the evaluated function is born endowed with no more access than that granted by these lexical variables, whose bindings must therefore be provided by the eval-uator (e.g., SES's confine < code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/ses/startSES.js#901>).
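As a rough illustration of the pass-by-copy idea, the sketch below re-evaluates a function's source with explicitly supplied bindings for its free variables. The helper `evaluateWithBindings` is hypothetical and is not SES's confine: it uses the `Function` constructor, so the copy can still reach globals, and it assumes the engine prints the original source text.

```javascript
// Hypothetical helper: evaluate `source` as an expression, with each key of
// `bindings` available as a lexical variable. NOT a security boundary.
function evaluateWithBindings(source, bindings) {
  const names = Object.keys(bindings);
  const values = names.map((name) => bindings[name]);
  // The bindings become parameters of a wrapper function, so the evaluated
  // function closes over them instead of over the original scope.
  const factory = new Function(...names, '"use strict"; return (' + source + ");");
  return factory(...values);
}

const factor = 3;
const scale = function (x) { return x * factor; };

// The copy sees only the `factor` handed to it by the evaluator.
const copy = evaluateWithBindings(scale.toString(), { factor: 10 });

console.log(scale(2)); // uses the original binding of `factor`
console.log(copy(2));  // uses the binding supplied by the evaluator
```

The point of the sketch is that the evaluator, not the original closure, decides what each free name refers to.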
I just meant that it seems confusing that it can both produce a FunctionExpression and "return a string for which eval will throw a SyntaxError exception."
I didn't mean to hijack this thread; the concern in the original email is noteworthy. There should be recommended syntaxes which future versions of the spec agree not to break, including current browser implementations. This gives any future engines, or modifications to current engines, a clearly defined acceptable way to format these strings.
This also gives libraries which parse functions at runtime (despite how good of an idea this is) quick bail cases without using eval. Examples of this are angular and require.js.
I agree with all this. Let's gather current precedent across JS implementations for function printings that are not, and should not be, parseable, and see if we can extract a clear spec that is consistent enough with current reality.
On v8:
> Object
function Object() { [native code] }
> Object.bind(Object)
function () { [native code] }
On SpiderMonkey:
> Object
function Object() {
[native code]
}
> Object.bind(Object)
function Object() {
[native code]
}
On JSC:
> Object
function Object() {
[native code]
}
> Object.bind(Object)
function Object() {
[native code]
}
Looks promising so far! Anyone care to do a more complete investigation and write up an initial proposal?
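One way to collect this kind of precedent on a given engine is to check whether a printed function parses at all. The sketch below is only an illustration; `parsesAsExpression` is a hypothetical helper, and it uses the `Function` constructor so that the source is parsed but never executed.

```javascript
// Hypothetical helper: true if `source` parses as a parenthesised expression.
function parsesAsExpression(source) {
  try {
    // Constructing the function parses the body without running it.
    new Function("return (" + source + ");");
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

const nativeSource = Object.toString();
const boundSource = Object.bind(Object).toString();

console.log(nativeSource, parsesAsExpression(nativeSource));
console.log(boundSource, parsesAsExpression(boundSource));
console.log(parsesAsExpression("function () { return 1 }"));
```

On the engines listed above, the `[native code]` placeholder is what prevents the first two strings from parsing.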
On Thu, Apr 16, 2015 at 6:55 AM, Mark S. Miller <erights at google.com> wrote:
Disagree. The purpose of this change in toString spec from ES5 is primarily to support pass-by-copy closed functions. The intent was that it always work to evaluate them as expressions, i.e., with the surrounding parens.
Hmmm. Looking again, where the strawman says:
if eval()uated in an equivalent-enough lexical context, would result in a function with the same [[Call]] behavior as the present one. Note that the new function would have a fresh identity and none of the original’s properties, not even .prototype. (The properties could of course be transferred by other means but the identity will remain distinct.)
the spec text at < people.mozilla.org/~jorendorff/es6-draft.html#sec-function.prototype.tostring>
says only
if the string is evaluated, using eval in a lexical context that is equivalent to the lexical context used to create the original object, it will result in a new functionally equivalent object.
This "functionally equivalent" is bizarrely vaguer than "same [[Call]] behavior" and, if taken literally, is clearly wrong. As the strawman (but not the spec) says clearly, evaling the returned string is not expected to replicate the properties of the original function object. And of course it cannot replicate the original function's identity.
Allen, is this a spec bug, or was this weakening intentional? If so, why?
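The difference between "same [[Call]] behavior" and "functionally equivalent" can be seen in a small sketch. It assumes the engine prints the original source text; the names `greet` and `version` are illustrative.

```javascript
function greet(name) { return "hello " + name; }
greet.version = 2; // an own property on the original function object

// Re-evaluate the printed source as an expression.
const copy = eval("(" + greet.toString() + ")");

console.log(copy("world") === greet("world"));   // call behaviour compared
console.log(copy === greet);                     // identity compared
console.log(copy.version);                       // own property on the copy, if any
console.log(copy.prototype === greet.prototype); // .prototype objects compared
```

The copy behaves the same when called, but it is a distinct object that carries none of the original's properties, which is what the strawman wording spells out and the spec wording does not.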
I did some research on this just last year — perfectionkills.com/state-of-function-decompilation-in-javascript (and originally back in 2009, when things were much wilder, perfectionkills.com/those-tricky-functions)
Corresponding tests with notes — kangax.github.io/jstests/function-decompilation
Just a few days ago, I was also thinking of adding tests to the ES6 compat table to check these exact (ES6-introduced) "toString representation requirements" from 19.2.3.5.
On Fri, Apr 17, 2015 at 7:10 AM, Juriy Zaytsev <kangax at gmail.com> wrote:
I did some research on this just last year — perfectionkills.com/state-of-function-decompilation-in-javascript (and originally back in 2009, when things were much wilder, perfectionkills.com/those-tricky-functions)
Corresponding tests with notes — kangax.github.io/jstests/function-decompilation
Cool, thanks. Lots of ancient history there. Of recent versions of browsers / engines, what problems do you see in arriving at ES6 conformance?
Btw, I was amused to see the old Caja es5-to-es3 transpiler case in there:
function f(){}; // function f() { [cajoled code] }
var f = function(){}; // function f$_var() { [cajoled code] }
which does raise the issue of how transpilers should cope with these requirements. If the source string produced does parse, what language layer should it be expressed in? (Note: Modern Caja uses SES, which usually does not transpile)
Just a few days ago, I was also thinking of adding tests to the ES6 compat table to check these exact (ES6-introduced) "toString representation requirements" from 19.2.3.5.
Please do! Beyond ES6 conformance, it looks like we could plausibly standardize something like the following for the "must not parse" case:
The function head should parse as some kind of function head.
The function body must match /^\s*{\s*\[[^{}\[\]]*\]\s*}\s*$/
i.e.,
{ [ non-"{}[]"-text ] }
The proposal would effectively commit us to never introducing a future syntax where this parses as a function body, addressing the issue Michael raises at the beginning of this thread. It would also allow reliable recognition of those cases intended not to parse for this reason, including present code recognizing future code (modulo recognizing future function heads :( ). Does this seem plausible?
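A recognizer for the proposed body pattern might look like the sketch below. It is only an approximation: `hasPlaceholderBody` is a hypothetical name, the braces in the pattern are escaped for portability, and the body is assumed to start at the first `{`, which ignores the function-head parsing the proposal also asks for.

```javascript
// The body pattern from the proposal above, with braces escaped.
const PLACEHOLDER_BODY = /^\s*\{\s*\[[^{}\[\]]*\]\s*\}\s*$/;

// Hypothetical helper: does the printed function end in `{ [ ... ] }`?
function hasPlaceholderBody(fn) {
  const source = Function.prototype.toString.call(fn);
  // Assumption: the head contains no "{" (no destructuring or object defaults).
  const bodyStart = source.indexOf("{");
  if (bodyStart === -1) return false; // e.g. a concise-body arrow function
  return PLACEHOLDER_BODY.test(source.slice(bodyStart));
}

console.log(hasPlaceholderBody(Math.max));                     // a built-in function
console.log(hasPlaceholderBody(Object.bind(Object)));          // a bound function
console.log(hasPlaceholderBody(function () { return 1; }));    // an ordinary function
```

Note that the pattern as written also matches a body such as `{ [x] }`, which parses today, so a concrete proposal would presumably need to constrain the bracketed text further.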
Anyone care to write an initial concrete proposal?
I've published an initial proposal here: michaelficarra/Function-prototype-toString-revision
Feel free to add comments by opening a Github issue or pull request. I will add this proposal to the agenda for this week's meeting.
Michael Ficarra