Function declarations in statements
Here is something I wrote some time ago on this topic: developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Functions#Function_constructor_vs._function_declaration_vs._function_expression
However, it's pretty specific to SpiderMonkey and I didn't fully test out other ES3 implementations. It is indeed a mess. I didn't even cover variable bindings like you have.
Unfortunately, I don't think we can force function declarations to be "initialized run-time" since it's a backwards incompatible change.
-Yuh-Ruey Chen
On Mar 16, 2007, at 7:07 PM, Yuh-Ruey Chen wrote:
Unfortunately, I don't think we can force function declarations to be "initialized run-time" since it's a backwards incompatible change.
Only with certain implementations of function statements -- namely
IE's and Opera's (which intentionally cloned IE).
Since this is an ES3 extension, allowed by chapter 16, we could
codify the majority-share practice, except that it sucks. We have so
far avoided specifying function statements, preferring to leave them
to implementations to experiment with, on into the ES4 future.
On 17/03/07, Yuh-Ruey Chen <maian330 at gmail.com> wrote:
Here is something I wrote some time ago on this topic: developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Functions#Function_constructor_vs._function_declaration_vs._function_expression
However, it's pretty specific to SpiderMonkey and I didn't fully test out other ES3 implementations. It is indeed a mess. I didn't even cover variable bindings like you have.
I think the current situation can be summed up like this:
- JScript compile time initialises FunDecl, FunExpr and FunDeclInSmt, binding the identifier in all cases in the surrounding scope (spec violation).
- Opera linear_b compile time initialises FunDecl, FunExpr and FunDeclInSmt, binding the identifier of FunExpr in the contained scope.
- SpiderMonkey compile time initialises FunDecl, FunExpr. FunDeclInSmt is run time initialised. The identifier in FunExpr is bound in the contained scope. (FunDeclInSmt is still bound in surrounding scope...)
- JavaScriptCore compile time initialises FunDecl, FunExpr. FunDeclInSmt varies between compile time, run time and parse error depending on which statement type, exact syntax used and JavaScriptCore version. The identifier in FunExpr has been bound in the contained scope since this syntax was added in Safari 1.3; before that it caused a parse error.
More exactly, testing in Safari 1.3 because that's what I have on this comp:
Run time initialised:
- with-smt
- if..else-smt
- while-smt
- for-smt
- for..in-smt
Those above UNDER ONE CONDITION: the FunDecl is nested in a block-smt.
- try..catch
Compile time initialised:
- labelled..block..break-smt
Parse error:
- with-smt
- if..else-smt
- while-smt
- for-smt
- for..in-smt
Those above if FunDecl IS NOT nested in a block-smt.
Unfortunately, I don't think we can force function declarations to be "initialized run-time" since it's a backwards incompatible change.
In what way is it backwards incompatible?
It's a syntax error in ES3. Function declarations are only allowed in source elements context. Function expressions are not allowed in expression statements (indeed, they would be useless there, since the only way you can get a reference to a function expression is by its return value. The name is bound in local scope only.). Thus we're speaking of defining behaviour for an area that was not covered in any way by ES3. All you will be doing is adding a missing feature; you won't be breaking any ES3 code. And I think if you spider the web, there will be very few, or no, scripts relying on either the SpiderMonkey or the JScript behaviour.
Further, Microsoft will have to split up their currently unified handling of all types of functions in JScript if they want to fully comply with ES3 in this. Changing the FunDeclInSmt handling would be easy to do at the same time.
You're right, expression statements that start with "function" are illegal in ES3. But as you've pointed out, some popular ES3 implementations allow function declarations in these contexts and compile-time initialize them. The most significant problem with making these run-time initialized is the usage-before-declaration feature. I've seen a few cases where a function is called or referenced before it's declared in an expression statement (tho I can't provide any real world examples - I'm not working in web development anymore). Definitely bad practice, but bad practice exists.
I would actually prefer such functions to be run-time initialized, but I fear there are some existing programs where it can produce runtime errors in an ES4 context. That by itself is bad, but even worse, some programs won't even cause exceptions to be thrown.
BTW, function expressions are useful in expression statements - just consider the pseudo-block-scope idiom: (function(){...}()).
-Yuh-Ruey Chen
On 17/03/07, Yuh-Ruey Chen <maian330 at gmail.com> wrote:
You're right, expression statements that start with "function" are illegal in ES3. But as you've pointed out, some popular ES3 implementations allow function declarations in these contexts and compile-time initialize them. The most significant problem with making these run-time initialized is the usage-before-declaration feature. I've seen a few cases where a function is called or referenced before it's declared in an expression statement (tho I can't provide any real world examples - I'm not working in web development anymore). Definitely bad practice, but bad practice exists.
Well, bad practice that doesn't work in Mozilla or Safari. The code I've seen that assumes declaration before use in statements is all of the other kind - it assumes that if a function with the same name is declared in both if and else paths of a conditional, the version will be selected according to the path taken. But as I said, if you spider for it you will find very few examples of code relying on either behaviour. For some types of statements, none at all. (Labelled statements being one example which has an already very low usage, before you count this behaviour in.)
I would actually prefer such functions to be run-time initialized, but I fear there are some existing programs where it can produce runtime errors in a ES4 context. That by itself is bad, but even worse, some programs won't even cause exceptions to be thrown.
And all those programs will not work in Mozilla or Safari.
BTW, function expressions are useful in expression statements - just consider the pseudo-block-scope idiom: (function(){...}()).
Which is an expression statement being a parenthesised expression containing a call expression containing a function expression. The call expression uses the return from a function expression. An expression statement being a function expression, on the other hand, is totally useless because no reference to the function object will survive to be called or otherwise used. But then that is not allowed in the syntax, either.
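To make the distinction concrete, here is a minimal sketch of the pseudo-block-scope idiom, assuming any standards-compliant engine. The names nextId, last, first, second and leaked are illustrative only and do not come from the thread. The statement is useful only because the call expression hands back a value that survives.

```javascript
// The parenthesised call expression runs the function expression once;
// only the returned closure survives, not the function expression itself.
var nextId = (function () {
    var last = 0;              // visible only inside this pseudo-block
    return function () {
        last += 1;
        return last;
    };
}());

var first = nextId();
var second = nextId();
var leaked = typeof last;      // `last` never reaches the surrounding scope

console.log(first, second, leaked);
```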
I've never seen web content that uses a named function expression and
expects the name to be bound in the variable object of the execution
context. I don't recall any such bug ever being filed with
bugzilla.mozilla.org. We always implemented ES3, which says that the
name binds in an Object instance created to scope the function object
in its context's scope chain, so that the function can refer to
itself. Sorry to belabor the obvious for those of you who know ES3.
The IE JScript bug should be fixed. It's just a deviation from the
standard, as far as I can tell.
On 17/03/07, Brendan Eich <brendan at mozilla.org> wrote:
I've never seen web content that uses a named function expression and expects the name to be bound in the variable object of the execution context.
Me neither, and I'm a member of several JavaScript forums and mailing lists.
I don't recall any such bug ever being filed with bugzilla.mozilla.org. We always implemented ES3, which says that the name binds in an Object instance created to scope the function object in its context's scope chain, so that the function can refer to itself. Sorry to belabor the obvious for those of you who know ES3.
Yes. Microsoft seems to have missed that particular thing when reverse engineering JavaScript.
The IE JScript bug should be fixed. It's just a deviation from the standard, as far as I can tell.
A JScript bug I've been bitten by after writing a painfully long script looking mostly like this;
var foo={
    bar:function f(){ /* bar code */ setTimeout(f,200); },
    baz:function f(){ /* baz code */ setTimeout(f,200); }
};
Easy to fix (s/(f,/(arguments.callee,/), but irritating. Never seen anyone bitten by it in the reverse direction, though.
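A cut-down sketch of what ES3 specifies for this pattern, assuming a compliant engine: each function expression sees its own f, and nothing named f reaches the surrounding scope. The setTimeout calls are replaced by a plain return so the result can be inspected; barSeesItself, bazSeesItself and nameLeaked are made-up names.

```javascript
var foo = {
    bar: function f() { return f; },   // this f refers to bar's own function object
    baz: function f() { return f; }    // this f refers to baz's own function object
};

var barSeesItself = foo.bar() === foo.bar;
var bazSeesItself = foo.baz() === foo.baz;
var nameLeaked = typeof f !== "undefined";  // would be true if f were bound outside

console.log(barSeesItself, bazSeesItself, nameLeaked);
```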
What I'm arguing here is that these compile time instantiations (in if..else and with in particular) defy programmer expectations. They can lead to outright bugs in programs, because they silently change the meaning of the program from what the developer expects. NOBODY expects a function declared in the else-path to override the function declared in an if-path for a function call that may even be happening inside the same if-path. A syntax error would even be preferable to that behaviour, but run time instantiation would of course match programmer expectations best. So, it would be nice if ES4 actually addressed this.
On 17/03/07, Brendan Eich <brendan at mozilla.org> wrote:
The IE JScript bug should be fixed. It's just a deviation from the standard, as far as I can tell.
But then should SpiderMonkey also be fixed? Consider the following shell session:
js> var x = 10;
js> function f() { print(x); if (false) var x; }
js> f();
undefined
js> function g() { print(f); if (false) function f() {}; }
js> g()
function f() { print(x); if (false) { var x; } }
That is, in the first f() "if (false) var x;" defines x for the whole function yet in g() the function statement does not bind the local name until executed.
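The var half of this comparison is standard behaviour and can be sketched as follows, assuming a compliant engine. The value is returned instead of printed, and seen and hoistedResult are illustrative names. The function-statement half is left out because its result is exactly what differs between implementations.

```javascript
var x = 10;

function f() {
    var seen = x;      // the local x is already declared here, so this reads undefined
    if (false) {
        var x;         // never executed, yet it declares x for the whole of f
    }
    return seen;
}

var hoistedResult = f();
console.log(hoistedResult);
```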
For me JScript behavior is in fact more consistent. There both var and function statements are executed on function entrance no matter where they are placed. Yet in SpiderMonkey only var has this property, while function has an extra weirdness:
js> var x = 10;
js> function f() { print(x); function x() {} }
js> f()
function x() { }
js> function g() { print(x); if (true) function x() {} }
js> g()
10
Now try to explain to a new guy the logic behind this.
, Igor
On 17/03/07, Igor Bukanov <igor at mir2.org> wrote:
For me JScript behavior is in fact more consistent. There both var and function statement is executed on the function entrance no matter where they are placed.
This also suggests a more reasonable behavior than the current mess:
Make a function statement define its name as a local variable for the whole function/script it exists in, in the same way as the var statement does, but execute the declarations at the entrance of the block they are defined in.
var x = 20;
function f() {
    print(x);
    // Should print undefined, as the function statement in the block below
    // defines x as a local variable with undefined as the initial value
    {
        print(x);
        // Should print the definition of function x, as it is executed on
        // block entrance
        function x() { }
    }
}
With this rule the function statements inside blocks are no longer special compared with top-scope definitions and consistent with the var behavior. It should also be compatible with SpiderMonkey behavior in reasonable programs.
, Igor
On 17/03/07, Brendan Eich <brendan at mozilla.org> wrote:
The IE JScript bug should be fixed. It's just a deviation from the standard, as far as I can tell.
On 17/03/07, Igor Bukanov <igor at mir2.org> wrote:
But then should SpiderMonkey also be fixed? Consider the following shell session: [snip]
That is, in the first f() "if (false) var x;" defines x for the whole function yet in g() the function statement does not bind the local name until executed.
For me JScript behavior is in fact more consistent. There both var and function statement is executed on the function entrance no matter where they are placed. Yet in SpiderMonkey only var has such property while function having an extra weirdness: [snip] Now try to explain to a new guy the logic behind this.
Not much logic behind that, actually. It's because var declarations are statements and function declarations are source elements. So, var declarations are specified to be allowed in flow control constructs, while function declarations aren't, however inconsistent that may be.
The actual bug in JScript that we were talking about, though, is that JScript initialises FunExpr in the surrounding instead of the contained scope:
var f=function fn(){return fn}, fn=null;
print(f()); // => null
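For contrast, a sketch of the same snippet as ES3 specifies it, assuming a compliant engine: inside the function, fn resolves to the function's own name binding, which shadows the outer variable that was set to null. The names sawItself and outerValue are illustrative additions.

```javascript
var f = function fn() { return fn; }, fn = null;

var sawItself = f() === f;   // the inner fn is the function itself, not the outer null
var outerValue = fn;         // the outer variable is a separate binding

console.log(sawItself, outerValue);
```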
On 17/03/07, Igor Bukanov <igor at mir2.org> wrote:
On 17/03/07, Igor Bukanov <igor at mir2.org> wrote:
For me JScript behavior is in fact more consistent. There both var and function statement is executed on the function entrance no matter where they are placed.
This also suggest a more reasonable behavior than the current mess:
Make a function statement to define its name as a local variable for the whole function/script it exists in in the same way as the var statement does but execute their declarations at the entrance of the block they are defined in.
[snip]
With this rule the function statements inside blocks are no longer special compared with top-scope definitions and consistent with the var behavior. It should also be compatible with SpiderMonkey behavior in reasonable programs.
In fact, that is exactly what I mean when I talk about run time instantiation. I never asked for you to specify that declaration takes place at run time, in fact that would be inconsistent with the other declarations, only that you delay instantiation (or perhaps it's simply assignment in this case) of the function object to the variable.
On Mar 17, 2007, at 4:41 AM, Igor Bukanov wrote:
js> var x = 10;
js> function f() { print(x); if (false) var x; }
js> f();
undefined
js> function g() { print(f); if (false) function f() {}; }
js> g()
function f() { print(x); if (false) { var x; } }
That is, in the first f() "if (false) var x;" defines x for the whole function yet in g() the function statement does not bind the local name until executed.
That's right -- var and function are not equivalent. You cannot
rewrite function f() {} as var f = function (){}.
Independent of this long-standing fact, which is detectable even in
ES1-3, never mind function statements or other such extensions: the
idea with function statements in SpiderMonkey (and Rhino, IIRC) is
precisely that they allow conditional binding of DontDelete DontEnum
names to function objects, depending on control flow.
On 17/03/07, Brendan Eich <brendan at mozilla.org> wrote:
On Mar 17, 2007, at 4:41 AM, Igor Bukanov wrote: That's right -- var and function are not equivalent. You cannot rewrite function f() {} as var f = function (){}.
Hence the proposal: make function f() {} equivalent to var f = function() {}; f.name = "f" placed at the beginning of the block. This still allows conditional function definitions while removing the discrepancy with vars.
, Igor
On Mar 17, 2007, at 11:46 AM, Igor Bukanov wrote:
On 17/03/07, Brendan Eich <brendan at mozilla.org> wrote:
On Mar 17, 2007, at 4:41 AM, Igor Bukanov wrote: That's right -- var and function are not equivalent. You cannot rewrite function f() {} as var f = function (){}.
Hence the proposal: make function f() {} to be equivalent to var f = function() {}; f.name = "f" placed at the beginning of the block. This still allows a conditional function definitions while removing the discrepancy with vars.
That's an incompatible change that breaks forward calls:
function odd(n) { return n == 0 ? false : even(n-1); }
alert(odd(4));
function even(n) { return n == 0 ? true : odd(n-1); }
Translating to var form, ignoring intrinsic name, results in:
var odd = function (n) { return n == 0 ? false : even(n-1); }
alert(odd(4));
var even = function (n) { return n == 0 ? true : odd(n-1); }
which at run-time results in an error of the "even is not a function"
kind.
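The two forms can be compared side by side in a small sketch, assuming a compliant engine. Each variant is wrapped in its own function so they do not interfere, and the error is caught so that its type can be inspected. The names withDeclarations, withVarsInPlace, outcome, declResult and varResult are made up for illustration.

```javascript
// Function declarations: odd and even are both bound before any statement runs.
function withDeclarations() {
    var result = odd(4);
    function odd(n) { return n == 0 ? false : even(n - 1); }
    function even(n) { return n == 0 ? true : odd(n - 1); }
    return result;
}

// In-place var translation: even is declared but still undefined at the call.
function withVarsInPlace() {
    var odd = function (n) { return n == 0 ? false : even(n - 1); };
    var outcome;
    try {
        outcome = odd(4);
    } catch (e) {
        outcome = e.name;   // name of the error raised by calling undefined
    }
    var even = function (n) { return n == 0 ? true : odd(n - 1); };
    return outcome;
}

var declResult = withDeclarations();
var varResult = withVarsInPlace();
console.log(declResult, varResult);
```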
On 17/03/07, Brendan Eich <brendan at mozilla.org> wrote:
That's an incompatible change that breaks forward calls:
function odd(n) { return n == 0 ? false : even(n-1); }
alert(odd(4));
function even(n) { return n == 0 ? true : odd(n-1); }
Translating to var form, ignoring intrinsic name, results in:
var odd = function (n) { return n == 0 ? false : even(n-1); }
alert(odd(4));
var even = function (n) { return n == 0 ? true : odd(n-1); }
No, the translated code is:
var odd = function (n) { return n == 0 ? false : even(n-1); }
var even = function (n) { return n == 0 ? true : odd(n-1); }
alert(odd(4));
since the idea is for "function f() {}" at any place in the block to mean "var f = function f() {}" at the beginning of the block. For top-level function statements this is exactly what ES3 requires, but it also makes for consistent rules for functions declared inside blocks.
, Igor
On Mar 17, 2007, at 8:11 PM, Igor Bukanov wrote:
No, the translated code is:
var odd = function (n) { return n == 0 ? false : even(n-1); }
var even = function (n) { return n == 0 ? true : odd(n-1); }
alert(odd(4));
since the idea for "function f() {}" at any place in the block to mean "var f = function f() {}" at the beginning of the block. For top-level function statements this is exactly what is ES3 requires,
It's still different, according to ECMA-262 10.1.3, second bulleted
item vs. third -- note how var does not create a new property if one
already exists:
• For each FunctionDeclaration in the code, in source text order,
create a property of the variable object whose
name is the Identifier in the FunctionDeclaration, whose value is the
result returned by creating a Function object
as described in section 13, and whose attributes are determined by
the type of code. If the variable object
already has a property with this name, replace its value and
attributes. Semantically, this step must follow the
creation of FormalParameterList properties.
• For each VariableDeclaration or VariableDeclarationNoIn in the
code, create a property of the variable object
whose name is the Identifier in the VariableDeclaration or
VariableDeclarationNoIn, whose value is undefined
and whose attributes are determined by the type of code. If there is
already a property of the variable object with
the name of a declared variable, the value of the property and its
attributes are not changed. Semantically, this
step must follow the creation of the FormalParameterList and
FunctionDeclaration properties. In particular, if a
declared variable has the same name as a declared function or formal
parameter, the variable declaration does
not disturb the existing property.
Backward compatibility is required here, since script on the web
declares functions with names that replace pre-defined properties
that may not have the right default attributes (in particular, may be
ReadOnly and DontDelete).
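The difference between the two bullets can be observed from script. This is a minimal sketch assuming a compliant engine and using only ordinary properties, not the ReadOnly or DontDelete host properties mentioned above; declarationWins, kindAtEntry and summary are illustrative names.

```javascript
function declarationWins() {
    var kindAtEntry = typeof f;            // f is already a function on entry
    var f;                                 // var: does not disturb the existing property
    function f() { return "first"; }
    function f() { return "second"; }      // later FunctionDeclaration replaces the value
    return [kindAtEntry, f()];
}

var summary = declarationWins();
console.log(summary);
```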
but it also makes a consistent rules for functions declared inside blocks.
This part, I agree is an improvement on the function statement rule
in SpiderMonkey and Rhino. ES4 already has |let function f(...)...|
for lexically scoped block-local function binding (hoisted to top of
block). It sounds like your proposal is the same as let function
provided the function statement is enclosed in a block. With let, a
binding such as
if (cond) let x = 42;
binds x in the enclosing block, not in an implicit block around the
if's then clause.
On 18/03/07, Brendan Eich <brendan at mozilla.org> wrote:
Backward compatibility is required here, since script on the web declares functions with names that replace pre-defined properties that may not have the right default attributes (in particular, may be ReadOnly and DontDelete).
The way I see this working, function statements would create a property on the variable object of the surrounding scope, initialise to undefined just like var, and carry on. When execution reaches the block in which the function statement was declared, the function object is assigned to that variable and the appropriate attributes are set.
This part, I agree is an improvement on the function statement rule in SpiderMonkey and Rhino. ES4 already has |let function f(...)...| for lexically scoped block-local function binding (hoisted to top of block). It sounds like your proposal is the same as let function provided the function statement is enclosed in a block. With let, a binding such as [snip] binds x in the enclosing block, not in an implicit block around the if's then clause.
The proposal still binds in the surrounding function scope, not in block scope. And it creates the variable when entering that scope. It just delays the actual assignment of the function object to that variable till control flow has reached the block containing the function statement.
Though I think the same should be happening to all statements that can contain other statements, not just block statements. For example,
function fn(){return'before if-statement'}
if(true)
function fn(){return'then-clause'}
In this case, I expect the 'before if-statement' function to be handled exactly per the ES3 rules. When execution reaches the if-statement, that variable is assigned the 'then-clause' function object. In the case that the nested statement is a block statement, like this:
function fn(){return'before if-statement'}
if(true){
function fn(){return'first'}
print(fn())
function fn(){return'second'}
}
The function objects are always assigned to the variable before anything else in the block statement, so the text displayed will be "second".
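One way to picture the proposed rule is to write the delayed assignments out by hand. The sketch below is only a desugaring under that assumption, not the behaviour of any existing implementation; print is replaced by collecting values, and proposedSemantics, seen and trace are hypothetical names.

```javascript
function proposedSemantics() {
    var seen = [];
    function fn() { return 'before if-statement'; }  // ordinary ES3 declaration
    seen.push(fn());
    if (true) {
        // Assumed rule: every function statement in the block is assigned
        // on entry to the block, in source order, before anything else runs.
        fn = function () { return 'first'; };
        fn = function () { return 'second'; };
        seen.push(fn());
    }
    return seen;
}

var trace = proposedSemantics();
console.log(trace);
```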
On 18/03/07, Brendan Eich <brendan at mozilla.org> wrote:
It's still different, according to ECMA-262 10.1.3, second bulleted item vs. third -- note how var does not create a new property if one already exists:
OK, let's then be 100% compatible with ECMA-262 while still making a uniform rule for functions declared in any statement. That is, the suggestion for any such function declaration is to always define a new variable property whose initial value is undefined at the beginning of script or function execution, and to initialize it with the function object at the beginning of the execution of the statement that contains the function declaration.
For the examples see the previous post from David that nicely summarizes the proposal.
Backward compatibility is required here, since script on the web declares functions with names that replace pre-defined properties that may not have the right default attributes (in particular, may be ReadOnly and DontDelete).
Then SpiderMonkey violates the standard here as declaring in a web page
function document() { }
leads to "Error: redeclaration of const document" while MSIE does replace otherwise ReadOnly/DontDelete document by the function definition.
, Igor
According to ES3, function declarations are not legal in statements and should be syntax errors. However, all browser hosted implementations allow function declarations in statements, but do it differently.
JScript compile time initialises the function in the surrounding scope (likewise it compile time initialises function expressions, binding the identifier in the surrounding scope instead of the contained scope). Opera's linear_b compile time initialises. SpiderMonkey run time initialises when control flow reaches the function declaration. JavaScriptCore compile time initialises some cases and run time initialises others.
Now, I don't think any JavaScript developer expects a function declaration within a loop, a conditional, a catch statement or a with statement to be usable before control flow reaches that point. Especially I don't think they expect that if they declare a function in an if-statement and again in the else-clause, the else clause will override the if clause even if the else clause can under no circumstances be the taken path.
All in all, I think expectations in these cases are that
function functionname(){}
is exactly equal to
var functionname=function functionname(){};
I feel there is no real reason to compile time initialise these - it doesn't make sense control flow wise, it defies programmer expectations, and there is no apparent consensus among implementations.
Is this something ES4 addresses? If not, it would be nice for programmers if you did, to increase implementation interoperability and make reality match programmer expectations.
To copy my example cases from my Opera bug report on the issue :
A few tests covering all examples I could think of straight away:
var o=false;
with({o:true}){
    function f(){ return o; }
}
alert('FunDecl in with-smt is run time instantiated? '+f()+' (wanted answer: true)');
var usebeforedeclare=typeof fn==='function';
if(true){
    function fn(){ return true; }
}else{
    function fn(){ return false; }
}
alert('FunDecl in if-smt is compile time instantiated: '+usebeforedeclare+' (wanted answer: false)\nThe correct path is taken: '+fn()+' (wanted answer: true)');
bogus:{
    break bogus;
    function fnc(){}
}
alert('FunDecl in labelled statement is compile time instantiated: '+(typeof fnc==='function')+' (wanted answer: false)');
while(false) function func(){}
alert('FunDecl in while-smt is compile time instantiated: '+(typeof func==='function')+' (wanted answer: false)');
for(;false;) function funct(){}
alert('FunDecl in for-smt is compile time instantiated: '+(typeof funct==='function')+' (wanted answer: false)');
for(var t in {}) function functi(){}
alert('FunDecl in for..in-smt is compile time instantiated: '+(typeof functi==='function')+' (wanted answer: false)');
try{}catch(e){
    function functio(){}
}
alert('FunDecl in try..catch-smt is compile time instantiated: '+(typeof functio==='function')+' (wanted answer: false)');
I have no idea of the implications of this for typed code (haven't read the current draft that thoroughly yet), but I still feel there is no logic to compile time initialising at least the loop, conditional and with-statement cases. The catch statement and labelled+break statement cases are less of an issue, though it would again make more sense with regard to programmer expectations to follow control flow.