Strict undefined this
Mark S. Miller wrote:
Therefore, I'd prefer either #b or #c to #a. Between #b and #c I don't have a strong preference, but I think I prefer #b (it's safer) to #c (it more clearly preserves Tennent Correspondence and is generally easier to reason about). But I'd be happy with either.
This question needs to be settled for the ES3.1 spec, and so has a greater urgency than many of our other recent discussions.
I share your discomfort with #a. I prefer #c because it gives the developer the option to do stuff like
if (this === undefined) {
    throw "Did you forget to use the new operator when calling F?";
}
That would allow a library developer to provide more useful diagnostic messages. That is a low-value advantage, but I think it does break the tie. I don't see #b being any safer.
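A minimal sketch of that diagnostic, assuming rule #c semantics (a strict function called as a plain function sees 'this' as undefined). The constructor name Point and the choice of TypeError are illustrative, not from the thread:

```javascript
function Point(x, y) {
  "use strict"; // function-level directive, so the sketch does not depend on file-level mode
  if (this === undefined) {
    // Under rule #c this branch is reachable; under #a or #b the call would
    // already have thrown before getting here.
    throw new TypeError("Did you forget to use the new operator when calling Point?");
  }
  this.x = x;
  this.y = y;
}

var message = null;
try {
  Point(3, 5); // called as a function, without new
} catch (e) {
  message = e.message;
}
var p = new Point(3, 5); // called correctly
```

Here message ends up holding the library's own diagnostic text rather than a generic engine error, which is the low-value advantage described above.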
Douglas Crockford wrote:
Mark S. Miller wrote:
Therefore, I'd prefer either #b or #c to #a. Between #b and #c I don't have a strong preference, but I think I prefer #b (it's safer) to #c (it more clearly preserves Tennent Correspondence and is generally easier to reason about). But I'd be happy with either.
This question needs to be settled for the ES3.1 spec, and so has a greater urgency than many of our other recent discussions.
I share your discomfort with #a. I prefer #c because it gives the developer the option to do stuff like
if (this === undefined) {
    throw "Did you forget to use the new operator when calling F?";
}
Coming at this from a non-theoretical perspective, I'd say that choice c is the one that makes intuitive sense, and the one that is easy to explain (say, if one were writing a book about this language). If safety and theoretical concerns don't dictate a choice, I'd argue that common sense suggests c.
David Flanagan
#c wins by maximizing TC purity, utility, and safety.
On Fri, Aug 29, 2008 at 22:51, Brendan Eich <brendan at mozilla.org> wrote:
#c wins by maximizing TC purity, utility, and safety.
/be
d) Lexical scoping
I thought one of the points with changing this in strict mode was to make it use lexical scoping where it "makes sense"? I do agree that preventing access to global is a noble goal and I hope we can still achieve that. My suggestion is to use lexical scoping and special case this as undefined if lexical scoping would make this the global object.
"use strict";
var o1 = {
    f: function() {
        function g() { print(this); }
        g();
    },
    toString: function() { return 'o1'; }
};
o1.f(); // prints o1
function f() { print(this); }
f(); // prints undefined since this would be lexically resolved to [[Global]]
print(this); // prints [[Global]], since direct access.
var o2 = { toString: function() { return 'o2'; } };
o2.f = f;
o2.f(); // prints o2 since this is bound by property lookup
It might be a bit tricky to spec when 'this' as [[Global]] needs to be changed to undefined.
If we are changing 'this', we need to realize that 'this' in local functions is a big issue for people learning JS, and even experienced programmers make that mistake once in a while. Changing 'this' without fixing the most important issue people have with it would be a mistake.
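To make the local-function pitfall concrete, here is a sketch under rule #c semantics as discussed in this thread, not under proposal d. The method names broken and renamed are made up for illustration; the second method shows the usual renaming workaround.

```javascript
var o1 = {
  broken: function () {
    "use strict";
    function g() { return this; } // g is called as a function, so 'this' is not o1
    return g();
  },
  renamed: function () {
    "use strict";
    var that = this;              // capture the method's 'this' in a variable
    function g() { return that; } // ordinary lexical lookup finds 'that'
    return g();
  }
};

var fromBroken = o1.broken();   // undefined under rule #c
var fromRenamed = o1.renamed(); // o1
```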
On Sat, Aug 30, 2008 at 2:02 PM, Erik Arvidsson <erik.arvidsson at gmail.com> wrote:
On Fri, Aug 29, 2008 at 22:51, Brendan Eich <brendan at mozilla.org> wrote:
#c wins by maximizing TC purity, utility, and safety.
/be
d) Lexical scoping
I thought one of the points with changing this in strict mode was to make it use lexical scoping where it "makes sense"?
I agree with all the goals you state. There was an earlier proposal, made by Crock and agreed to before March by the overall committee (both the 3.1 and 4 factions). It was even somewhat orthogonal to strict-mode, and would have made the non-strict language more intuitive as well. However, at the March meeting I presented something like the following example which demonstrates a fatal ambiguity:
function PointMaker() {
    this.count = 0;
    var that = this;
    function Point(x, y) {
        this.x = x;
        this.y = y;
        that.count++;
    }
    Point.prototype.getX = function() { return this.x; };
    Point.prototype.setX = function(x) { this.x = x; };
    this.Point = Point;
}
// Each PointMaker is a factory for new points that counts how many points it made:
var pointMaker = new PointMaker();
foo(new (pointMaker.Point)(3,5));
// How many has it made so far?
print(pointMaker.count); // 1
If foo() only uses the point it's been given in the expected ways, all is cool. But what if foo() says:
function foo(pt) { var setX = pt.setX; setX(44); }
or equivalently
function foo(pt) { (true && pt.setX)(44); }
Under Crock's proposed rule, this will bring about the effect of
pointMaker.x = 44;
even though one would have thought foo() had no access path whatsoever to pointMaker. In other words, a lexical-this-defaulting rule as Crock proposed, and as I gather you seem to be proposing, would cause the 'this' in setX to be bound to the lexically enclosing 'this' when setX is called as a function. This violates its author's intent in the same way that the non-strict global-this-capture does. The only difference is that inappropriate access is being provided to pointMaker rather than the global object.
If a lexical this-capture rule can be made to work without these kinds of problems, it would indeed be better than having to explain ".bind(this)" to ES programmers. Suggestions?
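For comparison, a sketch of how the same example plays out under rule #c, assuming an engine that leaves 'this' uncoerced in strict functions. The final lines emulate the lexical-defaulting rule by hand with call(); that emulation is my stand-in, since no engine implements the proposed rule.

```javascript
function PointMaker() {
  "use strict";
  this.count = 0;
  var that = this;
  function Point(x, y) {
    this.x = x;
    this.y = y;
    that.count++;
  }
  Point.prototype.getX = function () { return this.x; };
  Point.prototype.setX = function (x) { this.x = x; };
  this.Point = Point;
}

function foo(pt) { var setX = pt.setX; setX(44); }

var pointMaker = new PointMaker();
var pt = new pointMaker.Point(3, 5);

var outcome;
try {
  foo(pt);          // setX runs with 'this' undefined
  outcome = "no error";
} catch (e) {
  outcome = e.name; // assigning to a property of undefined throws
}
var leakedUnderC = "x" in pointMaker; // false: foo never reached pointMaker

// Hand emulation of the lexical-defaulting rule: 'this' falls back to the
// enclosing PointMaker instance instead of staying undefined.
pt.setX.call(pointMaker, 44);
var leakedUnderEmulation = pointMaker.x; // 44
```

Under rule #c the stray call fails loudly and pointMaker stays untouched; the emulated rule silently writes x onto the factory.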
Given strict function F that internally uses "this" as an rvalue, what should happen when F is called as a function? Concretely, given
"use strict";
function F(t) {
    if (t) { return this; } else { return 0; }
}
what should be the outcome of F(true) and F(false)? Previously, we have examined at least the following three possibilities and agreed on the first:
a) F(true) throws. F(false) returns 0.
The rule is that, in an execution context in which 'this' is bound to null or undefined, evaluating 'this' as an rvalue in strict code throws. (In non-strict code it returns the global object.)
b) F(anything) always throws before executing any code in its body.
The rule would be: if F is a strict function which mentions 'this' freely, then an attempt to call it with 'this' bound to null or undefined throws rather than entering F.
c) F(true) returns undefined. F(false) returns 0.
The rule would be that no coercion or special case happens at all. Whatever value is associated with 'this' in the present execution context is returned.
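As a sketch of rule #c in runnable form, assuming an engine whose strict mode passes 'this' through uncoerced (the directive is placed inside F only to keep the snippet self-contained, and the receiver object is made up for the check):

```javascript
function F(t) {
  "use strict";
  if (t) { return this; } else { return 0; }
}

var asFunctionTrue = F(true);          // undefined: no global object by magic
var asFunctionFalse = F(false);        // 0
var explicitNull = F.call(null, true); // null: passed through, not coerced
var receiver = { F: F };
var asMethod = receiver.F(true);       // receiver: method calls are unaffected
```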
With regard to the original safety concern, any of these three are fine solutions. They all prevent the privilege escalation attack of obtaining access to the global object by magic. However, I've just noticed that the first solution adds yet another break with Tennent Correspondence. The other two bend Tennent Correspondence, but don't break it:
"use strict";
function test(cond, then, els) {
    if (cond) { return then(); } else { return els(); }
}
function F2(t) {
    return test(t,
                (function() { return this; }).bind(this),
                function() { return 0; });
}
The bending is that, instead of forming a closure around an expression by surrounding it simply with "function() {return ...;}", if the original expression mentions "this" freely, you'd have to add a ".bind(this)" on the end to get a proper lambda abstraction of that expression. Other variations:
Do renaming rather than binding:
function F3(t) {
    const that = this;
    return test(t,
                function() { return that; },
                function() { return 0; });
}
Always bind unconditionally, whether or not the expression in question mentions "this":
function F4(t) {
    return test(t,
                (function() { return this; }).bind(this),
                (function() { return 0; }).bind(this));
}
Under which rule are which of these closure-based rewrites equivalent to the original F?
Under rule #a, none are equivalent to F because F(false) doesn't throw but F2(false), F3(false), and F4(false) would all throw.
Under rule #b, all mention 'this' freely, and so F*(*) would throw on entry. However, the always-bind-unconditionally variation would still break Tennent Correspondence under rule #b, since it will introduce a this-breakage into a function G that never mentions 'this'.
Under rule #c, F*(true) would always yield undefined. Any of these variations would satisfy Tennent Correspondence.
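A sketch that checks the rule #c claim mechanically, assuming an engine that implements #c for strict functions. F, F2, F3, and F4 are the variants given earlier in this message; the variants array, the receiver object, and the agree flag are added only for the check.

```javascript
function test(cond, then, els) {
  if (cond) { return then(); } else { return els(); }
}
function F(t) {
  "use strict";
  if (t) { return this; } else { return 0; }
}
function F2(t) {
  "use strict";
  return test(t, (function () { return this; }).bind(this),
                 function () { return 0; });
}
function F3(t) {
  "use strict";
  const that = this;
  return test(t, function () { return that; },
                 function () { return 0; });
}
function F4(t) {
  "use strict";
  return test(t, (function () { return this; }).bind(this),
                 (function () { return 0; }).bind(this));
}

var variants = [F, F2, F3, F4];
var receiver = {};
// Each rewrite must agree with F when called as a function and as a method.
var agree = variants.every(function (f) {
  return f(true) === undefined &&
         f(false) === 0 &&
         f.call(receiver, true) === receiver;
});
```

With #c semantics agree comes out true, which matches the claim that every one of the closure-based rewrites is interchangeable with the original F.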
Therefore, I'd prefer either #b or #c to #a. Between #b and #c I don't have a strong preference, but I think I prefer #b (it's safer) to #c (it more clearly preserves Tennent Correspondence and is generally easier to reason about). But I'd be happy with either.
This question needs to be settled for the ES3.1 spec, and so has a greater urgency than many of our other recent discussions.