Strict undefined this

# Mark S. Miller (17 years ago)

Given strict function F that internally uses "this" as an rvalue, what
should happen when F is called as a function? Concretely, given

"use strict";
function F(t) {
    if (t) {
        return this;
    } else {
        return 0;
    }
}

what should be the outcome of F(true) and F(false)? Previously, we
have examined at least the following three possibilities and agreed on
the first:

a) F(true) throws. F(false) returns 0.

The rule is that, in an execution context in which 'this' is bound to
null or undefined, evaluating 'this' as an rvalue in strict code
throws. (In non-strict code it returns the global object.)

b) F(anything) always throws before executing any code in its body.

The rule would be: if F is a strict function which mentions 'this'
freely, then an attempt to call it with 'this' bound to null or
undefined throws rather than entering F.

c) F(true) returns undefined. F(false) returns 0.

The rule would be that no coercion or special case happens at all.
Whatever value is associated with 'this' in the present execution
context is returned.
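
For anyone who wants to try the three options against a running engine, here is a minimal sketch. It assumes an engine that implements strict mode as ES5 later standardized it, which matches option (c). The function-level directive and the `describe` helper are additions for illustration only, not part of the proposal.

```javascript
// Same F as above, with the directive moved inside the function so the
// snippet behaves the same wherever it is pasted (an assumption of this sketch).
function F(t) {
  "use strict";
  if (t) { return this; } else { return 0; }
}

// Hypothetical helper: reports whether a call returned or threw.
function describe(thunk) {
  try {
    return "returned " + String(thunk());
  } catch (e) {
    return "threw " + e.name;
  }
}

var resultTrue = describe(function () { return F(true); });
var resultFalse = describe(function () { return F(false); });
console.log(resultTrue);
console.log(resultFalse);
```

Under option (a) the first call would be reported as a throw, and under option (b) both calls would be.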



With regard to the original safety concern, any of these three are
fine solutions. They all prevent the privilege escalation attack of
obtaining access to the global object by magic. However, I've just
noticed that the first solution adds yet another break with Tennent
Correspondence. The other two bend Tennent Correspondence, but don't
break it:

"use strict";
function test(cond,then,els) {
    if (cond) { return then(); } else { return els(); }
}
function F2(t) {
    return test(t,
        (function(){ return this;}).bind(this),
        function() { return 0; }
    );
}

The bending is that, instead of forming a closure around an expression
by surrounding it simply with "function() {return ...;}", if the
original expression mentions "this" freely, you'd have to add a
".bind(this)" on the end to get a proper lambda abstraction of that
expression. Other variations:

Do renaming rather than binding:

function F3(t) {
    const that = this;
    return test(t,
        function(){ return that;},
        function() { return 0; }
    );
}

Always bind unconditionally, whether or not the expression in question
mentions "this":

function F4(t) {
    return test(t,
        (function(){ return this;}).bind(this),
        (function() { return 0; }).bind(this)
    );
}

Under which rule are which of these closure-based rewrites equivalent
to the original F?

Under rule #a, none are equivalent to F because F(false) doesn't throw
but F2(false), F3(false), and F4(false) would all throw.

Under rule #b, all mention 'this' freely, and so F*(*) would throw on
entry. However, the always-bind-unconditionally variation would still
break Tennent Correspondence under rule #b, since it will introduce a
this-breakage into a function G that never mentions 'this'.

Under rule #c, F*(true) would always yield undefined. Any of these
variations would satisfy Tennent Correspondence.
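
The rule #c claim can be checked mechanically. The sketch below assumes an engine whose strict mode behaves like rule #c (as ES5 engines later did) and compares each rewrite with the original F, both when called as a plain function and when called with an explicit receiver. The `receiver` object and the `agree` flag are hypothetical names introduced only for this check.

```javascript
function test(cond, then, els) {
  "use strict";
  if (cond) { return then(); } else { return els(); }
}

function F(t) {
  "use strict";
  if (t) { return this; } else { return 0; }
}

// Bind only the closure that mentions 'this'.
function F2(t) {
  "use strict";
  return test(t,
    (function () { return this; }).bind(this),
    function () { return 0; });
}

// Rename 'this' instead of binding.
function F3(t) {
  "use strict";
  const that = this;
  return test(t,
    function () { return that; },
    function () { return 0; });
}

// Bind unconditionally.
function F4(t) {
  "use strict";
  return test(t,
    (function () { return this; }).bind(this),
    (function () { return 0; }).bind(this));
}

var receiver = { name: "receiver" };

// True only if every rewrite matches F for all four kinds of call.
var agree = [F2, F3, F4].every(function (Fn) {
  return Fn(true) === F(true) &&
         Fn(false) === F(false) &&
         Fn.call(receiver, true) === F.call(receiver, true) &&
         Fn.call(receiver, false) === F.call(receiver, false);
});
console.log(agree);
```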

Therefore, I'd prefer either #b or #c to #a. Between #b and #c I don't
have a strong preference, but I think I prefer #b (it's safer) to #c
(it more clearly preserves Tennent Correspondence and is generally
easier to reason about). But I'd be happy with either.

This question needs to be settled for the ES3.1 spec, and so has a
greater urgency than many of our other recent discussions.

-- 
    Cheers,
    --MarkM

# Douglas Crockford (17 years ago)

Mark S. Miller wrote:

> Therefore, I'd prefer either #b or #c to #a. Between #b and #c I don't
> have a strong preference, but I think I prefer #b (it's safer) to #c
> (it more clearly preserves Tennent Correspondence and is generally
> easier to reason about). But I'd be happy with either.
> 
> This question needs to be settled for the ES3.1 spec, and so has a
> greater urgency than many of our other recent discussions.

I share your discomfort with #a. I prefer #c because it gives the developer the 
option to do stuff like

     if (this === undefined) {
         throw "Did you forget to use the new operator when calling F?";
     }

That would allow a library developer to provide more useful diagnostic messages. 
That is a low-value advantage, but I think it does break the tie. I don't see #b 
being any safer.
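
A slightly fuller sketch of that diagnostic, assuming option #c semantics (as in ES5-era strict mode). The `Counter` constructor and its error message are hypothetical, chosen only to illustrate the check; throwing an Error object instead of a string is likewise a choice made for this sketch.

```javascript
// Hypothetical constructor that detects a missing 'new'.
function Counter(start) {
  "use strict";
  if (this === undefined) {
    // Under #c the function is entered with 'this' undefined, so the
    // library author gets to choose the message.
    throw new TypeError("Counter must be called with the new operator");
  }
  this.value = start;
}

var ok = new Counter(3);

var message = null;
try {
  Counter(3); // 'new' forgotten
} catch (e) {
  message = e.message;
}
console.log(ok.value, message);
```

Under option #b the call without `new` would fail before the body runs, so the custom message would never be produced.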

# David Flanagan (17 years ago)

Douglas Crockford wrote:
> Mark S. Miller wrote:
> 
>> Therefore, I'd prefer either #b or #c to #a. Between #b and #c I don't
>> have a strong preference, but I think I prefer #b (it's safer) to #c
>> (it more clearly preserves Tennent Correspondence and is generally
>> easier to reason about). But I'd be happy with either.
>>
>> This question needs to be settled for the ES3.1 spec, and so has a
>> greater urgency than many of our other recent discussions.
> 
> I share your discomfort with #a. I prefer #c because it gives the developer the 
> option to do stuff like
> 
>      if (this === undefined) {
>          throw "Did you forget to use the new operator when calling F?";
>      }
> 

Coming at this from a non-theoretical perspective, I'd say that choice c 
is the one that makes intuitive sense, and the one that is easy to 
explain (say, if one were writing a book about this language).  If 
safety and theoretical concerns don't dictate a choice, I'd argue that 
common sense suggests c.

	David Flanagan

# Brendan Eich (17 years ago)

#c wins by maximizing TC purity, utility, and safety.

/be

# Erik Arvidsson (17 years ago)

On Fri, Aug 29, 2008 at 22:51, Brendan Eich <brendan at mozilla.org> wrote:
> #c wins by maximizing TC purity, utility, and safety.
>
> /be

d) Lexical scoping

I thought one of the points with changing *this* in strict mode was to
make it use lexical scoping where it "makes sense"?  I do agree that
preventing access to global is a noble goal and I hope we can still
achieve that.  My suggestion is to use lexical scoping and special
case *this* as *undefined* if lexical scoping would make *this* the
global object.

"use strict";

var o1 = {
  f: function() {
    function g() {
      print(this);
    }
    g();
  },
  toString: function() {
    return 'o1';
  }
};

o1.f(); // prints o1

function f() {
  print(this);
}

f(); // prints undefined since this would be lexically resolved to [[Global]]
print(this); // prints [[Global]], since direct access.

var o2 = {
  toString: function() {
    return 'o2';
  }
};

o2.f = f;
o2.f(); // prints o2 since this is bound by property lookup

It might be a bit tricky to spec when *this* as [[Global]] needs to be
changed to *undefined*
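
To make the difference concrete, here is a rough sketch of the o1 case. It assumes a strict mode that follows option #c, where the plain inner call sees undefined. The explicit `bind(this)` approximates what the lexical proposal would do automatically for the inner function. The `plain` and `bound` property names are hypothetical, and the results are returned as strings rather than printed.

```javascript
var o1 = {
  f: function () {
    "use strict";
    function g() {
      return String(this);
    }
    return {
      plain: g(),             // no receiver: option #c leaves 'this' undefined
      bound: g.bind(this)()   // approximates the proposed lexical 'this'
    };
  },
  toString: function () {
    return "o1";
  }
};

var seen = o1.f();
console.log(seen.plain, seen.bound);
```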

If we are changing *this* we need to realize that *this* in local
functions is a big issue for people learning JS and even experienced
programmers make that mistake once in a while.  Changing *this*
without fixing the most important issue people have with *this* would
be a mistake.

-- 
erik

# Mark S. Miller (17 years ago)

On Sat, Aug 30, 2008 at 2:02 PM, Erik Arvidsson
<erik.arvidsson at gmail.com> wrote:
> On Fri, Aug 29, 2008 at 22:51, Brendan Eich <brendan at mozilla.org> wrote:
>> #c wins by maximizing TC purity, utility, and safety.
>>
>> /be
>
> d) Lexical scoping
>
> I thought one of the points with changing *this* in strict mode was to
> make it use lexical scoping where it "makes sense"?

I agree with all the goals you state. There was an earlier proposal,
made by Crock and agreed to before March by the overall committee
(both the 3.1 and 4 factions). It was even somewhat orthogonal to
strict-mode, and would have made the non-strict language more
intuitive as well. However, at the March meeting I presented something
like the following example which demonstrates a fatal ambiguity:

    function PointMaker() {
      this.count = 0;
      var that = this;

      function Point(x, y) {
        this.x = x;
        this.y = y;
        that.count++;
      }
      Point.prototype.getX = function() { return this.x; }
      Point.prototype.setX = function(x) { this.x = x; }

      this.Point = Point;
    }

// Each PointMaker is a factory for new points that counts how many
// points it made:
var pointMaker = new PointMaker();

foo(new (pointMaker.Point)(3,5));

// How many has it made so far?
print(pointMaker.count); // 1

If foo() only uses the point it's been given in the expected ways, all
is cool. But what if foo() says:

function foo(pt) {
  var setX = pt.setX;
  setX(44);
}

or equivalently

function foo(pt) {
  (true && pt.setX)(44);
}

Under Crock's proposed rule, this will bring about the effect of

    pointMaker.x = 44;

even though one would have thought foo() had no access path whatsoever
to pointMaker. In other words, a lexical-this-defaulting rule as Crock
proposed, and as I gather you seem to be proposing, would cause the
'this' in setX to be bound to the lexically enclosing 'this' when setX
is called as a function. This violates its author's intent in the same
way that the non-strict global-this-capture does. The only difference
is that inappropriate access is being provided to pointMaker rather
than the global object.
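
The hazard can be reproduced in a sketch by emulating the defaulting rule by hand. `withLexicalDefault` is a hypothetical helper that stands in for the proposed language rule: when a function is called with no receiver, it substitutes the enclosing `this`. The `useLexicalDefault` flag is also an assumption of this sketch, added so that the same factory can be run with and without the rule. The comparison case assumes option #c semantics as found in ES5-era strict mode.

```javascript
// Hypothetical emulation of a "default to the lexically enclosing this" rule.
function withLexicalDefault(fn, lexicalThis) {
  "use strict";
  return function () {
    var receiver = (this === undefined || this === null) ? lexicalThis : this;
    return fn.apply(receiver, arguments);
  };
}

function PointMaker(useLexicalDefault) {
  "use strict";
  this.count = 0;
  var that = this;

  function Point(x, y) {
    this.x = x;
    this.y = y;
    that.count++;
  }
  var setX = function (x) { this.x = x; };
  Point.prototype.setX = useLexicalDefault
    ? withLexicalDefault(setX, this)
    : setX;

  this.Point = Point;
}

// foo is handed only a point, yet extracts and calls a method as a function.
function foo(pt) {
  var setX = pt.setX;
  setX(44);
}

// With the emulated rule, foo writes to the factory it was never given.
var leaky = new PointMaker(true);
foo(new leaky.Point(3, 5));

// Without it (option #c), the same call fails instead of leaking.
var strictMaker = new PointMaker(false);
var outcome;
try {
  foo(new strictMaker.Point(3, 5));
  outcome = "no error";
} catch (e) {
  outcome = e instanceof TypeError ? "TypeError" : "other error";
}
console.log(leaky.x, outcome);
```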

If a lexical this-capture rule can be made to work without these kinds
of problems, it would indeed be better than having to explain
".bind(this)" to ES programmers. Suggestions?

-- 
    Cheers,
    --MarkM