'this' is more complicated than I thought (PropertyReferences break equivalences)

# Claus Reinke (14 years ago)

Like most Javascript programmers, I have tended to follow a simple rule for functions using 'this': eta-expand method selections, use .bind, or get into trouble.

Then I got curious about how method calls determine what object to pass as 'this': a method is a function selected from an object, and functions are first-class values, so, by the time they get called, how do we know where they came from?

So I looked into the spec, and things deteriorated from there.

I would be interested in the rationale for the current specification of PropertyReferences, as it seems to invalidate a large class of program equivalences (see below for examples).

Status:

According to the Ecmascript spec (11.2.1), property accessors return not the selected value but a Reference (8.7), which is a combination of the object selected from and the name of the property being selected (the property value is not stored immediately, but selected later when calling GetValue on such a PropertyReference).

Function calls (11.2.3) then construct 'this' from the object in a PropertyReference, and the whole Reference concept, as far as its use for 'this' is concerned, seems finely tuned to mimic a piece of syntax (conserve the base object info just long enough to use it for 'this'), rather than a semantic value (trying to pass around PropertyReferences is likely to end up calling GetValue, losing the reference info).

Question 1: Should a Reference hold on to the current property value?

Currently, there seems to be no way to store a property Reference without GetValue getting called, so there is no window for changing the property value such a Reference refers to behind its back. That would no longer be true if References could be passed around as language values.

Question 2: If a Reference allows us to recover 'obj' from 'obj.method', why does this information have to get lost when passing it through a variable binding?

    obj.longish()    // correct 'this'
    var short = obj.longish; // try to define a shorthand
    short(); // oops, wrong 'this'

This seems to be a very popular mistake - most beginners seem to get burned once. Naively, one thinks of selection losing the information, but that does not seem to be the case. So could this error source be eliminated by passing the Reference as a value, instead of only the value component without the base object, one step further?
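
As a minimal sketch of the rule of thumb from the opening paragraph (eta-expand or use .bind), the following compares the detached shorthand with the two workarounds. The names `shorthand`, `viaBind` and `viaWrapper` are made up for illustration, and `.bind` assumes an ES5 engine.

```javascript
var obj = {
  // Reports whether the call received obj as 'this'.
  longish: function () { return this === obj; }
};

var shorthand = obj.longish;                 // only the function value is stored
var viaBind = obj.longish.bind(obj);         // 'this' is fixed by bind
var viaWrapper = function () {               // eta-expansion: select again at call time
  return obj.longish();
};

var direct = obj.longish();                  // called through the property accessor
var detached = shorthand();                  // base object is gone
var bound = viaBind();
var wrapped = viaWrapper();
```

Only the detached call loses the base object; both workarounds put `obj` back at the call site, either once (bind) or on every call (the wrapper).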

Question 3: It seems that trying to reuse References for 'this' forces early calls to GetValue (because users should not have to call 'obj.method.valueOf()' or 'obj.property.valueOf()' to trigger the delayed selection, and because the property value is not stored in the PropertyReference).

This loses information that we would like to hold on to - if we really cannot solve this for References in general, why not store the 'this'-candidate in the function instead (similar to the fairly new '[[boundThis]]')?

That might make it possible to limit the equivalence breakage with respect to determining 'this'.

Currently, searching for 'Reference' in the spec gives an uneven picture. For instance, the spec claims that

    The Reference Specification Type is used to explain the
    behaviour of such operators as delete, typeof, and the
    assignment operators

However, References are also used to determine the value of 'this' for function calls, and most expressions/operations/ variable bindings/function calls cannot pass through References, returning only their values instead. This information is spread over too many pages - it could be summarized in the section on References.

Question 4: (general version of question 2) Why is the origin information in References lost so easily?

It seems that most parts of the spec require GetValue() by default, with few exceptions. What would go wrong if the available information were passed on instead (isn't it sufficient for the final consumers to call GetValue(), provided that the original property value is stored in the PropertyReference, to avoid interference)?

Broken equivalences:

It is not too surprising that eta-conversion does not hold

    obj.method <-/-> function() { return obj.method(); }

although usually, the problems are with termination, side-effects, or type errors, while in this case, the problem seems to be context-sensitive: many contexts treat References differently. That can be confusing.

Currently, quite a few code transformations are not valid if the code involves References and might be used in the context of a function call (MemberExpression/CallExpression). Here are some examples:

    var x = obj.m; x(); <-/-> obj.m();

    obj.m.valueOf() <-/-> obj.m

    (function(){ return obj.m; }()) <-/-> obj.m

    (x = obj.m) <-/-> obj.m
        // where x is unused

    [obj.m][0] <-/-> obj.m
        // obvious in hindsight?
        // but really surprising the first time

    {tmp: obj.m}.tmp <-/-> obj.m
        // this explains the one above

    (0,obj.m) <-/-> obj.m

    (true && obj.m) <-/-> obj.m

    (false || obj.m) <-/-> obj.m

    (true ? obj.m : obj.m) <-/-> obj.m
        // Firefox 3.6.11 wrongly optimizes this one
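
The list above can be checked with a small sketch like the following, which calls the result of each expression and records whether the callee saw `obj` as 'this'. The method body and the names `keepsThis` and `losesThis` are illustrative assumptions; a plain parenthesized accessor is included as a contrast case, since grouping does not call GetValue.

```javascript
var obj = { m: function () { return this === obj; } };
var x;

// Forms that still hand a property Reference to the call.
var keepsThis = {
  direct: obj.m(),
  parenthesized: (obj.m)()
};

// Forms from the list: each one goes through GetValue (or a new base object)
// before the call happens.
var losesThis = {
  variable: (function () { var f = obj.m; return f(); })(),
  valueOf: obj.m.valueOf()(),
  iife: (function () { return obj.m; }())(),
  assignment: (x = obj.m)(),
  array: [obj.m][0](),              // 'this' is the temporary array
  object: ({ tmp: obj.m }).tmp(),   // 'this' is the temporary object
  comma: (0, obj.m)(),
  and: (true && obj.m)(),
  or: (false || obj.m)(),
  conditional: (true ? obj.m : obj.m)()
};
```

In a conforming engine every entry of `keepsThis` is true and every entry of `losesThis` is false.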

I find the result very confusing: not only will a method lose its Reference just by passing it around, but it will pick up a new Reference by passing it through any kind of Object. This picking-up-new-Reference is probably needed for mixins (copying methods from one object to another), but it means that storing naked method References in Arrays is not recommended.

I was not aware that just about any code transformation would be invalidated by the handling of References. If this cannot be fixed, could the specification be more explicit about this, please?

Claus

# Lasse Reichstein (14 years ago)

The behavior of References isn't as arbitrary or different from other languages as it might seem. It's really a way to specify l-values.

When you assign to an object property, e.g., "o.x = 42", the l-value here is the "x" property of the "o" object. We need to capture both.

In other languages, e.g., C or Java, you have the same problem:

    int x = something.prop;
    x = 10;

is not the same as

    something.prop = 10;

In C/C++, you know that the something.prop on the right-hand side of an assignment means something slightly different from the one on the left-hand side. We have the same thing in ECMAScript:

    var x = something.prop;
    x = 10;

is not the same as

    something.prop = 10;

So far, References as a specification mechanism are just following other languages, and behaving exactly as any seasoned programmer would expect. Try checking your questions against the expected behavior if a Reference is just an l-expression.

Where it differs is function/method calls. The traditional languages that I compared to do not have functions as first-class values. If a function sits on an object, you can't extract it and call it without such an object.

Well, there are C++ method pointers which stay bound to the object they were extracted from (but don't try to guess the size of one, they are probably bigger than whatever you might think is necessary for that). And they are different from pointers to static functions.

The binding of "this" when calling a Reference value mimics method calls. It does so fine when you treat objects as objects, but not when you try to extract a method from its object (try doing that in Java!). It's a shallow abstraction, but it does work when you play along with it.

A Reference is purely a specification tool that doesn't have to exist in any form inside an actual implementation. If we start exposing it, we would require implementations to take steps they might not need in order to visibly create and pass around such a reference.

If you really need user-level references, you can create them yourself, and just do

    var ref = new Reference(object, "prop");
    var val = ref.GetValue();
    ref.SetValue(val + 10);
    ref.SetValue(someFunction);
    ref.call(arg1, arg2);
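
The snippet above leaves `Reference` undefined. One possible user-level definition, as a sketch only, could look like this; the constructor and its `GetValue`, `SetValue` and `call` methods simply mirror the names in the post and are not part of any standard library, and the `counter` object is a made-up example.

```javascript
// Hypothetical user-level Reference: remembers the base object and property name
// and performs the lookup only when asked.
function Reference(base, name) {
  this.base = base;
  this.name = name;
}
Reference.prototype.GetValue = function () {
  return this.base[this.name];
};
Reference.prototype.SetValue = function (value) {
  this.base[this.name] = value;
};
Reference.prototype.call = function () {
  // Look up the current property value and call it with the base as 'this'.
  var fn = this.GetValue();
  return fn.apply(this.base, arguments);
};

var counter = {
  count: 1,
  describe: function (label) { return label + this.count; }
};

var countRef = new Reference(counter, "count");
countRef.SetValue(countRef.GetValue() + 10);

var describeRef = new Reference(counter, "describe");
var described = describeRef.call("count=");
```

Because the lookup is delayed until `GetValue` or `call`, such an object always sees the property's current value, which is one possible answer to Question 1.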

On Mon, 11 Apr 2011 10:55:49 +0200, Claus Reinke <claus.reinke at talk21.com> wrote:

> Like most Javascript programmers, I have tended to follow a simple rule for functions using 'this': eta-expand method selections, use .bind, or get into trouble.
>
> Then I got curious about how method calls determine what object to pass as 'this': a method is a function selected from an object, and functions are first-class values, so, by the time they get called, how do we know where they came from?
>
> So I looked into the spec, and things deteriorated from there.
>
> I would be interested in the rationale for the current specification of PropertyReferences, as it seems to invalidate a large class of program equivalences (see below for examples).
>
> Status:
>
> According to the Ecmascript spec (11.2.1), property accessors return not the selected value but a Reference (8.7), which is a combination of the object selected from and the name of the property being selected (the property value is not stored immediately, but selected later when calling GetValue on such a PropertyReference).
>
> Function calls (11.2.3) then construct 'this' from the object in a PropertyReference, and the whole Reference concept, as far as its use for 'this' is concerned, seems finely tuned to mimic a piece of syntax (conserve the base object info just long enough to use it for 'this'), rather than a semantic value (trying to pass around PropertyReferences is likely to end up calling GetValue, losing the reference info).
>
> Question 1: Should a Reference hold on to the current property value?
>
> Currently, there seems to be no way to store a property Reference without GetValue getting called, so there is no window for changing the property value such a Reference refers to behind its back. That would no longer be true if References could be passed around, as language values.

Which value? The one the property (if it existed) had when the reference was created? Or the current one - if the Reference survives for any amount of time, the object property could change its value in the meantime. What if it's a getter property? What if it's a setter property with no getter?

If you make a reference a first-class value, then you probably don't want to make too many assumptions about how it's used. Don't read a value from it unless the user wants to do so.
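
To make the getter concern concrete, here is a small sketch under assumed names (`holder`, `eager`, `lazy`): a snapshot taken when the reference is created and a lookup performed later can disagree, and reading early also triggers the getter's side effect.

```javascript
var reads = 0;
var holder = {
  // Getter with a visible side effect: every read bumps the counter.
  get stamp() { reads += 1; return reads; }
};

// "Eager" reference: stores the value at creation time (this already runs the getter).
var eager = { base: holder, name: "stamp", value: holder.stamp };

// "Lazy" reference: stores only base and name, reads when asked.
var lazy = { base: holder, name: "stamp" };
var later = lazy.base[lazy.name];
```

Here `eager.value` is the first value the getter produced, while `later` is the second, so the two kinds of reference are observably different.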

> Question 2: If a Reference allows us to recover 'obj' from 'obj.method', why does this information have to get lost when passing it through a variable binding?
>
>     obj.longish()    // correct 'this'
>     var short = obj.longish; // try to define a shorthand
>     short(); // oops, wrong 'this'
>
> This seems to be a very popular mistake - most beginners seem to get burned once. Naively, one thinks of selection losing the information, but that does not seem to be the case. So could this error source be eliminated by passing the Reference as a value, instead of only the value component without the base object, one step further?

Of course it's possible, but personally I prefer to have the "this" object obvious in the call line. That way I know what object the method is being called on. Without it, the loss of context is in the source code, making it harder to read and maintain.

> Question 3: It seems that trying to reuse References for 'this' forces early calls to GetValue (because users should not have to call 'obj.method.valueOf()' or 'obj.property.valueOf()' to trigger the delayed selection, and because the property value is not stored in the PropertyReference).

I'm not sure I understand what the problem is here.

> This loses information that we would like to hold on to - if we really cannot solve this for References in general, why not store the 'this'-candidate in the function instead (similar to the fairly new '[[boundThis]]')?

Won't work. The same function can be used in many places at the same time. E.g.

    [obj1.foo, obj2.foo][(Math.random() * 2) | 0]();
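
A short sketch of why a single stored 'this'-candidate cannot cover this case: the objects `obj1` and `obj2` and the `id` property are assumptions for illustration, but they share one function object, so there is no single owner the function itself could remember.

```javascript
function foo() { return this.id; }

var obj1 = { id: 1, foo: foo };
var obj2 = { id: 2, foo: foo };

// One function object, reachable through two different base objects.
var sameFunction = obj1.foo === obj2.foo;

// The base object at each call site decides 'this'.
var ids = [obj1.foo(), obj2.foo()];
```

Whichever object were stored in the function, at least one of the two calls above would get the wrong 'this'.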

> That might allow to limit the equivalence breakage with respect to determining 'this'.
>
> Currently, searching for 'Reference' in the spec gives an uneven picture. For instance, the spec claims that
>
>     The Reference Specification Type is used to explain the behaviour of such operators as delete, typeof, and the assignment operators
>
> However, References are also used to determine the value of 'this' for function calls, and most expressions/operations/variable bindings/function calls cannot pass through References, returning only their values instead. This information is spread over too many pages - it could be summarized in the section on References.
>
> Question 4: (general version of question 2) Why is the origin information in References lost so easily?
>
> It seems that most parts of the spec require GetValue() by default, with few exceptions. What would go wrong if the available information would be passed on instead (isn't it sufficient for the final consumers to call GetValue(), provided that the original property value is stored in the PropertyReference, to avoid interference)?

References are a specification tool. If one survived for an extended amount of time, and visibly so, implementations would have to actually implement something to represent it. As it is now, a reference is found and immediately consumed, which allows implementations to never create it at all, and work directly on the value in r-value contexts, and on the object and property in l-value contexts.

> Broken equivalences:
>
> It is not too surprising that eta-conversion does not hold
>
>     obj.method <-/-> function() { return obj.method(); }
>
> although usually, the problems are with termination, side-effects, or type errors, while in this case, the problem seems to be context-sensitive: many contexts treat References differently. That can be confusing.
>
> Currently, quite a few code transformations are not valid if the code involves References and might be used in the context of a function call (MemberExpression/CallExpression). Here are some examples:
>
>     var x = obj.m; x(); <-/-> obj.m();
>
>     obj.m.valueOf() <-/-> obj.m

Why should that work? The valueOf function isn't guaranteed to return anything related to the object it's on.

>     (function(){ return obj.m; }()) <-/-> obj.m
>
>     (x = obj.m) <-/-> obj.m
>         // where x is unused
>
>     [obj.m][0] <-/-> obj.m
>         // obvious in hindsight?
>         // but really surprising the first time
>
>     {tmp: obj.m}.tmp <-/-> obj.m
>         // this explains the one above
>
>     (0,obj.m) <-/-> obj.m
>
>     (true && obj.m) <-/-> obj.m
>
>     (false || obj.m) <-/-> obj.m
>
>     (true ? obj.m : obj.m) <-/-> obj.m
>         // Firefox 3.6.11 wrongly optimizes this one
>
> I find the result very confusing: not only will a method lose its Reference just by passing it around, but it will pick up a new Reference by passing it through any kind of Object. This picking-up-new-Reference is probably needed for mixins (copying methods from one object to another), but it means that storing naked method References in Arrays is not recommended.

This is exactly correct. Extracting a method from its object will break the connection to the object. Which is kind of expected when you allow any function to be used as both a method and a non-method.

> I was not aware that just about any code transformation would be invalidated by the handling of References. If this cannot be fixed, could the specification be more explicit about this, please?

When I think of References as l-values, the current behavior actually becomes the expected one. The only tricky bit is that method calls actually need an l-value to work correctly.

/L
-- 
Lasse Reichstein - reichsteinatwork at gmail.com

# Garrett Smith (14 years ago)

On 4/11/11, Claus Reinke <claus.reinke at talk21.com> wrote:

> Like most Javascript programmers, I have tended to follow a simple rule for functions using 'this': eta-expand method selections, use .bind, or get into trouble.

That is unnecessary and inefficient. Instead, I use the following algorithm:

For instance methods, always call with the base object or with call/apply. Don't use this in methods that are to be called as static, so you can use variable shortcuts for those static methods, pass them around.

// DONT DO THIS var StyleUtils = { HAS_COMPUTED_STYLE : (function() { /.../ return true; })(), getStyle : function(el, name) { // FAILED STRATEGY, this in static context. if(this.HAS_COMPUTED_STYLE) { return "worked"; } return "didn't work"; } };

That most JavaScript programmers like to bind every function says more about trends in JavaScript programming than about JavaScript.

Then I got curious about how method calls determine what object to pass as 'this': a method is a function selected from an object, and functions are first-class values, so, by the time they get called, how do we know where they came from?

The base object.

var o = { m : function(){ alert( this == o ); } }; o.m(); // true, o is base object

var f = o.m; f(); // false.

Calling f() results false because the base object is a declarative environment record (called VariableObject in ES3). And when that happens, the this value is either global object or null in ES5 in some cases.

On 4/11/11, Claus Reinke <claus.reinke at talk21.com> wrote:
> Like most Javascript programmers, I have tended to follow
> a simple rule for functions using 'this': eta-expand method
> selections, use .bind, or get into trouble.
>
That is unnecessary and inefficient. Instead, I use the following algorithm:

For instance methods, always call with the base object or with
call/apply. Don't use `this` in methods that are to be called as
static, so you can use variable shortcuts for those static methods,
pass them around.

// DONT DO THIS
var StyleUtils = {
  HAS_COMPUTED_STYLE : (function() { /*...*/ return true; })(),
  getStyle : function(el, name) {
    // FAILED STRATEGY, `this` in static context.
    if(this.HAS_COMPUTED_STYLE) {
      return "worked";
    }
    return "didn't work";
  }
};

That most JavaScript programmers like to bind every function says more
about trends in JavaScript programming than about JavaScript.

> Then I got curious about how method calls determine what
> object to pass as 'this': a method is a function selected from
> an object, and functions are first-class values, so, by the time
> they get called, how do we know where they came from?
>
The base object.

var o = {
  m : function(){ alert( this == o ); }
};
o.m(); // true, o is base object

var f = o.m;
f(); // false.

Calling f() yields false because the base object is a declarative
environment record (called VariableObject in ES3). When that
happens, the `this` value is the global object, or undefined in ES5
strict mode code.
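
A minimal sketch of the options for calling a detached function with the intended base object, reusing the shape of the example above. It returns the comparison instead of calling `alert`, so that it also runs outside a browser.

```javascript
var o = {
  m: function () { return this === o; }
};

var f = o.m;                    // GetValue: only the function is stored

var detached = f();             // false: no base object on the call
var viaCall  = f.call(o);       // true: `this` supplied explicitly
var viaApply = f.apply(o, []);  // true: same, with an argument array
var viaBind  = f.bind(o)();     // true: ES5 bind fixes `this` up front

console.log(detached, viaCall, viaApply, viaBind);
```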
-- 
Garrett

# Garrett Smith (14 years ago)

On 4/11/11, Claus Reinke <claus.reinke at talk21.com> wrote:
> Like most Javascript programmers, I have tended to follow
> a simple rule for functions using 'this': eta-expand method
> selections, use .bind, or get into trouble.
>
That is unnecessary, inefficient, and adds clutter.

That most JavaScript programmers do that says more about trends in
JavaScript programming than it does about the language.

> Then I got curious about how method calls determine what
> object to pass as 'this': a method is a function selected from
> an object, and functions are first-class values, so, by the time
> they get called, how do we know where they came from?
>
The base object.

Follow these two rules to greatly reduce `this` reference confusion.

1. For instance methods (such as prototype methods), either qualify
the method call with the base object or use call/apply. For example,
var x = new X;
x.m(); // Qualified instance method
// DONT DO THIS
var m = x.m;
m();

2. For static methods, write them so that they never use `this`. For
example, here is a static method `getStyle` that violates
that rule and uses `this`:

// DONT DO THIS
var StyleUtils = {
  HAS_COMPUTED_STYLE : (function() { /*...*/})(),
  getStyle : function(el, name) {
    // Problem: Use of `this` in static method.
    if(this.HAS_COMPUTED_STYLE) {
    }
  }
};

By never using `this` in static methods, it can be assured that they
can be aliased with a variable and passed around, e.g. `var getStyle =
StyleUtils.getStyle`.
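
For contrast, a sketch of the same utility written to follow rule 2. It assumes the fix is simply to refer to the namespace object by name. The `true` value is a placeholder standing in for the elided feature test, and the return strings are borrowed from the earlier version of the example.

```javascript
var StyleUtils = {
  // Placeholder: the real value would come from a feature test.
  HAS_COMPUTED_STYLE: true,
  getStyle: function (el, name) {
    // No `this`: the namespace is referenced by name, so the function
    // behaves the same however it is called.
    if (StyleUtils.HAS_COMPUTED_STYLE) {
      return "worked";
    }
    return "didn't work";
  }
};

var getStyle = StyleUtils.getStyle;     // alias, base object dropped
var aliased = getStyle(null, "color");  // still sees HAS_COMPUTED_STYLE

console.log(aliased);
```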
-- 
Garrett

# Claus Reinke (14 years ago)

> The behavior of References isn't as arbitrary or different from 
> other languages as it might seem. It's really a way to specify 
> l-values.

Not arbitrary, but different, and quite drastically so (as far as
usage is concerned). Your remarks helped me to pin down the 
difference (and eliminated two of my questions, thanks!-): 
References are l-values, but they cannot be used as such, 
due to forced, implicit conversion curtailing their lifetimes.

The difference between References and general l-values
(_values_ that represent locations where other values may be
stored) lies in how they may be used and how long they live:

- l-values are first-class values: they can be passed around, 
    assigned to variables, stored in data structures; they happen 
    to support a de-referencing operation, but merely evaluating
    an l-value does not de-reference it; l-values can be 
    de-referenced explicitly; some languages implicitly coerce 
    l-values into r-values (causing de-reference) depending on
    usage context (this is where the names come from: values 
    on the left and right hand sides of assignments), but even 
    those languages tend to provide means to control when 
    coercions take place

- References start out as l-values, but don't live long enough
    to be used as such. They cannot be passed around, stored 
    in data structures, or assigned to variables; any attempt to 
    evaluate them leads to immediate de-reference, no matter 
    whether the usage context expects an l-value or not; there
    is no way to prevent the implicit de-reference

It is mostly the implicit coercion in evaluation, combined with
the early evaluation inherent in Javascript's call-by-value 
semantics, that breaks those equivalences.

> In other languages, e.g., C or Java, you have the same problem:
>   int x = something.prop;
>   x = 10;
> is not the same as
>   something.prop = 10;

My C has been buried for too long, but were not l-values
one of C's showcases? Something like this

    int *x = &(something.prop);
    *x = 10;

should work, by making x hold l-values (in case I messed up 
the syntax beyond recognition: x should be a pointer to int,
its value being the address of something.prop, so we can use
the r-value of x as an l-value in the second assignment).

C allowed us to be explicit about whether we wanted l-values
or r-values, overriding the default conversions when necessary.
Some later languages, such as Haskell or Standard ML, dropped
the implicit coercions entirely, so all de-references are explicit.

ECMAScript relies on implicit de-reference, but triggers that by
every evaluation. So we don't have explicit de-reference, we do
not have C's flexibility for explicit de-reference control, and we
do not even have C's context-sensitive implicit de-reference.

Which means that things like

  (1 ? obj.prop : obj.prop) = 3;
  (0, obj.prop) = 2;

will work in C++ (plain C treats the results of '?:' and ',' as
r-values), but fail in ES (References are very short-lived
l-values - every operation evaluates and de-references them,
independent of whether the result is going to be used in an
l-value or r-value context).
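
The ES side of this can be probed without aborting the whole script. This is a sketch only: the helper name `assignmentWorks` is made up for illustration, and it compiles each statement with `new Function`, so a rejection is caught whether the engine reports it early or at run time.

```javascript
// Hypothetical helper: true if `src` runs and changes obj.prop,
// false if the engine rejects the assignment target.
function assignmentWorks(src) {
  var target = { prop: 0 };
  try {
    new Function("obj", src)(target);
    return target.prop !== 0;
  } catch (e) {
    return false;
  }
}

var plain       = assignmentWorks("obj.prop = 1;");
var grouped     = assignmentWorks("(obj.prop) = 2;");
var conditional = assignmentWorks("(1 ? obj.prop : obj.prop) = 3;");
var comma       = assignmentWorks("(0, obj.prop) = 2;");

console.log(plain, grouped, conditional, comma);
```

Only the plain and the merely parenthesized forms are accepted. The conditional and comma forms are rejected because their results are no longer References.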

Also, in C we can write

  x = &(obj.prop);
  *x = 4;

to express that we want x to hold l-values, and storing l-values 
in arrays isn't much different

  int *a[1] = { &obj.prop };
  *(a[0]) = 1;

In ES, we only have the default-to-r-value path. For instance,

  [obj.prop][0] = 1; 

will not use obj.prop as an l-value, and there does not seem 
to be a straightforward alternative for programmers who 
want to work with ES References as l-values.
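
One workaround, sketched here under the assumption that an explicit wrapper is acceptable, is to package the base object and the property name by hand. The helper `ref` and its `get`/`set`/`call` members are hypothetical names, not part of any spec or library.

```javascript
// Hypothetical helper: holds a base object and a property name, roughly
// the two components a spec-level property Reference carries.
function ref(base, name) {
  return {
    get: function () { return base[name]; },        // explicit de-reference
    set: function (value) { base[name] = value; },  // explicit assignment
    call: function () {                             // call with base as `this`
      return base[name].apply(base, arguments);
    }
  };
}

var obj = {
  prop: 0,
  m: function () { return this === obj; }
};

var a = [ref(obj, "prop")];  // the "l-value" survives being stored in an array
a[0].set(1);                 // writes through to obj.prop
var propAfter = obj.prop;

var viaRef = ref(obj, "m").call();  // the base object is preserved

console.log(propAfter, viaRef);
```

Every de-reference is spelled out here, which is the opposite of the implicit conversion described above.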
 
> So far, References as a specification mechanism is just 
> following other languages, and behaving exactly as any 
> seasoned programmer would expect. 

Does the above explain why a seasoned programmer
might reasonably expect differently, because ECMAScript 
behaves differently from other languages?

> The binding of "this" when calling a Reference value 
> mimics method calls. It does so fine when you treat objects 
> as objects, but not when you try to extract a method from 
> its object .. It's a shallow abstraction, but it does work when 
> you play along with it.

I was trying to point out that the mechanism works for the
simple case and is known to confuse programmers for other 
cases. In particular, I am trying to find out whether the current 
mechanism is a special case of a more complete mechanism,
one that works equally well for simple and non-simple cases.

Since PropertyReferences hold the object the method was
selected from, all that seems needed is to make References
survive evaluation, i.e., make References first-class values.

An alternative would be to preserve context-information
during evaluation: if the result of '(?:)' is to be used as an
l-value, then evaluation should perhaps not de-reference
the l-values in the conditional.

I am less concerned with being able to use '(?:)' on the
left hand side of assignments, and more with being able
to use equivalences like '(true ? x : x) <--> x', independent
of where the expression occurs. Since we can write such
conditionals on left hand sides, why not make sure that
they actually work there?

>> Question 1:     Should a Reference hold on to the current 
>>                        property value?
> 
> Which value? The one the property (if it existed) had when 
> the reference was created? Or the current one - if the 
> Reference survives for any amount of time, the object 
> property could change its value in the meantime.

Thanks. So the answer probably is 'no' - which value we get 
depends on when we look behind the reference. 

I guess I was confused by property accessors not actually 
accessing the property - once the decision is made to return 
a Reference instead of the property value, it would only be
consistent to keep Reference construction and de-reference
separate, as long as we are able to specify when to pass the 
reference and when to look up the value behind it.

>> Question 2:
>>     If a Reference allows us to recover 'obj' from 'obj.method',
>>     why does this information have to get lost when passing 
>>     it through a variable binding?
>>     .. could this error source be eliminated by passing the  
>>     Reference as a value, instead of only the value component
>>     without the base object, one step further?
> 
> Of course it's possible, but personally I prefer to have the 
> "this" object obvious in the call line. That way I know what 
> object the method is being called on. Without it, the loss of 
> context is in the source code, making it harder to read and 
> maintain.

I'm afraid you won't get that comfort;-) At the moment, 
programmers can just write the more complicated

    var short = function(x) { return obj.method(x); };
    short("hi"); // no 'this' object on the call line

We can try to make this more readable, and we can try to
eliminate a common source of bugs, but the rest is between
you and your team's coding style and style checker.
 
>> Question 3:
>>     It seems that trying to reuse References for 'this' forces
>>     early calls to GetValue (because users should not have to
>>     call 'obj.method.valueOf()' or 'obj.property.valueOf()' to  
>>     trigger the delayed selection, and because the property
>>     value is not stored in the PropertyReference).
> 
> I'm not sure I understand what the problem is here.

Probably because the description is a bit confusing. I was
trying to understand why References get eliminated early,
through calls to GetValue, and was enumerating non-reasons
before coming to my question:

>> This loses information that we would like to hold on to -     
>> if we really cannot solve this for References in general,
>> why not store the 'this'-candidate in the function instead
>> (similar to the fairly new '[[boundThis]]')?
> 
> Won't work. The same function can be used in many places 
> at the same time.

Ah, good point. If functions were constants, we could make 
copies (sharing the code, but with different this values). But
they aren't, so we need the 'this'-candidates outside the function.

> E.g.
>  [obj1.foo, obj2.foo][(Math.random() * 2) | 0]();

Note that, currently, either selection will have that 
anonymous first array as 'this', not obj1 or obj2. But
you were aware of that, right?-)
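
A small sketch of that observation, with hypothetical objects `obj1` and `obj2` whose `foo` simply returns its `this`. The array is given a name here only so that it can be compared afterwards.

```javascript
var obj1 = { foo: function () { return this; } };
var obj2 = { foo: function () { return this; } };

var fns = [obj1.foo, obj2.foo];

// The call goes through the Reference fns[i], so the array is the base.
var receiver = fns[(Math.random() * 2) | 0]();

console.log(receiver === fns, receiver === obj1, receiver === obj2);
```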
 
>> Question 4: (general version of question 2)
>>     Why is the origin information in References lost so easily?
>>
>>     It seems that most parts of the spec require GetValue() by
>>     default, with few exceptions. What would go wrong if the
>>     available information would be passed on instead (isn't
>>     it sufficient for the final consumers to call GetValue(),
>>     provided that the original property value is stored in the
>>     PropertyReference, to avoid interference)?
> 
> References is a specification tool. If it survived for an extended 
> amount of time, and visibly so, implementations would have 
> to actually implement something to represent it. As it is now, 
> a reference is found and immediately consumed, which allows 
> implementations to never create it at all, and work directly on 
> the value in r-value contexts, and on the object and property 
> in l-value contexts.

Yes, and that is my argument. L-values as first-class values is the 
common way to handle references, whether it is in C, in Haskell, 
in ML, .., ever since Strachey documented l-values in 1967, and 
probably longer than that. Once references become visible to 
programmers, one might as well support them fully. Eliminating
temporary structures is a common implementation optimization, 
not limited to References, and not a language spec concern.

>> Broken equivalences:
..
>>     var x = obj.m; x(); <-/-> obj.m();
>>
>>     obj.m.valueOf() <-/-> obj.m
> 
> Why should that work? The valueOf function isn't guaranteed 
> to return anything related to the object it's on.

The default valueOf for Function comes from Object,
where it is 'ToObject this', which for Object is the input
argument without conversion. I think..
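
This can be checked with a short sketch (again with a hypothetical `obj`): `valueOf` hands back the very same function, yet calling the result still loses the base object, because the call no longer goes through the `obj.m` Reference.

```javascript
var obj = { m: function () { return this === obj; } };

// Function.prototype has no own valueOf; it is inherited from Object.prototype.
var inherited = !Object.prototype.hasOwnProperty.call(Function.prototype, "valueOf");

var sameFunction = obj.m.valueOf() === obj.m;  // identical function object
var direct       = obj.m();                    // base object is obj
var viaValueOf   = obj.m.valueOf()();          // result of a call: no base object

console.log(inherited, sameFunction, direct, viaValueOf);
```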

> This is exactly correct. Extracting a method from its 
> object will break the connection to the object. Which is 
> kind of expected when you allow any function to be 
> used as both a method and a non-method.

I expected none of these:

- Property access is not extraction.
- Extraction is triggered by constructs that could just
    as well pass on the property accessor (and probably
    should, as the alternative leads to runtime errors).
- Extraction alone will not break the connection, only
    some forms of triggering extractions will do so.
- A new connection is established by trying to pass
    a property accessor through an Array.

>> I was not aware that just about any code transformation
>> would be invalidated by the handling of References. If 
>> this cannot be fixed, could the specification be more 
>> explicit about this, please?
> 
> When I think of References as l-values, the current behavior 
> actually become the expected one. The only tricky bit is that 
> method-calls actually need an l-value to work correctly.

Even the l-value part is unusual, as I've tried to show.

Claus

# Claus Reinke (14 years ago)

>> Like most Javascript programmers, I have tended to follow
>> a simple rule for functions using 'this': eta-expand method
>> selections, use .bind, or get into trouble.
>>
> That is unnecessary, inefficient, and adds clutter.

The problem with rules-of-thumb is that most people only
have two of those;-) I agree that knowing and understanding 
all the relevant aspects is better, and reducing the number of 
details a programmer must worry about is a good language 
design goal. But when we can't communicate all the details, 
our rules have to be simple. 

The rule above doesn't mention details, and doesn't discuss
alternatives, such as the DOM's EventListener interface, but it 
does cover your suggestions: if the function doesn't use 'this', 
there's no need to worry, and eta-expansion of method 
selections (only needed when not directly applied) ensures 
that method calls are always qualified.

    var short = function(..) { return obj.method(..); };

The problem with does-it-use-this is that it implies knowledge
about the function definition. Eta-expansion certainly adds 
clutter (but that is generic clutter which I'd like to get rid of in 
the general case), but it works whether the function uses 
'this' or not, and there is no reason for it to be inefficient. 
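
A runnable sketch of the eta-expansion above next to `.bind`, using a hypothetical `counter` object. One difference worth knowing: the wrapper looks the method up again on every call, while `bind` captures the function that was there when `bind` ran.

```javascript
var counter = {
  count: 0,
  increment: function (n) { this.count += n; return this.count; }
};

// Eta-expansion: the method call stays qualified inside the wrapper.
var viaWrapper = function (n) { return counter.increment(n); };
// bind: fixes `this` and the current function value.
var viaBind = counter.increment.bind(counter);

var r1 = viaWrapper(1);  // count becomes 1
var r2 = viaBind(1);     // count becomes 2

// Replace the method afterwards.
counter.increment = function (n) { this.count += 10 * n; return this.count; };

var r3 = viaWrapper(1);  // uses the new method: 2 + 10
var r4 = viaBind(1);     // still the old method: 12 + 1

console.log(r1, r2, r3, r4);
```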

But it doesn't really matter which rules we follow to work
around this hole, the issue is that the language spec leaves
this hole for programmers to fall into. 

Removing such sources of errors tends to be more useful 
than collecting extensive documentation about workarounds(*).

Claus

(*) Once, I used a modelling tool with extensive, carefully
    written and illustrated documentation. Using another
    tool for the same application domain that didn't _need_ 
    that kind of documentation was an interesting experience.