barrier dimension, default dimension

# David Herman (13 years ago)

Andreas pointed out [1] that the question of defaulting to undefined vs uninitialized is orthogonal to the question of read-barrier vs read-write barrier:

Even with TDZ-RBA you can have that meaning for "let x" (and that semantics would be closest to 'var'). What TDZ-RBA gives you, then, is the possibility to also assign to x before the declaration.

My characterization unnecessarily combined these two orthogonal concerns. So we have two dimensions of TDZ alternatives:

Barrier dimension:

  • RBA: There's a read barrier but assignments are allowed even before executing the let.
  • UBI: There's a read/write barrier; neither references nor assignments are allowed before executing the let.

Default dimension:

  • UNINIT: An initializer-less let statement leaves the variable uninitialized.
  • UNDEF: An initializer-less let statement sets the variable to undefined (if it's still uninitialized).

So we already agree that TDZ-RBA-UNINIT is a non-starter, and TDZ-UBI-UNINIT is incoherent (there'd be no way to use the variable!). But that does leave this other semantics, TDZ-RBA-UNDEF.

But anyway, I think we agree that this is not a desirable semantics, so it doesn't really matter.
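For readers following along, the four points in this design space can be modeled in a few lines. This is only a sketch under stated assumptions, not engine or spec code: the `Binding` class, the `UNINIT` sentinel, and the `run`, `writeFirst` and `letThenWrite` helpers are hypothetical names invented for illustration. The write barrier is modeled as "no assignment while the binding is still uninitialized", which is the reading under which TDZ-UBI-UNINIT leaves no way to use the variable.

```javascript
// Hypothetical model of a single block-scoped binding `x`; not engine code.
const UNINIT = Symbol("uninitialized");

class Binding {
  constructor(barrier, dflt) {
    this.barrier = barrier; // "RBA" (read barrier only) or "UBI" (read/write barrier)
    this.dflt = dflt;       // "UNINIT" or "UNDEF"
    this.value = UNINIT;
  }
  read() {
    // Both barrier choices reject reads of an uninitialized binding.
    if (this.value === UNINIT) throw new ReferenceError("x is uninitialized");
    return this.value;
  }
  write(v) {
    // Only UBI also rejects assignments to an uninitialized binding.
    if (this.barrier === "UBI" && this.value === UNINIT) {
      throw new ReferenceError("x is uninitialized");
    }
    this.value = v;
  }
  letWithoutInitializer() {
    // Models executing `let x;`. Under UNDEF it stores undefined,
    // but only if the binding is still uninitialized.
    if (this.dflt === "UNDEF" && this.value === UNINIT) this.value = undefined;
  }
}

// Runs a scenario against a fresh binding; returns the value read or the error name.
function run(barrier, dflt, scenario) {
  const x = new Binding(barrier, dflt);
  try {
    return scenario(x);
  } catch (e) {
    return e.name;
  }
}

// { x = 1; let x; read x }
const writeFirst = (x) => { x.write(1); x.letWithoutInitializer(); return x.read(); };
// { let x; x = 1; read x }
const letThenWrite = (x) => { x.letWithoutInitializer(); x.write(1); return x.read(); };

for (const barrier of ["RBA", "UBI"]) {
  for (const dflt of ["UNINIT", "UNDEF"]) {
    console.log(`TDZ-${barrier}-${dflt}`,
      "write first:", run(barrier, dflt, writeFirst),
      "| let then write:", run(barrier, dflt, letThenWrite));
  }
}
```

The model omits `let x = e`, which under UBI would need an initializing store that bypasses the write barrier.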

All last night I was chewing on this concept of an illegal assignment, and it's bugging me. How would you explain this error to a programmer?

{
    initialize();
    let x;
    function initialize() {
        x = f(); // error: x is uninitialized
    }
}

This looks an awful lot like it's saying "you can't assign to x because x hasn't been assigned to." To understand the distinction, programmers have to learn the rule that only the syntactic initializer is allowed to initialize the variable.

Put differently, a write barrier strikes me as really odd.
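As a point of reference, here is the same shape of program wrapped so that it can be run in an engine that implements `let` as later standardized in ES2015. The wrapper `runExample` and the constant `42` (standing in for `f()`) are assumptions added for illustration; the block itself follows the example above.

```javascript
// Sketch: the example above, wrapped so the outcome can be observed.
function runExample() {
  try {
    {
      initialize();            // runs before `let x` has been evaluated
      let x;
      function initialize() {
        x = 42;                // assignment to a binding still in its dead zone
      }
    }
    return "no error";
  } catch (e) {
    return e.name;
  }
}

const outcome = runExample();
console.log(outcome); // "ReferenceError" in ES2015-conforming engines
```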

Andreas, can you explain why you dismiss TDZ-RBA-UNDEF as a viable option? The bug that motivates all the arguments you've made is read-before-initialization, not write-before-initialization.

Dave

[1] esdiscuss/2012-December/027507

# Brendan Eich (13 years ago)

David Herman wrote:

Andreas, can you explain why you dismiss TDZ-RBA-UNDEF as a viable option? The bug that motivates all the arguments you've made is read-before-initialization, not write-before-initialization.

Since you mailed the list I will jump in before Andreas answers. TC39 considered TDZ-RBA-UNDEF in several July meetings in adjacent years. Quoting from my followup to the July 2011 meeting notes at esdiscuss/2011-August/016188.html:

178 nor Normal All allen at wirfs-brock.com CONF --- Must settle scoping details for block-scoped bindings

Much discussion here. The issue is whether let and const bindings hoist to block top, or start a new implicit scope (the let* or, let's call it, C++ rule). The prior work was nicely diagrammed by Waldemar in:

esdiscuss/2008-October/007807

Quoting from Waldemar's message (note the future-proofing for guards):

--- begin quote ---

There are four ways to do this: A1. Lexical dead zone. References textually prior to a definition in the same block are an error. A2. Lexical window. References textually prior to a definition in the same block go to outer scope. B1. Temporal dead zone. References temporally prior to a definition in the same block are an error. B2. Temporal window. References temporally prior to a definition in the same block go to outer scope.

Let's take a look at an example:

let x = "outer";
function g() {return "outer"}

{
    g();
    function f() { ... x ... g ... g() ... }
    f();
    var t = some_runtime_type;
    const x:t = "inner";
    function g() { ... x ... }
    g();
    f();
}

B2 is bad because then the x inside g would sometimes refer to "outer" and sometimes to "inner".

A1 and A2 introduce extra complexity but don't solve the problem. You'd need to come up with a value for x to use in the very first call to g(). Furthermore, for A2 whether the window occurred or not would also depend on whether something was a function or not; users would be surprised that x shows through the window inside f but g doesn't.

That leaves B1, which matches the semantic model (we need to avoid referencing variables before we know their types and before we know the values of constants).

--- end quote ---

In the September 2010 meeting, however, we took a wrong turn (my fault for suggesting it, but in my defense, just about everyone did prefer it -- we all dislike hoisting!) away from hoisted let and const bindings, seemingly achieving consensus for the C++ rule.

Allen, it turned out, did not agree, and he was right. Mixing non-hoisting (the C++ rule) with hoisting (function in block must hoist, for mutual recursion "letrec" use-cases and to match how function declarations at body/program level hoist) does not work. In the example above, g's use of x either refers to an outer x for the first call to g() in the block, but not the second in the block (and various for the indirect call via f()) -- dynamic scope! -- or else the uses before |const x|'s C++-style implicit scope has opened must be errors (early or not), which is indistinguishable from hoisting.

So at last week's meeting, we finally agreed to the earlier rules: all block-scoped bindings hoist to top of block, with a temporal dead zone for use of let and const before initialization.

The initialization point is also important. Some folks wondered if we could not preserve var's relative simplicity: var x = 42; is really var x; x = 42, and then the var hoists (this makes for insanity within 'with', which recurs with 'let' in block vs. 'var' of same name in inner block -- IIRC we agreed to make such vars that hoist past same-named let bindings be early errors).

With var, the initialization is just an assignment expression. A name use before that assignment expression has been evaluated results in the default undefined value of the var, assuming it was fresh. There is no read and write barrier requirement, as there is (in general, due to closures) for the temporal dead zone semantics.
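A small sketch of the var behavior described here, runnable in any JavaScript engine. The function names `readBeforeAssign` and `writeBeforeDeclaration` are made up for illustration.

```javascript
// `var x = 42` behaves like a hoisted `var x` plus an ordinary assignment `x = 42`.
function readBeforeAssign() {
  const before = x;   // the hoisted var is readable and holds undefined
  var x = 42;
  return [before, x];
}

// A `var` without an initializer does not reset a value assigned earlier.
function writeBeforeDeclaration() {
  x = 12;             // assigns to the hoisted local, not to a global
  var x;
  return x;
}

console.log(readBeforeAssign());       // [ undefined, 42 ]
console.log(writeBeforeDeclaration()); // 12
```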

But if we try to treat let like var, then let and const diverge. We cannot treat const like var and allow any assignment as "initialization", and we must forbid assignments to const bindings -- only the mandatory initializer in the declaration can initialize. Trying to allow the "first assignment to a hoisted const" to win quickly leads to two or more values for a single const binding:

{ x = 12; if (y) return x; const x = 3; ... }

The situation with let is constrained even ignoring const. Suppose we treat let like var, but hoisted to block top instead of body/program top, with use before set reading undefined, or in an alternative model that differs from var per temporal dead zone, throwing. So:

{ print(x); x = 12; let x; }

would result in either print being called with undefined or an error on the use of x before it was set by the assignment expression-statement -- those are the two choices given hoisting.

But then:

{ x = 12; print(x); let x; }

would result in either 12 being printed or an error being thrown assigning to x before its declaration was evaluated.

Any mixture of error with non-error (printing undefined or 12) is inconsistent. One could defend throwing in the use-before-assignment case, but it's odd. And throwing in both cases is the earlier consensus semantics of temporal dead zone with a distinct state for lack of initialization (even if the initialization is implicit, e.g., in a declaration such as let x; being evaluated). Here "initialization" is distinguished from assignment expressions targeting the binding.

Trying to be like var, printing undefined or 12, is possible but future-hostile to guards and gratuitously different from const:

{ x = 12; const G = ...; let x ::G = "hi"; }

We want to be future-proof for guards, and even more important: we want to support refactoring from let to const. Ergo, only temporal dead zone with its barriers is tenable.

There remains an open issue: without closures obscuring analysis, it is easy to declare use before initialization within the direct expression-statement children of a given block to be early errors, rather than runtime errors:

{
    x = 12;          // can be early error
    print(x);        // can be early error
    function f() {
        return x;    // may or may not be error
    }
    escape(f);       // did this call f?
    let x = 42;
    escape2(f);      // did this call f?
}

Some on TC39 favor normative specification of early errors for the easily-decided cases. Others want runtime-only error checking all around and point out that, even in the easy cases (straight-line code in the block's direct expression-statement children), any test that reaches the block will fail fast. The question remains: what if the block is not covered by tests?
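The closure case can be made concrete with a sketch that runs in an engine implementing `let` as later standardized in ES2015. The helper `probe` and its `callEarly` flag are hypothetical stand-ins for the unknown behavior of `escape(f)`: whether `f` fails depends on when it is called, not on where it is written.

```javascript
// Sketch: the same closure either throws or succeeds depending on call time.
function probe(callEarly) {
  try {
    {
      function f() {
        return x;            // legal or not depending on when f runs
      }
      if (callEarly) f();    // stands in for "did escape(f) call f?"
      let x = 42;
      return f();            // after initialization, f reads 42
    }
  } catch (e) {
    return e.name;
  }
}

console.log(probe(true));  // "ReferenceError"
console.log(probe(false)); // 42
```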

Dave Herman brought up the let/var at top level equivalence implemented in SpiderMonkey, specifically in connection with <script> tags. Sketching in pseudo-HTML:

<script type=harmony>
    alert = 12; // reassign built-in alert
</script>

<script type=harmony>
    let alert = 13; // shadow built-in alert
    var quux = 14;  // this.quux = 14
    let quux = 15;  // alternative: in scope for later scripts?
</script>

<script>
    alert(quux);
</script>

Dave's point was not to commend the SpiderMonkey equating of let and var at top level, but to observe that if "let is the new var", then depending on how multiple successive script elements' contents are scoped, you may still need to use var in Harmony -- let won't be enough, if it binds only within the containing <script> element's scope.

Recall that Harmony removes the global (window in browsers) object from the scope chain, replacing it with a lexical environment with (generally) writable bindings. Each script starts with a fresh lexical environment, although it might be nested (see next paragraph).

For scripts that do not opt into Harmony, there's no issue. The global object is on the scope chain and it is used serially by successive script elements.

The question for Harmony scripts boils down to: should successive Harmony scripts nest lexical scopes in prior scripts' scopes, like matryoshka dolls? Or should each script opted into Harmony be its own module-like scope, in which case to propagate bindings to later scripts, one would have to

<script type=harmony>
    export let quux = 14; // available here and in later scripts
</script>

This remains an open question in TC39. Some liked the explicit 'export' requirement, the implicit module scope. Others objected that migrating code would expect the nested semantics, which was not inherently evil or unsafe.

--- end of block scope discussion ---

# Brendan Eich (13 years ago)

Argh, why must mailman archive + copy/paste result in unreadably long lines. Here's the citation again (from esdiscuss/2011-August/016188):

178 nor Normal All allen at wirfs-brock.com CONF --- Must settle scoping details for block-scoped bindings

Much discussion here. The issue is whether let and const bindings hoist to block top, or start a new implicit scope (the let* or, let's call it, C++ rule). The prior work was nicely diagrammed by Waldemar in:

esdiscuss/2008-October/007807

Quoting from Waldemar's message (note the future-proofing for guards):

--- begin quote ---

There are four ways to do this: A1. Lexical dead zone. References textually prior to a definition in the same block are an error. A2. Lexical window. References textually prior to a definition in the same block go to outer scope. B1. Temporal dead zone. References temporally prior to a definition in the same block are an error. B2. Temporal window. References temporally prior to a definition in the same block go to outer scope.

Let's take a look at an example:

let x = "outer";
function g() {return "outer"}

{
    g();
    function f() { ... x ... g ... g() ... }
    f();
    var t = some_runtime_type;
    const x:t = "inner";
    function g() { ... x ... }
    g();
    f();
}

B2 is bad because then the x inside g would sometimes refer to "outer" and sometimes to "inner".

A1 and A2 introduce extra complexity but don't solve the problem. You'd need to come up with a value for x to use in the very first call to g(). Furthermore, for A2 whether the window occurred or not would also depend on whether something was a function or not; users would be surprised that x shows through the window inside f but g doesn't.

That leaves B1, which matches the semantic model (we need to avoid referencing variables before we know their types and before we know the values of constants).

--- end quote ---

In the September 2010 meeting, however, we took a wrong turn (my fault for suggesting it, but in my defense, just about everyone did prefer it -- we all dislike hoisting!) away from hoisted let and const bindings, seemingly achieving consensus for the C++ rule.

Allen, it turned out, did not agree, and he was right. Mixing non-hoisting (the C++ rule) with hoisting (function in block must hoist, for mutual recursion "letrec" use-cases and to match how function declarations at body/program level hoist) does not work. In the example above, g's use of x either refers to an outer x for the first call to g() in the block, but not the second in the block (and various for the indirect call via f()) -- dynamic scope! -- or else the uses before |const x|'s C++-style implicit scope has opened must be errors (early or not), which is indistinguishable from hoisting.

So at last week's meeting, we finally agreed to the earlier rules: all block-scoped bindings hoist to top of block, with a temporal dead zone for use of let and const before initialization.

The initialization point is also important. Some folks wondered if we could not preserve var's relative simplicity: var x = 42; is really var x; x = 42, and then the var hoists (this makes for insanity within 'with', which recurs with 'let' in block vs. 'var' of same name in inner block -- IIRC we agreed to make such vars that hoist past same-named let bindings be early errors).

With var, the initialization is just an assignment expression. A name use before that assignment expression has been evaluated results in the default undefined value of the var, assuming it was fresh. There is no read and write barrier requirement, as there is (in general, due to closures) for the temporal dead zone semantics.

But if we try to treat let like var, then let and const diverge. We cannot treat const like var and allow any assignment as "initialization", and we must forbid assignments to const bindings

# Brendan Eich (13 years ago)

Just to be super-clear, since citing in mail (compose and read-time) is hard: everything after "Here's the citation again (...):" is cited from the July 2011 meeting notes followup post I made in August 2011.

# David Herman (13 years ago)

On Dec 28, 2012, at 12:11 PM, Brendan Eich <brendan at mozilla.com> wrote:

David Herman wrote:

Andreas, can you explain why you dismiss TDZ-RBA-UNDEF as a viable option? The bug that motivates all the arguments you've made is read-before-initialization, not write-before-initialization.

Since you mailed the list I will jump in before Andreas answers. TC39 considered TDZ-RBA-UNDEF in several July meetings in adjacent years.

Heh, thanks for providing historical context. I think there are some flaws in the reasoning from those past meetings. I'll reply to some points from the cited minutes inline:

--- begin quote ---

But if we try to treat let like var, then let and const diverge.

I do think some divergence is okay, but they actually don't have to diverge that much.

We cannot treat const like var and allow any assignment as "initialization", and we must forbid assignments to const bindings -- only the mandatory initializer in the declaration can initialize. Trying to allow the "first assignment to a hoisted const" to win quickly leads to two or more values for a single const binding:

{ x = 12; if (y) return x; const x = 3; ... }

This is of course a silly semantics, but I argue that const should have a syntactic restriction that it can only be assigned to in its initializer -- the above should be a syntax error. Once you have that restriction, there's no observable difference between the read barrier and the read-write barrier.

The situation with let is constrained even ignoring const. Suppose we treat let like var, but hoisted to block top instead of body/program top, with use before set reading undefined, or in an alternative model that differs from var per temporal dead zone, throwing. So:

{ print(x); x = 12; let x; }

would result in either print being called with undefined or an error on the use of x before it was set by the assignment expression-statement -- those are the two choices given hoisting.

But then:

{ x = 12; print(x); let x; }

would result in either 12 being printed or an error being thrown assigning to x before its declaration was evaluated.

Any mixture of error with non-error (printing undefined or 12) is inconsistent.

This is just bogus reasoning. What consistency could we be talking about?

One could defend throwing in the use-before-assignment case, but it's odd.

What's odd about throwing in use before assignment?

And throwing in both cases is the earlier consensus semantics of temporal dead zone with a distinct state for lack of initialization (even if the initialization is implicit, e.g., in a declaration such as let x; being evaluated). Here "initialization" is distinguished from assignment expressions targeting the binding.

This is just circular: "this should be the semantics because we agreed it should be the semantics."

Trying to be like var, printing undefined or 12, is possible but future-hostile to guards and gratuitously different from const:

{ x = 12; const G = ...; let x ::G = "hi"; }

This is where I think we really made a mistake. There's nothing preventing us from having additional restrictions for guarded declarations -- cross the bridge when we come to it.

We want to be future-proof for guards, and even more important: we want to support refactoring from let to const. Ergo, only temporal dead zone with its barriers is tenable.

I disagree. If you refactor from let to const it's because you only want a single assignment. It's nonsensical to refactor:

{
    let x;
    x = 1;
    x = 2;
}

to:

{
    const x;
    x = 1;
    x = 2;
}

That should just be a syntax error. Const has more restrictions. You have to play by its rules.

So I disagree with that whole line of reasoning. But I'd like to hear Andreas's reasoning why he feels there should be a write barrier.

# Brendan Eich (13 years ago)

David Herman wrote:

On Dec 28, 2012, at 12:11 PM, Brendan Eich <brendan at mozilla.com> wrote:

David Herman wrote:

Andreas, can you explain why you dismiss TDZ-RBA-UNDEF as a viable option? The bug that motivates all the arguments you've made is read-before-initialization, not write-before-initialization.

Since you mailed the list I will jump in before Andreas answers. TC39 considered TDZ-RBA-UNDEF in several July meetings in adjacent years.

Heh, thanks for providing historical context. I think there are some flaws in the reasoning from those past meetings. I'll reply to some points from the cited minutes inline:

--- begin quote ---

But if we try to treat let like var, then let and const diverge.

I do think some divergence is okay, but they actually don't have to diverge that much.

The argument (one of them) is about how much: UBI vs. RBA-UNDEF. For 'const', UBI must be an error. We had consensus in July 2011 (re-checked in July 2012 if my memory serves) that 'let' and 'const' should both be subject to the same rule, modulo 'let x;' implicitly initializing with undefined.

We could diverge let as you propose, via RBA-UNDEF, but that is a greater divergence than we had in the previous consensus or quasi-consensus.

We cannot treat const like var and allow any assignment as "initialization", and we must forbid assignments to const bindings -- only the mandatory initializer in the declaration can initialize. Trying to allow the "first assignment to a hoisted const" to win quickly leads to two or more values for a single const binding:

{ x = 12; if (y) return x; const x = 3; ... }

This is of course a silly semantics, but I argue that const should have a syntactic restriction that it can only be assigned to in its initializer -- the above should be a syntax error. Once you have that restriction, there's no observable difference between the read barrier and the read-write barrier.

Yup.

The situation with let is constrained even ignoring const. Suppose we treat let like var, but hoisted to block top instead of body/program top, with use before set reading undefined, or in an alternative model that differs from var per temporal dead zone, throwing. So:

{ print(x); x = 12; let x; }

would result in either print being called with undefined or an error on the use of x before it was set by the assignment expression-statement -- those are the two choices given hoisting.

But then:

{ x = 12; print(x); let x; }

would result in either 12 being printed or an error being thrown assigning to x before its declaration was evaluated.

Any mixture of error with non-error (printing undefined or 12) is inconsistent.

This is just bogus reasoning. What consistency could we be talking about?

That both of the above should be errors to be "more consistent" in handling likely-buggy inputs. This was implicit, you're right, but I recall Sam at the whiteboard writing some of these, and the group assenting.

I'm not recounting to say we are bound by whatever happened then, rather trying to recollect exactly what consistency was being talked about there.

One could defend throwing in the use-before-assignment case, but it's odd.

What's odd about throwing in use before assignment?

I'm trying to remember. First, that "use-before-assignment case" must have been the x in print(x) here:

{ print(x); x = 12; let x; }

I remember Sam permuting the statements to get the second case, which under RBA-UNDEF would not throw, and that may be what seemed "odd".

The group as a whole seemed to want errors for both of these examples. That maximizes a kind of error-catching-over-unusual-order consistency.

And throwing in both cases is the earlier consensus semantics of temporal dead zone with a distinct state for lack of initialization (even if the initialization is implicit, e.g., in a declaration such as let x; being evaluated). Here "initialization" is distinguished from assignment expressions targeting the binding.

This is just circular: "this should be the semantics because we agreed it should be the semantics."

This was just an observation that throwing in both examples is the earlier consensus semantics for TDZ and 'let'.

Trying to be like var, printing undefined or 12, is possible but future-hostile to guards and gratuitously different from const:

{ x = 12; const G = ...; let x ::G = "hi"; }

This is where I think we really made a mistake. There's nothing preventing us from having additional restrictions for guarded declarations -- cross the bridge when we come to it.

Sure, and I agree, but if you delete guards the arguments for maximal error-catching consistency and refactoring from let to const (more below) still stand.

We want to be future-proof for guards, and even more important: we want to support refactoring from let to const. Ergo, only temporal dead zone with its barriers is tenable.

I disagree. If you refactor from let to const it's because you only want a single assignment. It's nonsensical to refactor:

 {
     let x;
     x = 1;
     x = 2;
 }

to:

 {
     const x;
     x = 1;
     x = 2;
 }

That should just be a syntax error. Const has more restrictions. You have to play by its rules.

That's not the refactoring to consider. The hard case for refactoring from TDZ-RBA-UNDEF 'let' to any 'const' we can agree on is this:

{ ... let x; ... x = computed(); ... }

refactored to

{ ... const x = computed(); ... }

The three meta-ellipses in the 'let' version may interact with effects in computed(), so refactoring to the two ...s in the 'const' version is not always easy. The "refactoring from let to const" argument is about not allowing any such case with 'let' to crop up. That's all.

This may seem like small beans but it's a design choice that, all else equal, avoids a problem. Recall Crock on arguments about "oh, that's not a problem (much)...". Such little problems are still problems. Why not design this one out of ES6?

So I disagree with that whole line of reasoning.

Curious how you feel after my reply here.

But I'd like to hear Andreas's reasoning why he feels there should be a write barrier.

Me too -- cc'ing him in case he's not keeping up with es-discuss.

# Andreas Rossberg (13 years ago)

On 28 December 2012 20:30, David Herman <dherman at mozilla.com> wrote:

Andreas, can you explain why you dismiss TDZ-RBA-UNDEF as a viable option? The bug that motivates all the arguments you've made is read-before-initialization, not write-before-initialization.

I agree that that would be a less error-prone semantics, but other arguments still apply. IMO it's inferior to "TDZ-UBI-UNDEF" (the current draft semantics) for three reasons:

  1. Complexity/consistency
  2. Readability
  3. Future-proofness

Regarding (1), consider how to formulate the rules for "unhoisted" bindings. Informally, "TDZ-UBI-UNDEF" says:

  • Accessing a variable before its declaration has been executed is an error. Furthermore, "let x" is shorthand for "let x = undefined".

The corresponding text for "TDZ-RBA-UNDEF":

  • For immutable bindings, accessing a variable before its declaration has been executed is an error. For mutable bindings, read-accessing a variable before an assignment to it has been executed is an error. Furthermore, "let x = e" is shorthand for "let x; x = e". A let-declaration without a r.h.s. is a conditional assignment of "undefined" that is performed if and only if no other assignment to the declared variable has been performed before.

This is clearly less consistent and more complicated, for two reasons. First, the definition has to be different for mutable and immutable bindings (there is no such thing as an "assignment" to an immutable binding). But even ignoring immutable bindings altogether, the semantics for mutable ones alone are more complicated because of the runtime case distinction you need to make for the conditional initialization.

Regarding (2), let me repeat my mantra: reading is 10x more important than writing. Now consider reading a piece of code like this:

{
    // lots of stuff
    let x;
    // more stuff
    print(x);
}

With the current rule, all you need to read to understand what is printed for 'x' is the code between its declaration and its use. Any code before the declaration cannot possibly matter, which arguably is what one would expect intuitively (and what's the case in every other comparable language with proper lexical scoping). Not so with the TDZ-RBA-UNDEF rule, where understanding whether the variable is assigned, and how, generally requires reading all code in that block up to its use.

Regarding (3), that has been argued before often enough, so I won't repeat it here. Just let me note that future-proofness is not about crossing a bridge early, as you seemed to suggest elsewhere, it's about making sure that you haven't already burnt that bridge once you get there.