Nullablity

# Shijun He (19 years ago)

All classes should be non-nullable! If somebody wants an option type, it should be declared explicitly (var a:MyClass?). Making things nullable by default and making T non-nullable via "T!" is a big mistake!

Why do we introduce nullability? I believe it is because nullability makes types finer and safer. We all know that many JSers don't like static types, though they may use Java/C# on the server side. So it's clear that if somebody wants to use static types in ES4, he or she probably wants a stricter and more precise type system. For them, it is more common that it makes no sense for a type to include the null value. After I learned about nullability from Nice (nice.sourceforge.net), I found it's rare to write a nullable type. I use nullable types mostly in method declarations, where a parameter with a null value has a special meaning. Then I'd like to write function f(v:T?) to indicate such special usages.

Some reasons are given to make nullable default, but:

  1. ES4 is still in progress, isn't it? So compatibility with existing implementations such as AS3 should not be the reason!! Never!

  2. It is said to be "the way of the world". NO! That totally confuses cause and effect. "The way of the world" is not nullable-by-default, but the absence of any nullability mechanism in mainstream OO languages such as Java/C#/C++ (until C# 2.0 introduced nullability with a generic type). So it's not what users expected, but what they were forced into. If we introduce nullability, we must do the right thing.

  3. I don't think it's too draconian: if I want static types, that means I want a strict type system. Code should be as clear as possible about whether something is nullable or not! The bad example is C# 2.0, which separates types into reference types and value types, so users have to recall whether a type is a ref type or a value type. C# 2.0 had no choice because it must be compatible with C# 1.0, but ES4 has no such burden.

There are many issues in the current proposal:

  1. It makes nullability useless. Most developers do not know the benefits and usage of nullability, so they will ignore it entirely. At least in the early days...

  2. If everything is non-nullable, the compiler or verifier can force users to write null-checking branches when they use a nullable type ( var a:T? ... if (a==null) {...} else {...} ) to eliminate NullPointerException. That's reasonable because the users wrote "T?" and know it's nullable. But it's hard to force developers to write such code if types are nullable by default, especially because of issue 1: there will be more nullable classes than there should be, and everyone will ask why they must write such boring code. Things get worse: should I write a null check for Complex? It depends on the author of Complex! It is said "there are very few special cases to remember", but what about libraries?

  3. It's inconsistent, not only for Number and Boolean, but also for user-defined classes such as Complex. To make Complex seem like a value type, it is suggested to use class Complex! {...}. But that means authors are burdened with deciding whether the class should be nullable. I can imagine most libraries will be full of nullable classes even where non-nullable is more proper. OK, they will change class Complex to class Complex! in the next release; as a result, all client code must be rewritten (change a:Complex to a:Complex? or review all code to ensure a:Complex is OK). Of course we all agree Complex should be non-nullable, but what about Date? Should Date be nullable or non-nullable? C# treats DateTime as a value type and non-nullable. Such things (deciding whether it is nullable and remembering the decision) will drive library developers and users crazy.

What I expect is straightforward. When I say a:T, I mean non-nullable, when I say a:T?, I mean nullable. When I see a:T, I know it's non-nullable, when I see a:T?, I know it's nullable. As a library developer, I can be free from the metaphysical issues (should this or that be nullable?). As a user, I have no need to guess the default nullability, no need to search documentation, and no need to worry that nullability will change in the future. And there is no need to use the ugly "T!". a:T!=x and a:(t!=x) and a::t!=x are too confusing.
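For readers who want to see the "a:T means non-nullable, a:T? means nullable" convention in running code, here is a minimal sketch in TypeScript, which is not ES4. It assumes the `strictNullChecks` compiler option is on, and it uses `Complex | null` as a stand-in for the thread's `Complex?`. The function names are invented for illustration.

```typescript
// Assumption: compiled with `strictNullChecks`; without it the annotations
// below are not enforced by the checker.
class Complex {
  constructor(readonly re: number, readonly im: number) {}
}

// `Complex` alone excludes null, so the body may dereference freely.
function magnitude(z: Complex): number {
  return Math.sqrt(z.re * z.re + z.im * z.im);
}

// `Complex | null` plays the role of `Complex?`: the checker rejects `z.re`
// until null has been ruled out, so the author must write both branches.
function magnitudeOrZero(z: Complex | null): number {
  if (z === null) {
    return 0;
  } else {
    return magnitude(z); // z is narrowed to Complex here
  }
}
```

The point of the sketch is only that the null check is demanded where the signature says "nullable" and nowhere else.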

# Burak Emir (19 years ago)

Shijun He wrote:

All classes should be non-nullable! If somebody wants an option type, it should be declared explicitly (var a:MyClass?). Making things nullable by default and making T non-nullable via "T!" is a big mistake!

I agree with this point of view, because it makes the type system simpler and removes the need for remembering implicit assumptions (type X is nullable by default, type Y is not).

It seems acceptable to explicitly mark all types that contain the null value, and to have a standard way to strip the '?' off them. If you look at languages like OCaml, Haskell and Scala, you find a type like Option[C] that has to be pattern matched against. Suppose that type Foo has a method bar; one can't call bar on Option[Foo]. It's rather:

    def doit(x:Option[Foo]) = x match { // Scala syntax
      case Some(y) => y.bar
      case None => ...
    }
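The same shape can be sketched in TypeScript with a tagged union. This is only an illustration: the type is called `Maybe` here, the `Foo` interface is hypothetical, and the string returned in the "none" case is a placeholder for the elided `...` above.

```typescript
// A hand-rolled option type: a value is either present ("some") or absent ("none").
type Maybe<T> = { kind: "some"; value: T } | { kind: "none" };

// Hypothetical stand-in for the thread's type Foo with a method bar.
interface Foo {
  bar(): string;
}

function doit(x: Maybe<Foo>): string {
  if (x.kind === "some") {
    // Only in this branch does `x.value` exist, so `bar` can be called.
    return x.value.bar();
  }
  // Placeholder result for the absent case; the original leaves this open.
  return "nothing";
}
```

As in the Scala version, there is no way to call `bar` without first inspecting which case is present.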

My understanding of the Nullability proposal is that for purposes of calling methods, T? and T are "the same", the former at risk of throwing a NullPointerException and the latter not. Still, there is a piece missing, as one might need to convert T? to T in order to satisfy interfaces:

    fun foo(x:T) { ... }

    fun baz(x:T?) {
      foo(x) // no, no, incompatible type
    }

It seems to me that rather than messing with "if(x==null) " or pattern matching or whatnot, the cleanest way to achieve the above is to use the "to" operator described in the type proposal

fun baz(x:T?) { foo(x as T) } // throws NullPointerException at runtime if x happens to be null
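A checked conversion of this kind can be sketched in TypeScript as a small helper. The helper name `asNonNull` is invented here, and throwing a `TypeError` is an assumption standing in for the NullPointerException mentioned above.

```typescript
// Converts `T | null` to `T`, failing at run time if the value is null.
function asNonNull<T>(x: T | null): T {
  if (x === null) {
    throw new TypeError("expected a non-null value");
  }
  return x;
}

// Requires a non-nullable argument.
function foo(x: string): number {
  return x.length;
}

// Accepts a nullable argument and converts it before calling foo.
function baz(x: string | null): number {
  return foo(asNonNull(x));
}
```

The static type of `baz`'s argument still advertises that null is possible; the conversion just moves the failure to one well-defined point.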

# John Cowan (19 years ago)

Shijun He scripsit:

What I expect is straightforward. When I say a:T, I mean non-nullable, when I say a:T?, I mean nullable. When I see a:T, I know it's non-nullable, when I see a:T?, I know it's nullable.

+1

# Brendan Eich (19 years ago)

I'm sympathetic to this "don't make implicit option types by making
class types nullable" argument -- I made it in TG1 meetings a while
ago. But before that, waldemar's drafts and the derived JScript.NET
and ActionScript languages made class types nullable by fiat. This
was not just a bad precedent. Nullability for class type has two
independent arguments:

  1. Mindshare from Java, C#, and other languages that include null
    among the values of reference types.

  2. The difficulty of initializing variables of non-nullable type
    with a sound default value.

It would be good to hear new counter-arguments.

# John Cowan (19 years ago)

Brendan Eich scripsit:

  1. Mindshare from Java, C#, and other languages that include null
    among the values of reference types.

This argument is strong but not overriding.

  2. The difficulty of initializing variables of non-nullable type
    with a sound default value.

At the syntax level this can be solved by not allowing such variable declarations. At the semantic level it's much deeper, and may justify some language support for the Null Object pattern. Statically typed functional languages typically have a distinct variety of null for each nullable type, and I think rightly so.
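For reference, the Null Object pattern mentioned here can be sketched without any language support. The example below is in TypeScript and every name in it (`Shape`, `Circle`, `NO_SHAPE`, `largest`) is made up for illustration; it shows one "distinct null" for one type.

```typescript
interface Shape {
  area(): number;
  label(): string;
}

class Circle implements Shape {
  constructor(private radius: number) {}
  area(): number {
    return Math.PI * this.radius * this.radius;
  }
  label(): string {
    return "circle";
  }
}

// The null object: a real Shape whose methods do something harmless.
const NO_SHAPE: Shape = {
  area: () => 0,
  label: () => "none",
};

// Always returns a usable Shape, so callers never need a null check.
function largest(shapes: Shape[]): Shape {
  let best = NO_SHAPE;
  for (const s of shapes) {
    if (s.area() > best.area()) {
      best = s;
    }
  }
  return best;
}
```

A variable of type `Shape` can then be initialized to `NO_SHAPE`, which gives it a sound default without admitting null into the type.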

# Brendan Eich (19 years ago)

On Jun 21, 2006, at 12:27 PM, John Cowan wrote:

Brendan Eich scripsit:

  1. Mindshare from Java, C#, and other languages that include null among the values of reference types.

This argument is strong but not overriding.

Agreed.

  2. The difficulty of initializing variables of non-nullable type with a sound default value.

At the syntax level this can be solved by not allowing such variable declarations. At the semantic level it's much deeper, and may justify some language support for the Null Object pattern. Statically typed functional languages typically have a distinct variety of null for each nullable type, and I think rightly so.

Agreed, but ES4 is not a statically typed language by default. The
type checker used in the strict mode is optional to programmers as
well as to implementors; we're writing a normative specification for
it to ensure interoperation.

# Burak Emir (19 years ago)

I guess this one was meant to go to the list as well

# Burak Emir (19 years ago)

Brendan Eich wrote:

I'm sympathetic to this "don't make implicit option types by making
class types nullable" argument -- I made it in TG1 meetings a while
ago. But before that, waldemar's drafts and the derived JScript.NET
and ActionScript languages made class types nullable by fiat. This
was not just a bad precedent. Nullability for class type has two
independent arguments:

  1. Mindshare from Java, C#, and other languages that include null
    among the values of reference types.

They (erm, we) can use annotations T? and get used to it.

  2. The difficulty of initializing variables of non-nullable type
    with a sound default value.

It would be good to hear new counter-arguments.

ok, I saw that discussion page, I see it's not an easy decision. It's also not easy to present hard arguments.

Having to write "proper" initialization for non-nullable types is a price that I am willing to pay. When trapped in my old Java/C# ways, I vow to write all those additional '?' reminding myself that there might be a null lurking somewhere.

Isn't it really about shifting the syntactic burden? The discussion on default values is interesting, but does not seem too decisive in the question of having (T? and T) vs having (T and T!)

my 2c,

# Nicolas Cannasse (19 years ago)

Isn't it really about shifting the syntactic burden? The discussion on default values is interesting, but does not seem too decisive in the question of having (T? and T) vs having (T and T!)

What is important is "what is the default", because of course most users will go for the default. It makes a lot of sense to have non-nullable values by default, and it helps a lot when developing and documenting libraries. It's of course a big plus for security, as has been discussed before.

As for values initialization, it should be enforced by the compiler. This can easily be done for member and local variables by using some recursive flow algorithm.

For static variables, I'm not fond of the runtime error and I would prefer to force users to initialize their statics when declaring them, or else use a nullable type.

# Graydon Hoare (19 years ago)

Nicolas Cannasse wrote:

As for values initialization, it should be enforced by the compiler. This can easily be done for member and local variables by using some recursive flow algorithm.

I know you're coming from the background of compiling large, highly structured programs, and I respect that perspective. I personally agree with what you've said here, in the context of a language for my programming needs.

But I've also heard a credible argument during the design process: that the majority of JS code on the web is tiny little fragments, one liners, often fragments that only run zero or one times, and sometimes running on a cell phone. Keeping the "defaults" easy to compile is a priority for at least some of the language implementers. This means dataflow algorithms are unappealing: local expression-type inference is about all we want to pay for. We do not want to reimplement Java's definite assignment rules or similar.

We've talked through many alternatives, and I agree that the topic is not yet exhausted. Some syntactically distinct form that covers for the same task is possible. We've discussed C++-style initializer syntax for example, or ML-style construction-from-a-set-of-expressions. I'm sure there are other potential options, but we must keep the tradeoffs in mind.

# Nicolas Cannasse (19 years ago)

But I've also heard a credible argument during the design process: that the majority of JS code on the web is tiny little fragments, one liners, often fragments that only run zero or one times, and sometimes running on a cell phone. Keeping the "defaults" easy to compile is a priority for at least some of the language implementers. This means dataflow algorithms are unappealing: local expression-type inference is about all we want to pay for. We do not want to reimplement Java's definite assignment rules or similar.

Definite assignment would surely be the best way to ensure that a "T" (or "T!") variable is correctly assigned. The two other ways to ensure type safety of these variables are:

a) forbid "var" declarations without an initialization expression. This is a very strict form of definite assignment, since no delay at all is permitted

b) have a runtime error. That's not what I would call "type safety", since it's just a kind of Null Pointer Exception a bit before the pointer is actually accessed.

From what I've seen, ES4 will not target users who just want to open a popup on their website or do some basic form validation. But when you start using the type system, you should be able to rely on it.

Since you're not going the type-inference way, accessing a strictly-typed API from untyped code should not cause any problem, unless the user passes an explicit "null" value where a non-nullable instance is required.

Or maybe I forgot something ?

# P T Withington (19 years ago)

From: Brendan Eich <brendan at mozilla.org> Date: 21 June 2006 12:10:58 EDT

I'm sympathetic to this "don't make implicit option types by making
class types nullable" argument -- I made it in TG1 meetings a while
ago. But before that, waldemar's drafts and the derived
JScript.NET and ActionScript languages made class types nullable by
fiat. This was not just a bad precedent. Nullability for class
type has two independent arguments:

  1. Mindshare from Java, C#, and other languages that include null
    among the values of reference types.

Strongly agree with Shijun He that this is a canard. These languages
include null in their reference types not because the programmer
needs it but because the language designer/compiler writer was lazy.
If we are trying to move the language forward, we should take this
chance to bite the bullet and correct this error.

[Anecdote: Curl initially permitted null in its reference types, but
under pressure from me (from my experience on the Harlequin Dylan
team) changed to a scheme of having to explicitly declare when a type
was nullable. Curl had a large body of existing code where overnight
declarations of T went from meaning 'T or null' to strictly 'T'.
Very little of the code had to be changed to use the new 'T?'
declaration, and in nearly every case where the new declaration was
required it was realized that there was previously a potential
runtime error due to not checking for null.]

  2. The difficulty of initializing variables of non-nullable type
    with a sound default value.

It would be good to hear new counter-arguments.

When it's too difficult, you use T? and don't initialize it. You can
also permit:

var t:T; ... t = new T;

so long as t is not used in .... In compiled code, the compiler
may be able to omit any runtime check if it can prove that through
flow analysis, otherwise it needs to runtime check usages of t.
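The runtime-check fallback described here can be sketched as follows. This is an illustration in TypeScript, not anything from the ES4 drafts: `CheckedSlot` is an invented class that models a declared-but-unassigned variable, and the choice of `ReferenceError` is an assumption.

```typescript
// Models `var t:T;` followed later by `t = new T;`.
// Reads before the first write fail loudly instead of yielding null.
class CheckedSlot<T> {
  private filled = false;
  private value: T | undefined = undefined;

  set(v: T): void {
    this.value = v;
    this.filled = true;
  }

  get(): T {
    // This is the check a compiler could omit if flow analysis proved
    // that every read is preceded by a write.
    if (!this.filled) {
      throw new ReferenceError("variable read before initialization");
    }
    return this.value as T;
  }
}
```

A compiler that can prove the write dominates every read would emit a plain variable; one that cannot would emit something with the cost of the `filled` test.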

I strongly support having types not include null and requiring
explicit declaration of nullability. [Dylan and Curl are two
examples of optionally-typed dynamic languages where such a scheme
has been successfully implemented.]

# Brendan Eich (19 years ago)

Remember, I am in favor of nullability being explicit, "opt in". I
also don't think ActionScript 3 or JScript.NET compatibility ties
ECMA TG1's hands, so let's get that out of the way. Adobe and MS
people speak up if you feel differently.

On Jun 22, 2006, at 5:36 PM, P T Withington wrote:

[Anecdote: Curl initially permitted null in its reference types,
but under pressure from me (from my experience on the Harlequin
Dylan team) changed to a scheme of having to explicitly declare
when a type was nullable. Curl had a large body of existing code
where overnight declarations of T went from meaning 'T or null' to
strictly 'T'.

Curl requires initialization for let bindings? (It's hard to tell
from the online docs I've read.)

Very little of the code had to be changed to use the new 'T?'
declaration, and in nearly every case where the new declaration was
required it was realized that there was previously a potential
runtime error due to not checking for null.]

This, I believe -- we have spidered the web looking for uses of JS
typeof that assume typeof x == "object" means x is not null. Lots of
people assume "object" type does not include null.
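The pitfall is easy to reproduce, since `typeof null` evaluates to `"object"`. The two helper functions below are invented for illustration and written in TypeScript; the first encodes the mistaken assumption and the second adds the missing test.

```typescript
// Mistaken assumption: "object" implies there is something to dereference.
function naiveIsObject(x: unknown): boolean {
  return typeof x === "object";
}

// Corrected check: null also reports "object", so exclude it explicitly.
function isNonNullObject(x: unknown): x is object {
  return typeof x === "object" && x !== null;
}
```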

I strongly support having types not include null and requiring
explicit declaration of nullability. [Dylan and Curl are two
examples of optionally-typed dynamic languages where such a scheme
has been successfully implemented.]

I hear you.

# P T Withington (19 years ago)

On 2006-06-22, at 21:15 EDT, Brendan Eich wrote:

Curl requires initialization for let bindings? (It's hard to tell
from the online docs I've read.)

It is pretty hard to find anything about Curl on the web.

I may be mis-remembering but I don't think it requires let bindings
to be initialized. You can declare them to have a nullable type or a
type of any, in which case I believe they default to null.

Curl has two helpful conditional expressions, if-not-null and type-switch; in the body of these expressions the compiler infers that
variables are of the appropriate narrower type (avoiding the Java
idiom if (x instanceof Y) { Y y = (Y)x; ...}).
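This is not Curl, but the same kind of inference can be seen in TypeScript, where a test narrows the static type inside the guarded branch. The classes and the `areaOf` function below are invented for the sketch.

```typescript
class Disc {
  constructor(readonly radius: number) {}
}

class Square {
  constructor(readonly side: number) {}
}

function areaOf(s: Disc | Square | null): number {
  // Comparable to "if-not-null": past this line s cannot be null.
  if (s === null) {
    return 0;
  }
  // Comparable to one arm of a type switch: s is a Disc here, no cast needed.
  if (s instanceof Disc) {
    return Math.PI * s.radius * s.radius;
  }
  // Only Square remains, so `side` is accessible without a cast.
  return s.side * s.side;
}
```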

# Shijun He (19 years ago)

---------- Forwarded message ---------- From: Shijun He <hax.sfo at gmail.com>

Date: Jun 23, 2006 1:34 PM Subject: Re: Nullablity To: Graydon Hoare <graydon at mozilla.com>

On 6/23/06, Graydon Hoare <graydon at mozilla.com> wrote:

Nicolas Cannasse wrote:

As for values initialization, it should be enforced by the compiler. This can easily be done for member and local variables by using some recursive flow algorithm.

I know you're coming from the background of compiling large, highly structured programs, and I respect that perspective. I personally agree with what you've said here, in the context of a language for my programming needs.

But I've also heard a credible argument during the design process: that the majority of JS code on the web is tiny little fragments, one liners, often fragments that only run zero or one times, and sometimes running on a cell phone. Keeping the "defaults" easy to compile is a priority for at least some of the language implementers. This means dataflow algorithms are unappealing: local expression-type inference is about all we want to pay for. We do not want to reimplement Java's definite assignment rules or similar.

There are 2 modes: strict and non-strict (what is named 'standard' now, but I don't like this name; maybe 'loose' would be better). So, if I want to write a tiny code fragment, I can use non-strict mode or even go back to ES3.

Strict mode should help coders eliminate all NullPointerExceptions.

In non-strict mode, any access to an uninitialized non-nullable var will cause an UninitializedError.

I think there is no need to introduce C++ initializers. In strict mode, the compiler could ensure that non-nullable members are initialized in the constructor; in non-strict mode, nothing happens until an uninitialized non-nullable property is accessed, resulting in an UninitializedError.

And I suggest that we introduce a default function for the default value of a non-nullable type.

    class Complex {
        static const zero:Complex = new Complex(0, 0);
        function default() { return zero; }
        function Complex(a, b) {...}
        ...
    }

    var c:Complex;
    assert (c == new Complex(0, 0));

Here is another, simpler syntax I suggest:

    class Complex {
        static const zero = new(0, 0); // for constants, there is no need to declare the type
        default() { return zero; } // the function keyword can be omitted in a class definition
        new(a, b) {...} // use 'new' instead of the class identifier, and we could allow such usage: c = Complex.new(0, 0);
        ...
    }
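To make the default-value idea concrete, here is a rough emulation in TypeScript. Several things are assumptions: `default` is modeled as a static method, `declareVar` is an invented helper standing in for an uninitialized `var c:Complex;`, and the comparison uses an `equals` method because `==` on two distinct objects compares references.

```typescript
class Complex {
  // Shared default instance, mirroring `static const zero` in the post.
  static readonly zero: Complex = new Complex(0, 0);

  constructor(readonly re: number, readonly im: number) {}

  // Plays the role of the proposed `default` function.
  static default(): Complex {
    return Complex.zero;
  }

  // Value comparison, since two separate instances are never reference-equal.
  equals(other: Complex): boolean {
    return this.re === other.re && this.im === other.im;
  }
}

// Hypothetical stand-in for a declaration: with no initializer, the
// variable starts at the type's own default instead of null.
function declareVar<T>(type: { default(): T }, init?: T): T {
  return init !== undefined ? init : type.default();
}
```

With these definitions, `declareVar(Complex).equals(new Complex(0, 0))` holds, which is the intent of the `assert` in the post.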

# Jeff Dyer (19 years ago)

From: Brendan Eich

Remember, I am in favor of nullability being explicit, "opt in". I also don't think ActionScript 3 or JScript.NET compatibility ties ECMA TG1's hands, so let's get that out of the way.

Agreed. It doesn't tie our hands, but breaking compatibility with an existing implementation has a short term cost that needs to be weighed against the (mostly long-term) benefit of improving the language. Until we have a specific proposal on the table it is hard to predict the cost of such a change. Clearly a proposal that allows AS3 code that is free of null pointer errors to compile and run unchanged would be much preferred to proposals that don't.

# P T Withington (19 years ago)

On 2006-06-23, at 13:14 EDT, Jeff Dyer wrote:

Clearly a proposal that allows AS3 code that is free of null pointer errors to compile and run unchanged would be much preferred to proposals that don't.

Since I'm jumping in to the middle of this process, let me summarize
my understanding. Tell me if I'm wrong:

The proposal on the table is to make all types not nullable by
default. If a program has no null pointer errors, that should not
affect the program. There is an implementation detail for compiler
writers that it is not an error to have an uninitialized variable
so long as it is always initialized to a type-correct value before it
is read (requires flow analysis in the compiler). It would be easier
for the compiler implementor to not permit that case in the language,
but I don't think that is being proposed -- such a change would
break correct programs.

# Brendan Eich (19 years ago)

On Jun 24, 2006, at 9:33 AM, P T Withington wrote:

On 2006-06-23, at 13:14 EDT, Jeff Dyer wrote:

Clearly a proposal that allows AS3 code that is free of null pointer errors to compile and run unchanged would be much preferred to proposals that don't.

Since I'm jumping in to the middle of this process, let me
summarize my understanding. Tell me if I'm wrong:

The proposal on the table is to make all types not nullable by
default. If a program has no null pointer errors, that should not
affect the program. There is an implementation detail for compiler
writers that it is not an error to have an uninitialized variable
so long as it is always initialized to a type-correct value before
it is read (requires flow analysis in the compiler). It would be
easier for the compiler implementor to not permit that case in the
language, but I don't think that is being proposed -- such a change
would break correct programs.

For better or worse, that is what is proposed the last I heard (I'm
still on vacation, so I may be out of date).

Requiring definite assignment flow analysis is out, because of the
desire for small/simple implementations that do include the
optional but normatively-specified type checker.

Portable code should be able not to initialize a non-nullable
variable as you say, so long as well-typed sets of that variable
dominate all uses of the variable.

But AS3 and JScript.NET make Object, e.g., nullable, and people do
write variable declarations (especially in classes) that are not
initialized, not even obviously initialized in constructors before
uses of those members.

What's more, some AS3 code is not incorrect (subject to NPEs) just
because it mixes null into the value set for Object or other class-type-annotated variables. A use-case I've heard of from the Flex SDK
(ActionScript 3) is a non-rest (not trailing) parameter where passing
null means "ignore this formal parameter". With the proposed change
to make types non-nullable by default, any call to such a function
that passes null would fail to type-check.
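The pattern in question looks roughly like the TypeScript sketch below. The function `formatLabel` and its parameters are hypothetical and are not taken from the Flex SDK; the sketch only shows what such code has to say once nullability is explicit.

```typescript
// A middle (non-trailing) parameter where null means "ignore this argument".
// Under explicit nullability the signature must admit null for that parameter,
// written here as `string | null` in place of the thread's `String?`.
function formatLabel(text: string, prefix: string | null, suffix: string): string {
  const head = prefix === null ? "" : prefix + " ";
  return head + text + suffix;
}
```

Had `prefix` been declared as plain `string` under a non-nullable default, a call such as `formatLabel("x", null, "!")` would be rejected by the checker even though the body handles null correctly.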

Therefore, the only coherent paths forward are:

  1. Keep compatibility, making non-nullability optional and explicit.

  2. Break compatibility in ES4 with AS3 and JScript.NET, by making
    nullability optional and explicit.

  3. Break compatibility by making nullability optional and explicit,
    but require enough flow analysis to avoid calling existing patterns:

(a) Uninitialized declarations annotated with default-non-nullable
types.

(b) A null actual parameter passed to a function taking a
non-nullable formal parameter, where the AS3- or JScript.NET-compatible
compiler has done extensive flow analysis and marked the parameter as
implicitly nullable because it can prove that all dereferences of the
formal are guarded by not-null tests.

(c) Variations on (b) involving other kinds of data flow than actual
to formal parameter flow.

I singled out (b) because it seems solvable by backward-compatible
compilers when generating code for the called function. (c) may
require global analysis including alias analysis (e.g., via package
imports) that is quite expensive compared to the analysis needed for
(b).

Jeff, is 3 too big a burden? You may already have such flow analysis
under way.