default constructors and implicit super() insertion
On Jun 22, 2007, at 2:20 PM, Peter Hall wrote:
I can't see much about this on the wiki export, and it is barely touched on here: www.mozilla.org/js/language/es4/core/classes.html#superconstructor
That's way out of date, it goes back to the post-Edition 3 period
where Waldemar Horwat was leading the charge. It's not relevant to
the current work exported at developer.mozilla.org/es4 and
available as a reference implementation, with a Trac bug system, at
http://ecmascript-lang.org/.
Sorry to say that developer.mozilla.org/es4/spec/spec.html is
also pretty stale, but there is a plan to revamp it in the short run,
yielding a skeleton with a few pieces of meat on the bones, before
the next wiki export.
I'll duck your main points and let Jeff and others answer them ;-).
I've always wondered what the point of carrying over C++'s
constructor syntax was. Doesn't Javascript only allow one
constructor? (Never been a fan of the fake multi-method constructor
kludge myself). If the constructor didn't have a different name in
each class, you wouldn't have the problems you outline. Cf.:
(http://www.opendylan.org/fragments/classes.phtml)
On Jun 22, 2007, at 3:11 PM, P T Withington wrote:
I've always wondered what the point of carrying over C++'s constructor syntax was.
Ignoring the old pre-historic ES4 spec cited by Peter, I have to
apologize again for:
developer.mozilla.org/es4/proposals/nullability.html
being out of date. The syntax we've settled on is not C++-like to
that degree. It looks like this:
class C {
    var x : Object!
    var y : int
    function C(initialX : Object!, initialY : int)
        : x = initialX, y = initialY
    {
        ...
    }
}
A few points:
- The purpose of this syntax is to ensure that class vars of non-nullable
type are known to be initialized, with no chance for an
uninitialized error that would be the dual of the null pointer
exception (the exception that we hope to avoid via non-nullable
types), while not requiring definite assignment or any other such
analysis.
- The = assignment operator is used as elsewhere, to allow
destructuring assignments to set non-nullable vars. To avoid
confusion with other kinds of initialization, we call these
assignment expressions in the head of the constructor "settings",
"class settings", or "constructor settings".
- The scope chain for the right-hand side of each = includes the
formal parameter names but not the new |this| object, while the scope
chain for the left-hand side includes |this| but not the formal
parameters. So the above could be simplified:
class C {
    var x : Object!
    var y : int
    function C(x : Object!, y : int) : x = x, y = y
    {
        ...
    }
}
- An alternative syntax inspired by OCaml was:
class C(ax : Object!, ay : int) {
    var x : Object! = ax;
    var y : int = ay;
    function C {
        ...
    }
}
but this is too far from existing syntax and scoping (see http://developer.mozilla.org/es4/discussion/nullability.html).
Anyway, settings are implemented in the reference implementation
available at ecmascript-lang.org -- try them out and file
bugs at the trac if need be (look for existing tickets on the same
issue first). Thanks,
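The scoping rule for settings can be approximated in other languages. The following is a minimal TypeScript sketch, not ES4 and not the reference implementation: TypeScript has no settings syntax, so the hypothetical static function `settings` stands in for the right-hand sides (it can see the parameters but has no access to the instance under construction), and the constructor body performs only the left-hand-side assignments.

```typescript
// Hypothetical analogue of ES4 "settings"; TypeScript has no such syntax.
class C {
  readonly x: object;
  readonly y: number;

  constructor(x: object, y: number) {
    // Right-hand sides: computed by a static function, which sees the
    // parameters but not the instance that is being built.
    const init = C.settings(x, y);
    // Left-hand sides: the only place the new object is touched.
    this.x = init.x;
    this.y = init.y;
  }

  // Assumed helper name; it plays the role of the settings list.
  private static settings(x: object, y: number): { x: object; y: number } {
    return { x, y };
  }
}

const c = new C({ tag: "origin" }, 7);
console.log(c.y);
```

The parameter names deliberately match the property names, as in the simplified ES4 example above; in TypeScript the two are told apart by the explicit `this.` prefix rather than by separate scope chains.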
Just curious...
In modern compilers, can't the compiler issue an error if the non-nullable member was (a) not initialized in the constructor, or (b) used before it was set?
I've always hated the C++-style member initialization syntax and it seems like a shame to add it if it could be avoided. Shouldn't it also be a goal to make sure that ES4 is less baroque and easier to learn than C++?
Just my 2c.
On Jun 26, 2007, at 4:43 PM, Sho Kuwamoto wrote:
Just curious...
In modern compilers, can't the compiler issue an error if the non-nullable member was (a) not initialized in the constructor, or (b) used before it was set?
See the bit I wrote about "definite assignment ... analysis". Indeed
modern compilers can do many analyses, but this is JavaScript. It has
to be easy to implement on cell phones. It cannot spend a lot of
time, even ignoring device-imposed code size constraints, in
compiling downloaded code.
I've always hated the C++-style member initialization syntax and it seems like a shame to add it if it could be avoided.
C++-style member initialization is being avoided. That is, you don't
see 'C(ax, ay) : x(ax), y(ay) {}' in the constructor. It's true that
we are using : after the parameter list of the constructor function,
but that's unambiguous (because you can't write a return type
annotation for a constructor).
So, what do you mean, if not the : as the punctuator that introduces
the settings? And why the hate? This corner case with one character
in common with C++ syntax does not make the whole case "C++-style
member initialization".
Shouldn't it also be a goal to make sure that ES4 is less baroque and easier to learn
than C++?
Remember that these "settings" are required only if you use non-nullable types for class variables that must be initialized.
Asking a question that smuggles your (unjustified) conclusion that
this particular corner case is "baroque" is no fair. Let's define
baroque carefully, without emotion ("hate") and whipping-devils (C++).
Seems like we are talking about adding another new syntax to the language, that will be alien to existing JS and AS developers, not familiar with C++.
This syntax doesn't seem to add anything either. It's equivalent to enforcing that you initialize the properties in the constructor (as opposed to doing it in other functions that could be invoked by the constructor). But you could enforce that same restriction without the new syntax. I don't know much about compilers, but is it really too expensive to check that each non-nullable property has been assigned with a value inside the constructor body?
If you make it an error to not initialize a non-nullable property inside the constructor body, you still leave it open to widen the support in future editions, to permit the initializations to be in other methods invoked by the constructor. Adopting the new syntax would still leave this door open, but would leave redundant deprecated syntax if you walked through it... Having said that, there are many imaginable scenarios where the variables are guaranteed to be initialized, but where that fact cannot be verified at compile-time. So would it even ever be worth the effort?
BTW does anyone have any thoughts on my original questions?
Peter
Peter Hall wrote:
Seems like we are talking about adding another new syntax to the language, that will be alien to existing JS and AS developers, not familiar with C++.
Sure, it's new syntax. So are many things here. The entire concept of a non-nullable type is going to be a bit new to developers who have never ventured outside AS and JS. ECMAScript is generally a hybrid language though, borrowing from a variety of languages, both functional/expression and OO/statement languages. The technique has seen good use elsewhere, though there are plenty of subtle variants to choose from. I personally like the form employed by Nice:
nice.sourceforge.net/manual.html#constructor
But we can quibble over syntax independently. Do you think nullable types are worthwhile?
This syntax doesn't seem to add anything either. It's equivalent to enforcing that you initialize the properties in the constructor (as opposed to doing it in other functions that could be invoked by the constructor). But you could enforce that same restriction without the new syntax. I don't know much about compilers, but is it really too expensive to check that each non-nullable property has been assigned with a value inside the constructor body?
The committee's feeling was that definite-assignment analysis -- what you're describing -- is too expensive, especially for lightweight compilers. It might be that any lightweight compiler is actually going to run in standard mode all the time anyways, and we might not make standard mode obliged to detect failure-to-initialize-non-nullable until ctor execution time, in which case there's no "analysis", just a runtime bit. I do not have deep feelings either way, and I'm not sure whether we decided to support settings lists before or after our notions of strict and standard modes had fully formed.
Anyone want to chime in? I suspect a couple of the people with opinions are presently or shortly on vacation...
BTW does anyone have any thoughts on my original questions?
Automatic inference of the default ctor signature and default super-call from the superclass would be fine by me. I'm not sure what other people think. It'd break AS code.
On Jun 26, 2007, at 6:32 PM, Graydon Hoare wrote:
The committee's feeling was that definite-assignment analysis -- what you're describing -- is too expensive, especially for lightweight compilers. It might be that any lightweight compiler is actually going to run in standard mode all the time anyways, and we might not make standard mode obliged to detect failure-to-initialize-non-nullable
until ctor execution time, in which case there's no "analysis", just a
runtime bit. I do not have deep feelings either way, and I'm not sure
whether we decided to support settings lists before or after our notions of
strict and standard modes had fully formed.
The bias was against definite assignment analysis being required for
strict mode, never mind good old standard mode, which can as Graydon
notes detect uninitialized non-nullable properties the hard way at
runtime, but before the ctor returns an instance whose properties
must be well initialized.
Perhaps for strict mode we could require DA analysis. If so, we could
get rid of settings as special syntax. But settings relieve a cost
not only to implementors -- they also (idiomatic and possibly hated
syntax aside) require users of non-nullable types to put
initialization front and center, not possibly obscured or missed in
the ctor's body's control flow.
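Detecting the problem "the hard way at runtime, but before the ctor returns" could look roughly like the TypeScript sketch below. It is only an illustration under assumptions: `checkInitialized`, `UninitializedError`, and the `Point` class are hypothetical names, and a real implementation would do this inside the engine rather than in user code.

```typescript
// Hypothetical names throughout; nothing here is an ES4 API.
class UninitializedError extends Error {}

// Throws if any listed property is still null or undefined.
function checkInitialized(instance: object, nonNullable: string[]): void {
  for (const name of nonNullable) {
    const value = (instance as Record<string, unknown>)[name];
    if (value === null || value === undefined) {
      throw new UninitializedError(
        `non-nullable property '${name}' was not initialized`
      );
    }
  }
}

class Point {
  // The `!` is TypeScript's definite-assignment assertion, which switches
  // off the compile-time check. It is not ES4's non-nullable marker.
  target!: object;

  constructor(target?: object) {
    if (target !== undefined) {
      this.target = target;
    }
    // Standard-mode style check, run just before the constructor returns.
    checkInitialized(this, ["target"]);
  }
}
```

With this sketch, `new Point({})` succeeds and `new Point()` throws, so no caller ever receives a half-initialized instance. Code that runs inside the constructor before the check can still observe the unset property.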
Anyone want to chime in? I suspect a couple of the people with
opinions are presently or shortly on vacation...
Indeed. We'll hear back in a few weeks, if not sooner, from some
group members now on vacation.
BTW does anyone have any thoughts on my original questions?
Automatic inference of the default ctor signature and default super-call from the superclass would be fine by me. I'm not sure what other
people think. It'd break AS code.
Peter, could you remind me how super and subclass ctor parameter
lists would be related? There's no necessary relation. How would this
work, other than as a convention supported by default, requiring
overriding to avoid type or argument number errors?
Automatic inference of the default ctor signature and default super-call from the superclass would be fine by me. I'm not sure what other people think. It'd break AS code.
I don't think it would break AS code. The only places where the behaviour would be different are places that would currently be a compile error. (Unless I missed something, which is possible at this time in the evening)
Peter
On Jun 26, 2007, at 9:36 PM, Peter Hall wrote:
Automatic inference of the default ctor signature and default
super-call from the superclass would be fine by me. I'm not sure what other
people think. It'd break AS code.
I don't think it would break AS code. The only places where the behaviour would be different are places that would currently be a compile error. (Unless I missed something, which is possible at this time in the evening)
Oh, I see:
class B { function B(a:int){...} ... }
class C extends B { function C(a:int,b:string):super(a){...} ... }

class B { function B(a:string){...} ... }
class C extends B { function C(a:int,b:string):super(b){...} ... }

class B { function B(a:int){...} ... }
class C extends B { function C(b:string):super(42){...} ... }
So why not relieve the programmer from having to write super(a) in
the first case. Presumably the matching would be type equality for
all superclass constructor parameter types. It would be an error to
leave out an explicit super call if the subclass constructor had
fewer or differently typed parameters.
Seems safe, helps what may be a common case (I don't know -- anyone
with Flex experience have a count of classes this would help?).
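The matching rule suggested here can be sketched as a small predicate. This is a TypeScript illustration under assumptions, not a proposed spec algorithm: constructor signatures are modelled as plain lists of type names, "type equality" is string equality, and the superclass parameters are assumed to be matched positionally against the leading subclass parameters. The name `canInferSuperCall` is hypothetical.

```typescript
// Hypothetical model: a constructor signature is a list of type names.
type Signature = string[];

// True when an implicit super(...) call could be inferred: the subclass
// constructor must start with parameters whose types equal the superclass
// constructor's parameter types, in the same order.
function canInferSuperCall(superParams: Signature, subParams: Signature): boolean {
  if (subParams.length < superParams.length) {
    return false; // fewer parameters: an explicit super call is required
  }
  return superParams.every((type, i) => subParams[i] === type);
}

// The three cases from the message above.
console.log(canInferSuperCall(["int"], ["int", "string"]));    // true
console.log(canInferSuperCall(["string"], ["int", "string"])); // false
console.log(canInferSuperCall(["int"], ["string"]));           // false
```

Only the first case qualifies for an inferred super(a); the other two would still need the explicit super(b) and super(42) calls.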
On 6/26/07, Graydon Hoare <graydon at mozilla.com> wrote:
Peter Hall wrote:
This syntax doesn't seem to add anything either. It's equivalent to enforcing that you initialize the properties in the constructor (as opposed to doing it in other functions that could be invoked by the constructor). But you could enforce that same restriction without the new syntax. I don't know much about compilers, but is it really too expensive to check that each non-nullable property has been assigned with a value inside the constructor body?
The committee's feeling was that definite-assignment analysis -- what you're describing -- is too expensive, especially for lightweight compilers. It might be that any lightweight compiler is actually going to run in standard mode all the time anyways, and we might not make standard mode obliged to detect failure-to-initialize-non-nullable until ctor execution time, in which case there's no "analysis", just a runtime bit. I do not have deep feelings either way, and I'm not sure whether we decided to support settings lists before or after our notions of strict and standard modes had fully formed.
Anyone want to chime in? I suspect a couple of the people with opinions are presently or shortly on vacation...
I expect you're thinking of me and Chris.
Some of the people on this list know I used to work for Opera, and Opera cares a great deal about keeping JS viable even for the lower end of the cell phone market (i.e., the largest section of it by far). There are other kinds of devices too that are similarly powered or have similar constraints on component price (mass-market electronics like set-top boxes, in-flight entertainment systems, and the proverbial refrigerator). These devices have slow CPUs, tiny caches, small and slow memories, and slow busses; we can argue about whether they're really equipped to run a full web browser but Opera's contention is that there is no reason that web technologies should not accommodate them. (The lower end of the market is getting better at a slow pace, too, so don't expect this problem to go away.) Anyhow, the consequence is that you want to make web content easy to process, for both performance reasons and reasons of code size. For JS, this means having a straightforward compiler that performs the minimum amount of analysis, incorporating quick-and-dirty code generation, and so on. (Strict mode may be beyond such an implementation.)
Definite assignment analysis is probably only slightly superlinear (I'm not co-located with my compiler books, but I seem to remember that definite assignment analysis requires the computation of dominators and dominator computation is O(n log n) in the size of the flow graph), but it's another pass added to the compiler. Some compilers do on-the-fly code generation and only have one pass anyway and no notion of "flow graph" to speak of... so the cost might be significant. If we can find a compromise between usability and processability then so much the better.
A few more points:
I think Brendan has an important point in that non-nullable initializers will be clearly visible. Non-nullability is both strange and hard to use, and making it syntactically special is not obviously bad.
Speaking as someone who's written a lot of ES4 code so far (and a lot of C++ before that), the settings are very useful because they allow the constructor arguments to be named with the same names as the instance properties for which they provide initial values, idiomatically
class C { function C(x, y) : x=x, y=y {} var x, y; }
because of how the scoping is done in settings; for a lot of simple cases the idiom is powerful and clear (clear enough that adding more syntax for those simple cases probably does not pay off).
Loading time is important to most web browsers on most platforms; arguably loading time is dominated by network latency and transfer time, but let's not make content processing times any higher than necessary. I'm worried about several aspects of the language in this regard already.
Finally, ES programmers get a lot of value from ES canonically being expressed as source-level text, unlike Java or C#, say. But it is not entirely unreasonable for them to pay a little bit in return for that value, in terms of giving up some conveniences enjoyed by users of ahead-of-time compiled languages. ES is growing up and becoming more complicated, but we're trying to keep the language scripting-like because this has advantages in itself.
All that said, ES4 is challenging to lightweight implementations, and it is -- I think -- actually an open question whether a one-pass non-tree-building reasonably efficient implementation of the language is possible. I think it is, but I'm not yet sure it is. If it turns out the answer is no, we need to decide whether to remove some features from ES4 or just abandon that requirement, at which point it becomes more interesting to consider static properties that require analysis like definite assignment.
--lars, pretending to be on vacation
Lars T Hansen wrote:
I think Brendan has an important point in that non-nullable initializers will be clearly visible. Non-nullability is both strange and hard to use, and making it syntactically special is not obviously bad.
Speaking as someone who's written a lot of ES4 code so far (and a lot of C++ before that), the settings are very useful because they allow the constructor arguments to be named with the same names as the instance properties for which they provide initial values, idiomatically
class C { function C(x, y) : x=x, y=y {} var x, y; }
While I see that it can be useful, I remain concerned that this will seem confusing to novices. Why do I need that colon? What does "x=x" mean? Why did I get the wrong result when I did the following, which looks a lot like the above?
// novice error 1
class C { function C(x, y) { x=x; y=y; } var x, y; }
What happens when an initialization requires a bit of logic? Suppose I have a "title" property of a MyDocument object that is non-nullable. If the name is passed in, it is used. Otherwise, one is generated (e.g., "Untitled1").
I'm not really an expert, but in terms of overall simplicity, my preference is
a) provide a clear way to disambiguate between members and other variables in scope in all cases. (this may already exist)
b) allow non-nullable types to be initialized in the constructor in the body of the constructor, as opposed to creating new syntax
c) make it a runtime error to read an uninitialized non-nullable object. also make it a runtime error to leave a constructor without having initialized each non-nullable field.
d) optionally, make it a compile-time error to exit a constructor without having initialized each non-nullable object.
// my preferred syntax:
class Position {
    var x: Number;
    var y: Number;
    var target: Object!;
    function Position(x: Number, y: Number, target: Object!)
    {
        this.x = x;
        this.y = y;
        this.target = target;
    }
}
If this discussion is already closed or if I am barking up the wrong tree, please feel free to just say so. Thanks.
For JS, this means having a straightforward compiler that performs the minimum amount of analysis, incorporating quick-and-dirty code generation, and so on. (Strict mode may be beyond such an implementation.)
I was under the impression that this discussion would not apply in non-strict mode. Would it be an error to not initialize a non-nullable variable in non-strict mode?
Definite assignment analysis is probably only slightly superlinear (I'm not co-located with my compiler books, but I seem to remember that definite assignment analysis requires the computation of dominators and dominator computation is O(n log n) in the size of the flow graph), but it's another pass added to the compiler. Some compilers do on-the-fly code generation and only have one pass anyway and no notion of "flow graph" to speak of... so the cost might be significant. If we can find a compromise between usability and processability then so much the better.
I think you misunderstood my intent. I am suggesting that you could enforce that, for each non-nullable instance variable, a constructor must contain an assignment with the variable on the LHS. I believe that this is functionally the same as the proposed new syntax (and I'm pretty sure you can incorporate the test into a single pass). There are plenty of cases where assignment is guaranteed at runtime, but cannot be guaranteed at compile-time so I don't think there would even be value in any more "difficult" analysis to prove valid assignments.
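The purely syntactic check described in this paragraph might be sketched as follows. This is a TypeScript toy under assumptions, not part of any ES4 implementation: the `Stmt` tree, `assignedNames`, and `missingAssignments` are hypothetical, and only assignments, conditionals, and calls are modelled. It records every name that appears on the left-hand side of an assignment anywhere in the constructor body and ignores control flow entirely.

```typescript
// Toy statement tree; all names here are hypothetical.
type Stmt =
  | { kind: "assign"; target: string }
  | { kind: "if"; then: Stmt[]; otherwise: Stmt[] }
  | { kind: "call"; name: string };

// Collect every assignment target in the body, at any nesting depth.
function assignedNames(body: Stmt[], found: Set<string> = new Set<string>()): Set<string> {
  for (const stmt of body) {
    if (stmt.kind === "assign") {
      found.add(stmt.target);
    } else if (stmt.kind === "if") {
      assignedNames(stmt.then, found);
      assignedNames(stmt.otherwise, found);
    }
  }
  return found;
}

// Names of non-nullable vars with no assignment anywhere in the body.
function missingAssignments(nonNullable: string[], body: Stmt[]): string[] {
  const found = assignedNames(body);
  return nonNullable.filter((name) => !found.has(name));
}
```

A body that assigns `x` in only one branch of a conditional passes this check for `x`, which is what separates it from definite assignment analysis: it is cheap and single-pass, but it does not prove that the assignment runs on every path or before the first read.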
Also, could you clarify the scoping of the RHS of the initializer assignments? Is the scope hoisted or a completely different scope? e.g. would the following work as expected:
class C {
    var delegate:D!;
    function C(d:D=null) : delegate = d || new D(this) { }
}
class D {
    function D(host:Object) { }
}
Further, what happens when the initialization could take several lines to create and parametrize the object? Would you be allowed to invoke other instance or static methods in that code block?
Peter
On Jun 26, 2007, at 6:58 PM, Brendan Eich wrote:
The bias was against definite assignment analysis being required for strict mode, never mind good old standard mode, which can as Graydon notes detect uninitialized non-nullable properties the hard way at runtime, but before the ctor returns an instance whose properties must be well initialized.
Just to follow up on my own point: Dave Herman reminded me that we
don't want the evil twin of the NullPointerException, to wit an
UninitializedNonNullableTypeError (or something with a shorter
name ;-), rising to haunt the new language. Consider:
use standard;
class C {
    private var x : Object!;
    function C(y) {
        let q = helper(y);
        x = {p: q};
    }
    private function helper() {
        /* who knows what this does! */
    }
}
The constructor function for C does not know what helper does with
|this|. Could it reference x? If so, it would have to get that pesky
UninitializedError, which would seem no better than its dual, the NPE
we hope to banish from runtime once instances containing
non-nullably-typed vars are initialized.
Settings allow us to restrict not only when non-nullably-typed
variables are initialized, but how their initializers run --
specifically, without access to |this|. So any helpers would have to
be static or outer (global, package internal) functions.
This seems like a better design both for non-nullability as a usable
feature, and for simpler, smaller, faster implementations. It's true
that it requires dedicated syntax, but that is just a "UI" corner
case -- not unlike graphical UI settings that you don't often need,
but can't live without in a crunch.
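The hazard in the example above can be reproduced in TypeScript, offered here only as an analogy, since TypeScript is not ES4 and has no UninitializedError. The class name `Holder` and the `observed` field are assumptions added for the demonstration. The constructor assigns `x` on every path, so the compiler's property-initialization check is satisfied, yet a helper called before the assignment still reads the unset slot.

```typescript
// Analogy only: Holder and observed are hypothetical names.
class Holder {
  private x: { p: unknown };
  observed: unknown = "not yet read";

  constructor(y: unknown) {
    // The helper runs before x has been assigned.
    const q = this.helper(y);
    this.x = { p: q };
  }

  private helper(y: unknown): unknown {
    // Reads x while the instance is only half built.
    this.observed = this.x;
    return y;
  }
}

const h = new Holder(1);
console.log(h.observed); // undefined: the helper saw x before it was set
```

TypeScript reports nothing here and the read silently yields `undefined`. A language with non-nullable types would have to either raise the uninitialized error at that read or, as settings do, prevent `helper` from reaching `this` in the first place.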
[Jumping in on one point only, I'll hope to help Lars get back to
vacation so he can reply in full later :-P]
On Jun 27, 2007, at 12:00 PM, Peter Hall wrote:
For JS, this means having a straightforward compiler that performs the minimum amount of analysis, incorporating quick-and-dirty code generation, and so on. (Strict mode may be beyond such an implementation.)
I was under the impression that this discussion would not apply in non-strict mode. Would it be an error to not initialize a non-nullable variable in non-strict mode?
Yes, otherwise you have just renamed NPE to UE for no win, and then
what's the point of non-nullable types.
The intuition should be familiar to anyone who has written fast
low-level C or C++ code. Your instance has pointer members, they are
guaranteed non-null by the constructor or an initialization routine,
any failure during constructor/initialization propagates and disposes
of the half-initialized newborn. Thus all methods post initialization
need never check for null, and you never get a null pointer crash.
This is common, even required (certainly in kernel land -- I used to
be a Unix kernel hacker).
Non-nullable types should capture this use case. They ought to
eliminate stupid NPEs that I've seen (and still see) from Java and C#
web apps. Even though those languages have more fancy analysis than
we are proposing for ES4, their type systems still cannot rule out an
NPE statically or otherwise.
So we really do want non-nullably-typed class vars to be initialized
by the time the constructor finishes, and as Dave Herman reminded me,
the settings design goes further: it avoids UEs while the constructor
is active, by forcing non-nullably-typed vars to be set early, and
without reference to |this| or to other non-nullable vars.
While I see that it can be useful, I remain concerned that this will seem confusing to novices. Why do I need that colon? What does "x=x" mean? Why did I get the wrong result when I did the following, which looks a lot like the above?
// novice error 1
class C { function C(x, y) { x=x; y=y; } var x, y; }
One could argue that novices will be baffled by a great many things,
not only in ES4 (classes, non-nullable type annotations) but in ES3
(higher order functions, with statements, eval dynamic scope, etc.
etc). Novices learn, sometimes after bruising a toe or finger.
To argue briefly against your point in particular: settings are
odd-looking, distinguished syntax for several reasons, including so that
the scope for LHS and RHS can differ usefully.
To take your position, this is the only such case, and it could be
misleading, or at best a useful but odd little feature hanging on
otherwise fairly simple rules for mentally evaluating = operators
near or within class constructors.
What happens when an initialization requires a bit of logic? Suppose I have a "title" property of a MyDocument object that is non-nullable. If the name is passed in, it is used. Otherwise, one is generated (e.g., "Untitled1").
The ?: operator stands ready! Seriously, there's no problem using
arbitrary expressions, including calls to static or outer-scope
helper functions.
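As a rough illustration of the "title" case, here is a TypeScript sketch (not ES4 settings syntax) that uses ?: together with a static helper. The counter `untitledCount` and the helper `generateTitle` are hypothetical names. The point is that the initial value is computed without touching the instance under construction.

```typescript
class MyDocument {
  // Hypothetical counter used to number generated titles.
  private static untitledCount = 0;
  readonly title: string;

  constructor(name?: string) {
    // The ?: operator picks the supplied name or a generated one.
    // The static helper never touches the new instance.
    this.title = name ? name : MyDocument.generateTitle();
  }

  private static generateTitle(): string {
    MyDocument.untitledCount += 1;
    return `Untitled${MyDocument.untitledCount}`;
  }
}
```

Constructing a document with a name keeps that name. Constructing one without a name yields "Untitled1" the first time and "Untitled2" the next.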
I'm not really an expert, but in terms of overall simplicity, my preference is
a) provide a clear way to disambiguate between members and other variables in scope in all cases. (this may already exist)
As for non-class functions (where |this| is not on the scope chain
implicitly), |this.x| works.
b) allow non-nullable types to be initialized in the constructor in the body of the constructor, as opposed to creating new syntax
Then the UninitializedError hazard arises, the dual of the NPE. True,
it is limited in extent (duration, ... not sure what the right word
is here) to the activation of the ctor. But given non-nullable type
annotation syntax, why not have syntax (remember, "UI") specialized
to avoid the UE hazard altogether?
c) make it a runtime error to read an uninitialized non-nullable object. also make it a runtime error to leave a constructor without having initialized each non-nullable field.
This point (c) is really two points: make it a runtime error to read
an uninitialized non-nullable slot; and a restatement or elaboration
or implication of point (b). Taking just the first point as (c), you
are saying what must be the case no matter what, unless we do require
DA and make it an error not to initialize instance vars in
constructors. No argument there.
d) optionally, make it a compile-time error to exit a constructor without having initialized each non-nullable object.
// my preferred syntax:
class Position {
    var x: Number;
    var y: Number;
    var target: Object!;
    function Position(x: Number, y: Number, target: Object!) {
        this.x = x;
        this.y = y;
        this.target = target;
    }
}
"optionally" here could mean strict mode. It should not mean standard
mode. It should not mean "implementation defined". We want a
normative spec for both optional-at-implementation's-discretion
strict mode, and for standard mode, in order to uphold interoperation.
If this discussion is already closed or if I am barking up the wrong tree, please feel free to just say so. Thanks.
Not at all, this list exists exactly for this kind of discussion. So
thanks back!
I can't see much about this on the wiki export, and it is barely touched on here: www.mozilla.org/js/language/es4/core/classes.html#superconstructor
Two slightly related comments.
Wouldn't it be logical for a default constructor to automatically have the same signature as the super-class constructor?
Also, if a constructor is defined which has the same signature as the super constructor, and a call to super() is inserted automatically, wouldn't it make sense for super() to be passed the same arguments?
In AS3, automatic super() is always called without arguments, which can result in compile errors. Likewise, if you do not specify a constructor, the default constructor always has zero arguments. The result is that I often find myself adding constructors that do nothing at all, but can be a lot of duplicated code, if there are many arguments.
A common AS3 example is extending Error or Event, where you mostly want the constructors to be the same, for consistency, and the class may add no functionality at all:
class MyEvent extends Event {
    public function MyEvent(type:String, bubbles:Boolean=false, cancelable:Boolean=false) {
        super(type, bubbles, cancelable);
    }
}
Peter
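For comparison, the behaviour asked for here can be seen in TypeScript, which follows the later ECMAScript class semantics rather than ES4 or AS3: a subclass that declares no constructor gets a default one that forwards all of its arguments to the superclass constructor. The sketch below uses a hypothetical `BaseEvent` class in place of the AS3 Event class.

```typescript
// BaseEvent is a hypothetical stand-in for AS3's Event.
class BaseEvent {
  readonly type: string;
  readonly bubbles: boolean;
  readonly cancelable: boolean;

  constructor(type: string, bubbles: boolean = false, cancelable: boolean = false) {
    this.type = type;
    this.bubbles = bubbles;
    this.cancelable = cancelable;
  }
}

// No constructor is written: the default one passes every argument
// through to BaseEvent, so the signature is effectively inherited.
class MyEvent extends BaseEvent {}

const e = new MyEvent("change", true);
console.log(e.type, e.bubbles, e.cancelable); // change true false
```

This covers only the case where the subclass has no constructor at all. A subclass that declares its own constructor must still call super explicitly with whatever arguments it chooses.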