ES4 draft meta-issues
On Feb 27, 2008, at 9:00 AM, Lars Hansen wrote:
Meta-level methods
The predefined namespace "meta" is used for methods that participate in language-level protocols: invocation and property access and update. A class that defines meta::invoke is callable as a function (the meta::invoke method is invoked in response to the call); the meta::get, meta::set, meta::has, and meta::delete methods are invoked in response to accesses to non-fixture properties on the object.
Pedantry alert, forgive me -- but it may be important to know that
meta::invoke has static and instance forms.
Given class C { ... meta static function invoke(...) ... }, you can
call C as a function:
x = y + C(z);
This is used, e.g., by class Date in builtins/Date.es.
If you define a non-static function (a method) named meta::invoke
(via class C { ... meta function invoke(...) ... }), then as with
meta::get, etc., it is the instances of C that are themselves callable:
c = new C; x = y + c(z);
So there's a meta function invoke(...) ... in class Function in the
RI's builtins/Function.es, for example.
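For readers who want to experiment with the distinction outside the RI, the two forms can be approximated in TypeScript. This is an analogy only, not ES4: TypeScript has no meta namespace, so ordinary methods named invoke stand in for meta::invoke, a Proxy apply trap stands in for the static form, and a closure stands in for the instance form. The names CallableC and callableInstance and the method bodies are made up for illustration.

```typescript
// Analogy only: plain methods named "invoke" stand in for meta::invoke,
// and the arithmetic in the bodies is arbitrary.
class C {
  // Stand-in for "meta static function invoke": used when C itself is called.
  static invoke(z: number): number {
    return z * 2;
  }

  // Stand-in for "meta function invoke": used when an instance is called.
  invoke(z: number): number {
    return z + 1;
  }
}

type Invoker = (z: number) => number;

// Static form: a Proxy apply trap routes a plain call C(z) to C.invoke(z).
const CallableC = new Proxy(C, {
  apply: (target, _thisArg, args) => target.invoke(args[0]),
}) as typeof C & Invoker;

// Instance form: wrap an instance so that the wrapper itself can be called.
function callableInstance(c: C): Invoker {
  return (z) => c.invoke(z);
}

const y = 10;
const viaClass = y + CallableC(3);                    // like x = y + C(z)
const viaInstance = y + callableInstance(new C())(3); // like x = y + c(z)
```

The point of the sketch is only that the class object and its instances are separate call targets, each with its own invoke.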
HTH,
For some drafts coming this week, the following information will also be useful.
The specification makes use of a predefined namespace "magic". This namespace is reserved in the specification but not in any actual implementation of the language. It is used only to tag top-level functions that are implementation hooks. The hooks provide functionality that is not available in the language, for example, accessing the internal [[prototype]] property of objects.
Magic functions are defined by prose for the moment; it is probable that they will be (partly?) exposed as SML fragments later, in the style of the semantic functions we're planning for other parts of the spec.
The specification also makes use of a type EnumerableId, which is a union type that currently looks like this:
type EnumerableId = (int | uint | string | Name)
Information/discussion item.
In the drafts for predefined classes I've sent out so far, the interaction between the intrinsic methods and the prototype methods has more or less uniformly been specified as the prototype calling the corresponding intrinsic method on its "this" object, for example:
class C {
    prototype function toString(this:C)
        this.intrinsic::toString()

    intrinsic function toString()
        ...
}
The thinking was that by using this structure the prototype method can then take advantage of subclasses that override the intrinsic method (since the intrinsic is virtual the prototype method picks up the method in the subclass when calling the intrinsic). However, this thinking is flawed. The meaning of many prototype methods is fixed by E262-3 and must remain unchanged in E262-4 for compatibility reasons. An important example of this is the original Object.prototype.toString method. Not infrequently one sees code like this:
var x = <some object of unknown class, call it "C">
x.toString = Object.prototype.toString
x.toString()   // expected to return "[object C]"
It's a hack but it can be used to discover class names. However, if C overrides intrinsic::toString then this idiom no longer works, because Object.prototype.toString calls this.intrinsic::toString which is the overridden method in C, not the one in Object.
As a consequence, I think that the libraries need to be adjusted a little bit. Prototype methods should have fixed meanings that depend only on the type of object they were extracted from, whereas intrinsic methods can be overridden and can be specified such that they will pick up overridden methods. The structure would now be:
class C {
    prototype function toString(this:C)
        this.private::toString()

    intrinsic function toString()
        private::toString()

    private function toString()
        ...
}
With this structure, the idiom outlined above would return the expected string, but (new C).intrinsic::toString() would pick up C's toString method as expected. In the absence of overriding, the prototype and intrinsic methods would work identically, as expected.
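The difference between the two structures can be tried out in a rough TypeScript analogy. TypeScript has no intrinsic or prototype qualifiers, so the sketch makes some assumptions: intrinsicToString stands in for intrinsic::toString, fixedToString for private::toString, and two static functions that take their "this" explicitly stand in for the old and the adjusted prototype function. The class names Thing and Fancy and the className field are hypothetical.

```typescript
// Analogy only: names and the className field are made up for illustration.
class Thing {
  constructor(readonly className: string = "Thing") {}

  // Stand-in for "private function toString": fixed behaviour.
  private fixedToString(): string {
    return "[object " + this.className + "]";
  }

  // Stand-in for "intrinsic function toString": virtual, may be overridden.
  intrinsicToString(): string {
    return this.fixedToString();
  }

  // Old structure: the prototype function forwards to the intrinsic method.
  static flawedProto(target: Thing): string {
    return target.intrinsicToString();
  }

  // Adjusted structure: the prototype function forwards to the private helper.
  static adjustedProto(target: Thing): string {
    return target.fixedToString();
  }
}

class Fancy extends Thing {
  constructor() {
    super("Fancy");
  }

  // A subclass overriding the intrinsic method.
  intrinsicToString(): string {
    return "a fancy thing";
  }
}

const fancy = new Fancy();
const flawedResult = Thing.flawedProto(fancy);     // "a fancy thing": the idiom breaks
const adjustedResult = Thing.adjustedProto(fancy); // "[object Fancy]": the idiom works
const intrinsicResult = fancy.intrinsicToString(); // "a fancy thing": override still seen
```

For a plain Thing, with no override in play, both prototype stand-ins and the intrinsic stand-in return the same string.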
There are variations on this pattern; if the prototype method is generic, then it might forward to a static helper method (since an instance method would be this-constrained).
The complexity increase of this is annoying but less bad than it might seem at first glance because the truly huge classes -- Array, String -- already have all the functionality factored as static helper methods, and the main adjustment is to the prototype methods. For specification purposes it will make sense to follow the pattern pretty strictly, but in a practical implementation it would be more reasonable to duplicate some functionality, especially for simple methods.
Over the next several weeks I'll be sending out draft specs for all(?) the ES4 library classes, one class at a time (in the order I get to them).
The ES4 library is expressed in terms of ES4 fragments: the spec uses executable -- and tested -- ES4 code in places where the ES3 spec uses pseudocode. As a consequence, the draft library spec makes some assumptions about what ES4 will look like when it's finished.
Below I am going to outline some aspects of ES4 that it will be useful for the readers of the draft specs to know, beyond what's in ES3. This outline will be updated from time to time as new draft specs require it. (Probably much of the information here is already written up in the language overview available on ecmascript.org, so go there for the full story.)
Namespaces, names
ES4 puts all names into namespaces. A name is in exactly one namespace and it is placed in that namespace by prefixing the binding keyword for the name (class, var, const, function, and others) with the namespace name. If MyNS is a namespace then
MyNS var x
creates a variable whose fully qualified name is "MyNS::x".
There are several predefined namespaces. The namespace "ES4" is used for all top-level names that are new to ES4 if they're not in one of the other namespaces (except for the name "ES4" itself, which is the only unqualified top-level name introduced by ES4). Important predefined namespaces are "ES4::intrinsic" and "ES4::reflect".
In order to avoid having to fully qualify names all the time, namespaces can be opened; the names defined in the namespace will then be available without qualification. The namespace "ES4" is opened for all ES4 code, so in practice the two predefined namespaces listed above are known just as "intrinsic" and "reflect". (Opening a namespace may introduce ambiguities, which can be resolved by fully qualifying ambiguous names. Ambiguities are not common because a namespace opened in an inner lexical scope takes precedence over namespaces opened in outer scopes.)
The intrinsic namespace is reserved; user code is not allowed to introduce new names in this namespace. The intrinsic namespace is used primarily for methods in the predefined classes. For every prototype method M there is a corresponding intrinsic method M in the class. For example, there is Array.prototype.concat and also an intrinsic::concat method on Array instances. The prototype methods are fully compatible with ES3 in the types they accept and how they convert values. The intrinsic methods normally have more tightly constrained signatures and, like all class methods, are immutable (though they can be overridden in subclasses -- that's allowed even for user code).
The intrinsic namespace provides integrity (code that calls an intrinsic method will know that it references the original method, it is not at the mercy of changes to the prototype method) and optimization opportunities (early binding to the slot that holds the method in the presence of type annotations). The specification of the predefined classes in terms of ES4 code makes use of other predefined classes and their methods, and predefined methods are careful to call intrinsic methods to invoke known behavior and to call public methods to invoke explicitly variant behavior. Normally, such invocations are always explicitly qualified in the text in order to avoid any ambiguity in the reader's mind.
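The integrity argument can be illustrated with a small TypeScript analogy. It is a sketch under stated assumptions, not ES4: a class method named intrinsicAdd stands in for an intrinsic method, and a mutable table named tallyProto stands in for the prototype object. The Tally class and all other names are hypothetical. Note that TypeScript does not make class methods immutable the way ES4 does, so the sketch simply never reassigns intrinsicAdd.

```typescript
// Analogy only: Tally, tallyProto, and the function names are made up.
class Tally {
  constructor(readonly count: number) {}

  // Stand-in for an intrinsic method: tightly typed signature.
  intrinsicAdd(n: number): Tally {
    return new Tally(this.count + n);
  }
}

// Stand-in for the prototype object: loosely typed, converts its argument,
// and can be replaced by any code that can reach the table.
const tallyProto = {
  add: (target: Tally, n: unknown): Tally => target.intrinsicAdd(Number(n)),
};

// One caller goes through the prototype stand-in, the other through the
// intrinsic stand-in.
function viaPrototype(t: Tally): number {
  return tallyProto.add(t, "2").count;
}

function viaIntrinsic(t: Tally): number {
  return t.intrinsicAdd(2).count;
}

const before = [viaPrototype(new Tally(1)), viaIntrinsic(new Tally(1))];

// Some unrelated code replaces the prototype method.
tallyProto.add = (target, _n) => target;

const after = [viaPrototype(new Tally(1)), viaIntrinsic(new Tally(1))];
```

After the replacement only the caller that went through the prototype stand-in sees different behaviour; the caller bound to the intrinsic stand-in is unaffected.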
Types and annotations
Bindings in ES4 are typed, and the type can be provided explicitly by following the name with a colon and the type:
var x: Array
If the type is omitted, it is "*" (read as "any"), which means it is unconstrained. If we assume just run-time type checking for the time being, then a check is performed every time a value is stored into an annotated variable: the type of the value must be a subtype of the annotated type.
Functions can be annotated too, in both their parameter and return positions. Annotations on parameters constrain how the function can be called. Annotations in the return position constrain what the function can return:
function f(x: string): RegExp { ... }
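The run-time checking model described above, where a check happens on every store into an annotated binding, can be approximated in TypeScript. TypeScript normally checks annotations statically, so this sketch assumes a hypothetical Checked wrapper that takes a guard function and applies it on each store; the names Checked, isArray, and xs are made up for illustration.

```typescript
// Analogy only: a cell whose stores are checked at run time against a guard.
class Checked<T> {
  private current: T;

  constructor(
    private readonly guard: (v: unknown) => v is T,
    initial: T
  ) {
    this.current = initial;
  }

  get(): T {
    return this.current;
  }

  // Every store is checked; a value of the wrong type is rejected.
  set(candidate: unknown): void {
    if (!this.guard(candidate)) {
      throw new TypeError("value is not of the annotated type");
    }
    this.current = candidate;
  }
}

const isArray = (v: unknown): v is unknown[] => Array.isArray(v);

// Roughly "var x: Array".
const xs = new Checked<unknown[]>(isArray, []);
xs.set([1, 2, 3]); // accepted

let rejected = false;
try {
  xs.set("not an array"); // fails the guard
} catch (e) {
  rejected = e instanceof TypeError;
}
```

The guard here only tests for "is an array"; the subtype test that ES4 performs is richer than this.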
There are two classes of types, nominal types and structural types.
Nominal types are introduced by class definitions and interface definitions. Values of nominal types are created by instantiating classes (using the "new" operator). The syntax and semantics are broadly as in Java: A nominal type is equal only to itself; a value is of a class type only if it was instantiated from that type; and it is of an interface type only if it was instantiated from a class type that declares that it implements that interface. (Note that the access control keywords like private and public are actually aliases for language-provided namespaces.)
Methods on classes appear as function definitions in the class body. The class instance is in scope in the body of a method.
Structural types are record types (for example {x:int, y:int}), array types (for example [int]), tuple types (for example [int,string]), union types (for example (int|string|RegExp)), function types (for example function(int):boolean), and some special types (null and undefined). A structural type is equal to any other structural type that has the same fields with the same types (in any order), and a value is of a structural type if it has fixed (non-deletable) fields with the names and types given by the structural type. (So if Point is a class with x and y integer fields, an instance of Point is of the structural type {x:int, y:int}.) Structural types can't be recursive.
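TypeScript happens to have structural record, tuple, and union types as well, so the Point example can be written down there as a rough analogy. Two caveats: TypeScript checks these types statically and has no notion of fixtures, and it treats classes structurally too, unlike the nominal classes described above. The names Point, XY, YX, and norm1 are hypothetical, and number stands in for int.

```typescript
// Analogy only: number stands in for ES4's int.
class Point {
  constructor(public x: number, public y: number) {}
}

// Record type, roughly {x:int, y:int}.
type XY = { x: number; y: number };

// The same fields in a different order describe the same type.
type YX = { y: number; x: number };

function norm1(p: XY): number {
  return Math.abs(p.x) + Math.abs(p.y);
}

const fromClass: XY = new Point(3, -4); // an instance of Point is of type XY
const swapped: YX = fromClass;          // XY and YX are interchangeable
const total = norm1(swapped);

// Tuple and union types, roughly [int,string] and (int|string).
type IntOrString = number | string;
const pair: [number, string] = [1, "one"];
const label: IntOrString = pair[1];
```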
Types can be given names by type definitions:
type Num = (int|double)
Type definitions, class definitions, and interface definitions can be parameterized:
class Map.<K,V> { ... }
type Box.<T> = { value: T }
Record and array types are instantiated by suffixing the literal with the type:
{ value: 7 } : Box.<int>
[1,2,3] : [int]
but now we're getting esoteric so let's stop there -- this is not the language spec.
Any type is a subtype of *, and Box.<T> is a subtype of Box.<*>, for any T.
One of the important aspects of the type system is that the types provide a specification for fixtures on the objects that are of the type: in any value of type Box.<T>, the "value" property can't be removed. (Instances of structural types can always have extra non-fixture fields, as can instances of classes designated "dynamic".)
Functions
Functions can take optional arguments (they have default values) and rest arguments:
function f(x, y=0) { ... }      // y is optional
function f(x, ...rest) { ... }
The rest argument appears as a regular Array object holding the excess parameter values.
Function bodies that consist of a single return statement (which typically returns the result of a call to another function) are common; ES4 introduces a shorthand where the body is a brace-less expression:
function f(x, y) g(x*2, y, 0)
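The same three features have close counterparts in TypeScript, which may help readers who want something to run. This is an analogy rather than ES4: arrow functions stand in for the brace-less expression body, and the names scaled and countExtras and the body of g are made up for illustration.

```typescript
// y is optional and defaults to 0.
function scaled(x: number, y: number = 0): number {
  return x * 2 + y;
}

// rest is a regular array holding the excess argument values.
function countExtras(first: number, ...rest: number[]): number {
  return rest.length;
}

// Arrow functions give a brace-less expression body, comparable to
// "function f(x, y) g(x*2, y, 0)". The body of g is arbitrary.
const g = (a: number, b: number, c: number): number => a + b + c;
const f = (x: number, y: number): number => g(x * 2, y, 0);
```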
Informative and helper methods
The spec is normative, which means the ES4 code in the spec is normative too. In order to avoid overspecification the spec factors out non-normative sections as methods in the "informative" namespace, which are described by prose. A good example is the global hashcode function:
    ...
    }
    case (x: String) {
        return informative::stringHash(string(x))
    }
    case (x: *) {
        return informative::objectHash(x)
    }
    }
}
Hashing on null, undefined, booleans, and numbers is normatively specified, but hashing on strings and other objects is only informatively specified.
In order to share code, the spec also factors out commonalities as methods in the "helper" namespace. A common case is where both prototype methods and intrinsic methods take a variable number of arguments, as for the concat method in Array:
(In this case the helper function is a static method on the Array class, because it accommodates the static concat method too.)
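The general shape of such a factoring can be sketched in TypeScript. This is not the spec's Array code: it is a hypothetical Bag class, with concatHelper standing in for a method in the helper namespace, intrinsicConcat for the intrinsic method, and a static concat that takes its receiver explicitly. All names and the flattening rule are assumptions made for illustration.

```typescript
// Analogy only: Bag and its methods are made up; this is not Array's code.
class Bag {
  constructor(readonly items: unknown[]) {}

  // Stand-in for a static helper method: the one shared implementation.
  static concatHelper(target: Bag, others: unknown[]): Bag {
    let out = target.items.slice();
    for (const o of others) {
      // Bags are flattened one level; anything else is appended as-is.
      out = out.concat(o instanceof Bag ? o.items : [o]);
    }
    return new Bag(out);
  }

  // Stand-in for the intrinsic method: variable number of arguments.
  intrinsicConcat(...others: unknown[]): Bag {
    return Bag.concatHelper(this, others);
  }

  // Stand-in for the static method: receiver passed explicitly.
  static concat(target: Bag, ...others: unknown[]): Bag {
    return Bag.concatHelper(target, others);
  }
}

const joined = new Bag([1]).intrinsicConcat(2, new Bag([3, 4]));
const joinedStatic = Bag.concat(new Bag([1]), 2);
```

Both entry points collect their variable arguments into an array and hand it to the helper, so the behaviour is written once.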
Meta-level methods
The predefined namespace "meta" is used for methods that participate in language-level protocols: invocation and property access and update. A class that defines meta::invoke is callable as a function (the meta::invoke method is invoked in response to the call); the meta::get, meta::set, meta::has, and meta::delete methods are invoked in response to accesses to non-fixture properties on the object.
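The property-access half of this protocol resembles Proxy traps in later versions of ECMAScript, and can be approximated in TypeScript on that basis. The sketch below is an analogy under stated assumptions, not ES4 semantics: the own properties of the object passed in are treated as fixtures, everything else is routed through trap functions standing in for meta::get, meta::set, meta::has, and meta::delete, and a log records which hook ran. The names withMetaHooks, extras, and hookLog are hypothetical.

```typescript
// Analogy only: Proxy traps stand in for meta::get/set/has/delete.
function withMetaHooks(
  fixtures: Record<string, unknown>,
  log: string[]
): Record<string, unknown> {
  // Storage for the non-fixture properties.
  const extras: Record<string, unknown> = {};

  const isFixture = (key: string | symbol): boolean =>
    Object.prototype.hasOwnProperty.call(fixtures, key);

  return new Proxy(fixtures, {
    get(target, key) {
      if (isFixture(key)) return target[key as string];
      log.push("get " + String(key));
      return extras[String(key)];
    },
    set(target, key, value) {
      if (isFixture(key)) {
        target[key as string] = value;
        return true;
      }
      log.push("set " + String(key));
      extras[String(key)] = value;
      return true;
    },
    has(_target, key) {
      if (isFixture(key)) return true;
      log.push("has " + String(key));
      return String(key) in extras;
    },
    deleteProperty(_target, key) {
      if (isFixture(key)) return false; // fixtures are never removed
      log.push("delete " + String(key));
      return delete extras[String(key)];
    },
  });
}

const hookLog: string[] = [];
const obj = withMetaHooks({ x: 1 }, hookLog);

obj["x"] = 2;                         // fixture: no hook runs
obj["color"] = "red";                 // non-fixture: set hook
const color = obj["color"];           // get hook
const hasColor = "color" in obj;      // has hook
delete obj["color"];                  // delete hook
const hasColorAfter = "color" in obj; // has hook again
```

Accesses to x never reach the hooks, which mirrors the statement that the meta methods only see non-fixture properties.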
Other aspects of the language will hopefully become clear as things move along. Do ask.