suggestion: mapping symbolic infix ops to binary functions

# Claus Reinke (12 years ago)

my thread starter email seems to have disappeared (see below), perhaps this reply will get through. The topic is still user-defined infix operations: in the short term, to experiment with proposed language extensions, in the long term, to shift this topic from tc39 to library authors.

I've now created a gist with an implementation sketch. It uses a modified esprima to parse infix operations, translating to Operator- applications on the fly, then uses escodegen to generate source from the AST, and optionally evaluates the result (using a little node wrapper):

map symbolic infix ops '(x .<op> y)' to 'Operation[".<op>"](x,y)'
https://gist.github.com/3157251

Missing features include precedence and associativity (in this sketch, all infix ops are left-associative between 'logical or' and 'conditional expression'; also, all infix ops start with a '.', followed by a sequence of '.<>=!+-*%&|^/~?:#'). Still, what is left is sufficient to play with

the examples, as shown in this sample source file

https://gist.github.com/3157251#file_sample.js

$ git clone git://gist.github.com/3157251.git
$ cd 3157251/
$ node infix.js sample.js --eval
hi
my name is obj
undefined
{ x: 1, y: 0 }

Hope you like it, Claus


From: "Claus Reinke" <..>

Sent: Wednesday, July 11, 2012 10:01 PM To: <es-discuss at ..>

Subject: suggestion: mapping symbolic infix ops to binary functions

# Brendan Eich (12 years ago)

We've already been looking at value types, proxies, and objects for a while. The latest, strawman:value_objects, does not expose user-defined operators, to separate concerns and limit scope, but user operators are doable.

Just for the record, not to say experiments aren't helpful: we aren't going to use mangled names starting with dot, or clump user-defined operators at left-associative low precedence. Those are anti-goals, really.

The prototype in SpiderMonkey of int64 and uint64 value objects is tracked by

bugzilla.mozilla.org/show_bug.cgi?id=749786

(I need to put up a rebased patch -- will do early this coming week).

The dispatch mechanism for dyadic operators is due to Christian Plesner Hansen, as I mentioned here:

brendaneich.com/2012/06/recent-talks-fluent-txjs-2012

Look for UPDATE:. (Still awaiting the txjs 2012 video.)

The dispatch mechanism for dyadic operators is equivalent to multimethods, but cacheable in modern JITs, and modular in the sense that double-dispatch is not. This was proposed by Christian Plesner Hansen here in 2009:

esdiscuss/2009-June/009603

From the patch for bug 749786:

When executing the '+' operator in 'A + B' where A and B refer to value
objects, do the following:

 1. Get the value of property LOP_PLUS in A, call the result P
 2. If P is not a list, throw a TypeError: no '+' operator
 3. Get the value of property ROP_PLUS in B, call the result Q
 4. If Q is not a list, throw a TypeError: no '+' operator
 5. Intersect the lists P and Q, call the result R
 6. If R is empty throw, a TypeError: no '+' operator
 7. If R has more than one element, throw a TypeError: ambiguity
 8. If R[0], call it F, is not a function, throw a TypeError
 9. Evaluate F(A, B) and return the result

Rather than use JS-observable identifiers to label operator 

handedness as in Christian's proposal ('this+', '+this'), we use SpecialId variants that cannot be named in-language or observed by proxies.

To support operator overloading in-language, we need only provide an API
similar to the one Christian proposed:

 function addPointAndNumber(a, b) {
   return new Point(a.x + b, a.y + b);
 }

 Function.defineOperator('+', addPointAndNumber, Point, Number);

 function addNumberAndPoint(a, b) {
   return new Point(a + b.x, a + b.y);
 }

 Function.defineOperator('+', addNumberAndPoint, Number, Point);

 function addPoints(a, b) {
   return new Point(a.x + b.x, a.y + b.y);
 }

 Function.defineOperator('+', addPoints, Point, Point);

I plan to prototype the Function.defineOperator API too, after first writing it up as a strawman and discussing it with TC39ers.

# Claus Reinke (12 years ago)

We've already been looking at value types, proxies, and objects for a while. The latest, strawman:value_objects, does not expose user-defined operators, to separate concerns and limit scope, but user operators are doable.

I know, but previous strawmen tended to focus on overloading the existing operators, not on user operators. At the same time, new infix operators keep getting proposed.

As far as I can tell, the issues are complementary: 1 how to define new operators 2 how to overload existing operators

Functional style wants 1, object-oriented style wants 2. But no matter which of the two is used as basis, both points are needed.

Also, the 3rd example in sample.js shows that there isn't always an object to hang the operator on, which is why I introduce a separate object for operators.

Just for the record, not to say experiments aren't helpful: we aren't going to use mangled names starting with dot, or clump user-defined operators at left-associative low precedence. Those are anti-goals, really.

Of course - hence the TODO list at the top of

https://gist.github.com/3157251#file_infix.js

I just want to gauge interest before investing more time/work - it is a sketch, not a blueprint, and I listed the missing features.

I posted the sketch because, even in its impoverished form, it allows for useful experimentation (many of the infix op proposals posted here could be prototyped, as could the overloading resolution algorithm you explained). Not having to hack any lexers/parser themselves might encourage people to experiment. Smalltalk's infix 'ops' have equal precedence, left associative, don't they?

The dot prefix was introduced to limit the impact on the lexer/ tokenizer, and it also helps to avoid getting in conflict with the many symbolic syntax proposals on es-discuss (eg, I couldn't remember whether '#' was used for one of the short function proposals or one of the object/class literal ones, or which other symbols are taken for which active/deferred/rejected strawmen). It makes user operators immediately distinguishable from language operators.

The dispatch mechanism for dyadic operators is equivalent to multimethods, but cacheable in modern JITs, and modular in the sense that double-dispatch is not. This was proposed by Christian Plesner Hansen here in 2009:

esdiscuss/2009-June/009603

From the patch for bug 749786: .. operator overloading resolution algorithm ..

Interesting, but see 1 below for an issue with this.

To support operator overloading in-language, we need only provide an API similar to the one Christian proposed: .. Function.defineOperator('+', addPointAndNumber, Point, Number);

Thanks, that is exactly the part I'm interested in:

  1. how would this work for 'obj .? prop' (select 'prop' if 'obj' is defined, return 'undefined' otherwise)?

    Function.defineOperator('.?', (obj,prop)=>obj[prop], Object, String); Function.defineOperator('.?', _=>undefined, undefined, String);

    It seems the dispatcher needs adjustment for cases like this, as there may be no left operand object on which to store the LOP_PLUS type list. Or am I missing something about value objects?

  2. how would associativity/precedence be specified?

    associativity can simply be some string, similar to the Prolog or Haskell variations (xfy/yfx or infixl/infixr)

    precedence is traditionally done by numbers (Prolog,Haskell) but absolute numbering tends to be awkward, so relative precedence ('.?' same as '.', '*' before '+') would be better

    Since the multimethod-style API spreads out operator definitions over pairs of types, one would probably need a separate API call:

    Function.declareOperator('.?',{associate:'left',precedence:'.'});

I plan to prototype the Function.defineOperator API too, after first writing it up as a strawman and discussing it with TC39ers.

Looking forward to the API strawman. It would be preferable to sort this out once and for all, before adding any more infix operators to the language spec.

Perhaps my infix.js could help to prototype API alternatives, and towards a polyfill for older JS implementations?

Claus

# Brendan Eich (12 years ago)

Claus Reinke wrote:

At the same time, new infix operators keep getting proposed.

Where?

# Claus Reinke (12 years ago)

At the same time, new infix operators keep getting proposed.

Where?

Here on es-discuss, and on the wiki?

symbolic:

<|, ??, ??=, ?., .=, .{

named:

modulo, div, divmod, has, extends

I've probably missed some - there seems to be a steady trickle of such proposals.

Claus

# Brendan Eich (12 years ago)

Claus Reinke wrote:

At the same time, new infix operators keep getting proposed.

Where?

Here on es-discuss, and on the wiki? symbolic:

<|, ??, ??=, ?., .=, .{

Ok, but <| died due to grawlix/typography and (mainly) proto. And .{ faced many objections and does not fit into any eager evaluation calling convention.

?? and ??= are short-circuiting, they do not fit into a eager evaluation calling convention.

named:

modulo, div, divmod, has, extends

These are much better as methods. Polyfillable, but also not subject to weird line terminator restrictions on the left. Same arguments killed is/isnt.

I've probably missed some - there seems to be a steady trickle of such proposals.

I think the trickle has been unsteady and it is trailing off -- and for strong reasons.

# Brendan Eich (12 years ago)

Brendan Eich wrote:

I think the trickle has been unsteady and it is trailing off -- and for strong reasons.

In contrast, the demand for user-defined operators (the usual operators, and without breaking important identities) remains strong and is poised to get stronger with binary data and value objects.

# Claus Reinke (12 years ago)

?? and ??= are short-circuiting, they do not fit into a eager evaluation calling convention.

Actually, that isn't even the worst issue. Since operators aren't first class, the desugaring could wrap arguments in (arrow) functions at the operator call sites (I've got a variant of infix.js that does this, and it can handle ??).

Unless we can limit that to non-strict argument positions, the inconvenience of forcing evaluation (exactly once) spreads to all operator definitions, though.

However, that does not help with defining assignment operators, because I don't know how to pass a Reference to or from a function.

One solution would be to inline the operator definitions at the call sites. But that would turn a simple desugaring into a non-trivial transformation. And we'd be partway towards macros..

The other solution would be a way to pass References around, but the last time I raised the issue of reifying References I was told that some engines could not easily support that, iirc.

Are macros really the only way to get around JS's peculiarities when trying for user-defined operators and control structures? And how far away are they?

Claus

# Brendan Eich (12 years ago)

Claus Reinke wrote:

One solution would be to inline the operator definitions at the call sites. But that would turn a simple desugaring into a non-trivial transformation. And we'd be partway towards macros..

No such state (partway pregnant).

The other solution would be a way to pass References around, but the last time I raised the issue of reifying References I was told that some engines could not easily support that, iirc.

No engine reifies References exactly as in the spec, to my knowledge. Too expensive.

Rather, d.items(i) = ... must parse but throw in non-legacy-IE implementations. SpiderMonkey used to support built-in methods returning (obj, id) pairs, which combined with the lvalue special forms worked at runtime with a dynamic check (for built-ins that returned only single value instead of a pair).

Anyway, as you say this is a dead end. Allen is working to remove the lvalue special forms and freedom for host objects to return References from the spec.

Are macros really the only way to get around JS's peculiarities when trying for user-defined operators and control structures? And how far away are they?

Hard to say. Let's get modules done first.

# Claus Reinke (12 years ago)

Are macros really the only way to get around JS's peculiarities when trying for user-defined operators and control structures?

Argh - I just noticed that even macros won't be sufficient.

Assuming we had function-like definitions that inlined/expanded their bodies at their call sites, textually inserting their arguments. That would be sufficient

  • for short-circuiting operators '??'
  • for simple assignment operators '='

but would not be sufficient

  • for combinations of short-circuiting and assignment '??=' lhs ??= default_value; // lhs = lhs ?? default_value

At least, I can't think of a way to avoid reevaluating the left hand side operand - if I assign it to a temporary variable, it no longer is a valid left hand side for an assignment.

So, unless I'm mistaken, we will need to deal with References at some point, or we won't be able to express in user-code what we can express in ES-spec language.

Claus

# Brendan Eich (12 years ago)

Claus Reinke wrote:

but would not be sufficient

  • for combinations of short-circuiting and assignment '??=' lhs ??= default_value; // lhs = lhs ?? default_value

No, sufficiently powerful macros can handle these too, without References.

At least, I can't think of a way to avoid reevaluating the left hand side operand - if I assign it to a temporary variable, it no longer is a valid left hand side for an assignment.

The trick is to specialize as all competitive JS engines do today to avoid references:

id op= val ~~~> id = id op val obj.id op= val ~~~> do { let $t = obj; $t.id = $t.id op val } obj[x] op= val ~~~> do { let $t = obj, $u = x; $t[$u] = $t[$u] op val }

assuming op handles its own short-circuiting (or not). Longer macro expansion using && || ?: or if-else as needed.

The macro system just needs to be able to specialize on the various legal lvalue special forms.

# Claus Reinke (12 years ago)

At least, I can't think of a way to avoid reevaluating the left hand side operand - if I assign it to a temporary variable, it no longer is a valid left hand side for an assignment.

The trick is to specialize as all competitive JS engines do today to avoid references:

id op= val ~~~> id = id op val obj.id op= val ~~~> do { let $t = obj; $t.id = $t.id op val } obj[x] op= val ~~~> do { let $t = obj, $u = x; $t[$u] = $t[$u] op val } .. The macro system just needs to be able to specialize on the various legal lvalue special forms.

Nice! Obvious, in hindsight, but somehow I never thought of that.

Thanks, Claus