Any blogs to help orient me re the reference implementation? contrasting with F#

# ToolmakerSteve98 (18 years ago)

What I know so far is that it is written in ML. And it implements the current state of the [spec that I can't see much info on, as mentioned in previous post].

Wondering if there is any info beyond the source itself that would give me a "big picture" view of what is there, before I start wandering around in the depths of the code.

I'm especially interested in the parser. Maybe will be obvious once I start looking, but might as well learn what I can the easy way. Bottom-up? top-down? grammar notation used? Based on some pre-existing parser combinator library?

My immediate goal is to attempt to translate from the preliminary ES4 into F#, using an extended-PEG metacompiler similar to Rats!. I want to get a grasp on precisely where this dynamic language deviates in semantics from a statically typed language using type inference. Well, that will depend on test sources fed in, but attempting this translation will tell me a lot.

~TMSteve

# Graydon Hoare (18 years ago)

ToolmakerSteve98 wrote:

> Wondering if there is any info beyond the source itself that would give me a "big picture" view of what is there, before I start wandering around in the depths of the code.

Not a lot, though I've posted some notes to this list in the past; I'm happy to discuss it further. It wasn't clear after the last round of discussion on this topic whether new readers would be better served by "abstract" notes on implementing the language -- implementation-language agnostic -- rather than an explicit tour guide for the ML implementation. But I can provide pointers on the latter if you like.

> I'm especially interested in the parser. Maybe will be obvious once I start looking, but might as well learn what I can the easy way. Bottom-up? top-down? grammar notation used? Based on some pre-existing parser combinator library?

It's a hand-rolled Unicode lexer that produces token lists, and a top-down parser over those token lists, with variable lookahead and a minor bit of feedback from the parser to the lexer (for mode switching on tokens like /, which contextually means either "leading edge of a regular expression" or "lone division operator"). No grammar formalism is used in the code, though it's quite regular in structure. For the most part, it uses ML data pattern matching on the current list head.
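To illustrate the `/` mode switch described above, here is a minimal Python sketch. It is not taken from the ES4 reference implementation: the token patterns, the `tokenize` function, and the `regex_allowed` flag are all hypothetical, and the flag is derived from the previous token as a stand-in for real feedback from a parser.

```python
import re

# Hypothetical token patterns; the real ES4 lexer is hand-rolled and far richer.
_NUMBER = re.compile(r"\d+")
_IDENT = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
_REGEX = re.compile(r"/(?:[^/\\\n]|\\.)+/[a-z]*")
_PUNCT = re.compile(r"[=+\-*/();]")


def tokenize(src):
    """Return a list of (kind, text) tokens.

    `regex_allowed` stands in for the parser-to-lexer feedback: it is True
    wherever an operand may start, so a `/` there opens a regex literal;
    elsewhere a `/` is the division operator.
    """
    tokens = []
    pos = 0
    regex_allowed = True  # at the start of input an operand is expected
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
            continue
        if src[pos] == "/" and regex_allowed:
            m = _REGEX.match(src, pos)
            if m is None:
                raise SyntaxError(f"unterminated regex at offset {pos}")
            kind = "regex"
        else:
            for kind, pattern in (("number", _NUMBER),
                                  ("ident", _IDENT),
                                  ("punct", _PUNCT)):
                m = pattern.match(src, pos)
                if m:
                    break
            else:
                raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((kind, m.group()))
        pos = m.end()
        # After an operand or a closing paren, the next `/` must be division.
        is_operand = kind in ("number", "ident", "regex")
        regex_allowed = not (is_operand or m.group() == ")")
    return tokens
```

With this sketch, `tokenize("a / b")` treats the slash as punctuation, while `tokenize("x = /ab+c/g")` yields a single regex token for `/ab+c/g`. A real implementation would let the parser set the mode, since the previous token alone is not always enough to decide.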

There is a companion Excel spreadsheet kept in the source repository that documents the grammar in BNF. There is no mechanical connection between that spreadsheet and the ML code. They are kept in sync by hand. Jeff Dyer wrote much of the parser.

> My immediate goal is to attempt to translate from the preliminary ES4 into F#, using an extended-PEG metacompiler similar to Rats!. I want to get a grasp on precisely where this dynamic language deviates in semantics from a statically typed language using type inference. Well, that will depend on test sources fed in, but attempting this translation will tell me a lot.

Good luck! I am not sure how far the similarity will carry, but it sounds like an interesting experiment.

# Jeff Dyer (18 years ago)

On 3/19/08 2:40 PM, ToolmakerSteve98 wrote:

> I'm especially interested in the parser. Maybe will be obvious once I start looking, but might as well learn what I can the easy way. Bottom-up? top-down? grammar notation used? Based on some pre-existing parser combinator library?

Take a look at the grammar Graydon just mentioned at:

www.ecmascript.org/es4/spec/grammar.pdf

You'll see a pretty obvious mapping to a top-down predictive parser in two implementations:

1. The ES4-RI in SML
2. The tamarin-central front-end in ES4 (hg.mozilla.org/tamarin-central/?file/fbd209c1fe58/esc/src/parse.es)

That should get you started.

# Michael Daumling (18 years ago)

Are the XML elements in that grammar just an oversight, or is this a placeholder for future implementations of ECMA-357? E4X is not supposed to be part of ES4.

Michael

# Lars Hansen (18 years ago)

We have set aside placeholders for E4X syntax. How useful this is I don't know; the experience with reserving future reserved words in ES3 has been mostly negative (as a rule programmers don't read specs, and when they do, as a rule they ignore the "may be used in the future" clauses -- and I don't think they're wrong in doing so).

--lars

# Brendan Eich (18 years ago)

On Mar 20, 2008, at 8:54 AM, Lars Hansen wrote:

> We have set aside placeholders for E4X syntax. How useful this is I don't know; the experience with reserving future reserved words in ES3 has been mostly negative (as a rule programmers don't read specs, and when they do, as a rule they ignore the "may be used in the future" clauses -- and I don't think they're wrong in doing so).

Generally I agree. Just for the record, the "future reserved words"
go back to ES1 and were prefigured by Netscape's reserving all then-reserved Java identifiers. This was done with agreement and a great
deal of spec-writing leadership by Microsoft, but the JScript engine
nevertheless reserved only class, enum, extends, and super (if memory
serves).

Over time, as Netscape went into decline, content grew to use
identifiers such as 'char'. And of course we are contextually
unreserving in ES4 (Firefox 2 / JS1.7 already does this), but it
wouldn't help the 'char' case I recall, where the identifier was a
parameter name.

This tale cautions about several things other than trying to reserve
future syntax, among them the participants in the standard not
following through early in their own products, before conflicts in
the market could emerge.