tooling: HTML & ecmarkup versions of the spec

# Michael Dyck (9 years ago)

In the minutes for July 28, Rick Waldron wrote:

9 Tooling Updates

Ecmarkup (Emu)

  • [...]
  • Michael Dyck now maintaining es-spec-html, working on high-fidelity emu output
  • [...]

To clarify...

Back in mid-April, I volunteered to maintain the HTML version of the ES spec, taking over from Jason Orendorff. At the time, I knew about ecmarkup, but there was some uncertainty (at least from my vantage point) about how soon it might be used to maintain the spec. So there was some possibility that the next version of the ES spec might be prepared in MS Word, and thus that Jason's es-spec-html converter would be used to create the HTML version. However, the likelihood of that possibility seemed to decline fairly soon thereafter. I'm not sure it's at zero yet, but it seems pretty close.

So, no, I haven't done any maintenance on es-spec-html, and I don't expect I'll do any unless there's a good chance that it'll actually be used again.

About six weeks ago, I started work on a "high fidelity" ecmarkup version of the ES6 spec. The quote from the minutes suggests that I was doing so by modifying es-spec-html to generate ecmarkup rather than HTML. I suppose I could have done it that way, but it seems more complex than necessary. Instead, I started with the HTML version of the spec, and converted it into ecmarkup ('fixing' some bugs and inconsistencies along the way). Once I had that, I wrote a script to convert it back into HTML, and iterated on the pipeline until the pre- and post- HTML differed only negligibly. I reached that point today.
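For illustration only, a round-trip check of this kind could be sketched as below. This is not the script described above: the function names `normalize` and `roundtrip_diff` are hypothetical, and the sketch assumes that "negligible" differences are limited to whitespace between and around tags.

```python
import difflib
import re


def normalize(html: str) -> list[str]:
    """Split HTML into one tag or text chunk per line.

    Assumption: whitespace runs, and whitespace adjacent to tags,
    count as negligible differences and are dropped.
    """
    collapsed = re.sub(r"\s+", " ", html).strip()
    # Put every tag on its own line so that diffs stay local.
    chunks = re.sub(r"\s*(<[^>]+>)\s*", r"\n\1\n", collapsed).split("\n")
    return [chunk for chunk in chunks if chunk]


def roundtrip_diff(original_html: str, regenerated_html: str) -> list[str]:
    """Return unified-diff lines between the pre- and post-conversion HTML."""
    return list(
        difflib.unified_diff(
            normalize(original_html),
            normalize(regenerated_html),
            fromfile="original",
            tofile="regenerated",
            lineterm="",
        )
    )
```

An empty result means the two documents agree under that normalization; anything left over is a candidate bug in either direction of the conversion.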

If you're interested, you can see the results here: jmdyck/es-spec-emu

Note that ecmarkup & ecmarkdown have evolved somewhat since I started, and are still evolving, so things will still need some work, but I figure it's in a good enough state for people to look at.

# Allen Wirfs-Brock (9 years ago)

On Aug 7, 2015, at 9:03 PM, Michael Dyck wrote:

In the minutes for July 28, Rick Waldron wrote:

9 Tooling Updates

Ecmarkup (Emu)

  • [...]
  • Michael Dyck now maintaining es-spec-html, working on high-fidelity emu output
  • [...]

To clarify...

Back in mid-April, I volunteered to maintain the HTML version of the ES spec, taking over from Jason Orendorff. At the time, I knew about ecmarkup, but there was some uncertainty (at least from my vantage point) about how soon it might be used to maintain the spec. So there was some possibility that the next version of the ES spec might be prepared in MS Word, and thus that Jason's es-spec-html converter would be used to create the HTML version. However, the likelihood of that possibility seemed to decline fairly soon thereafter. I'm not sure it's at zero yet, but it seems pretty close.

So, no, I haven't done any maintenance on es-spec-html, and I don't expect I'll do any unless there's a good chance that it'll actually be used again.

Note that I made a number of significant changes to the "es-spec-html" converter in order to create the official release version of the ES6 spec (ecma-international.org/ecma-262/6.0). These changes brought the formatting of the HTML version much closer to that of the PDF version. The updated "es-spec-html" is included in the repository tc39/ecma262-6-src.

Note that in addition to making changes to the Python program and the CSS, I also added some new styles to the Word source document that the converter now explicitly keys off of. Jason had previously avoided asking for changes to the Word doc, but that really was unnecessary. Some things are simply much easier to convert if they are explicitly tagged using the styles in the Word source.

There is also now a bugzilla component for ticketing rendering bugs in the HTML version of the ES6 spec: bugs.ecmascript.org/buglist.cgi?product=ECMA-262 Edition 6&component=html rendering issues&resolution=--- I encourage anybody who sees glitches in the HTML version to report the bugs there.

Whether ES2016 will be based off of the Word source will ultimately be up to the new editor. But, given the relatively small set of changes/additions expected and the relatively short time available to make them, I won't be surprised if that edition continues to be Word-based.

About six weeks ago, I started work on a "high fidelity" ecmarkup version of the ES6 spec. The quote from the minutes suggests that I was doing so by modifying es-spec-html to generate ecmarkup rather than HTML. I suppose I could have done it that way, but it seems more complex than necessary. Instead, I started with the HTML version of the spec, and converted it into ecmarkup ('fixing' some bugs and inconsistencies along the way). Once I had that, I wrote a script to convert it back into HTML, and iterated on the pipeline until the pre- and post- HTML differed only negligibly. I reached that point today.

Of course, what we really want is for the post-HTML to only negligibly differ from the PDF (which is the definitive version).

If you're interested, you can see the results here: jmdyck/es-spec-emu

Note that ecmarkup & ecmarkdown have evolved somewhat since I started, and are still evolving, so things will still need some work, but I figure it's in a good enough state for people to look at.

BTW, another concern is that there must be a way to produce a high-quality, paginated, printable version (i.e., PDF) from the source document.

# Michael Dyck (9 years ago)

On 15-08-10 03:14 PM, Allen Wirfs-Brock wrote:

There is also now a bugzilla component for ticketing rendering bugs in the HTML version of the ES6 spec: bugs.ecmascript.org/buglist.cgi?product=ECMA-262 Edition 6&component=html rendering issues&resolution=--- I encourage anybody who sees glitches in the HTML version to report the bugs there.

While converting the HTML spec into ecmarkup, I found roughly 100 glitches (depending on what and how you count), but I'm disinclined to report them, given that: (a) like I said, I'm doubtful that es-spec-html will ever be used again, and (b) if it is, I'd supposedly be the one fixing it, so creating bugzilla entries would just be making extra work for myself.

If, instead, the next spec is generated from the ecmarkup doc, then all those glitches are already fixed.

Of course, what we really want is for the post-HTML to only negligibly differ from the PDF (which is the definitive version).

Indeed, but I think it'd be tough to come up with an automated means of usefully reporting the differences between the two. Any ideas?

I have an XML version of the ES6 spec that I derived from the official PDF, so I have plans to convert that into ecmarkup, as another check, this one independent of the docx+es-spec-html path.

# Allen Wirfs-Brock (9 years ago)

On Aug 10, 2015, at 6:03 PM, Michael Dyck wrote:

On 15-08-10 03:14 PM, Allen Wirfs-Brock wrote:

There is also now a bugzilla component for ticketing rendering bugs in the HTML version of the ES6 spec: bugs.ecmascript.org/buglist.cgi?product=ECMA-262 Edition 6&component=html rendering issues&resolution=--- I encourage anybody who sees glitches in the HTML version to report the bugs there.

While converting the HTML spec into ecmarkup, I found roughly 100 glitches (depending on what and how you count), but I'm disinclined to report them, given that:

Have you checked them against ecma-international.org/ecma-262/6.0 ? It would be useful to know how many I have already fixed and also additional things I missed.

(a) like I said, I'm doubtful that es-spec-html will ever be used again, and (b) if it is, I'd supposedly be the one fixing it, so creating bugzilla entries would just be making extra work for myself.

We can make formatting corrections to ecma-international.org/ecma-262/6.0 if there are significant deviations from the PDF. Your list sounds like it would be useful in auditing for that.

It isn't necessary to submit 100 bugzilla tickets. Any reasonable format of a list that can be checked against the document would be useful. If you want, you can just send it to me and Brian.

If, instead, the next spec is generated from the ecmarkup doc, then all those glitches are already fixed.

Of course, what we really want is for the post-HTML to only negligibly differ from the PDF (which is the definitive version).

Indeed, but I think it'd be tough to come up with an automated means of usefully reporting the differences between the two. Any ideas?

Postscript level comparisons of renderings after pagination normalization? Doesn't sound easy...

# Michael Dyck (9 years ago)

On 15-08-11 11:47 AM, Allen Wirfs-Brock wrote:

On Aug 10, 2015, at 6:03 PM, Michael Dyck wrote:

While converting the HTML spec into ecmarkup, I found roughly 100 glitches (depending on what and how you count),

Have you checked them against ecma-international.org/ecma-262/6.0 ? It would be useful to know how many I have already fixed and also additional things I missed.

I just fetched what's currently at that URL, and it's identical to the version that I've been working with. It looks like I got that on June 29, so if you've fixed anything since then, I'm not seeing it for some reason.

but I'm disinclined to report them [...]

We can make formatting corrections to ecma-international.org/ecma-262/6.0 if there are significant deviations from the PDF.

Ah, okay. In that case, I can see the use in reporting the glitches.

It isn't necessary to submit 100 bugzilla tickets. Any reasonable format of a list that can be checked against the document would be useful. If you want, you can just send it to me and Brian.

Right now, it's a sequence of pattern/replacement tweaks, with some comments, so it'd take a bit of work to get it into a reasonable format.
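As a rough sketch of what that conversion might involve, the code below turns a list of pattern/replacement tweaks into a line-oriented report that can be checked against the document. The `Tweak` record and the `report` function are hypothetical; the actual tweak format is not shown in this thread, so regular-expression patterns are assumed here.

```python
import re
from typing import NamedTuple


class Tweak(NamedTuple):
    """One hypothetical tweak: a regex, its replacement template, a comment."""
    pattern: str
    replacement: str
    comment: str


def report(tweaks: list[Tweak], text: str) -> list[str]:
    """List every place a tweak would fire, with its line number in `text`."""
    lines = []
    for tweak in tweaks:
        for match in re.finditer(tweak.pattern, text):
            # Line numbers are 1-based: count newlines before the match.
            line_no = text.count("\n", 0, match.start()) + 1
            fixed = match.expand(tweak.replacement)
            lines.append(
                f"line {line_no}: {match.group(0)!r} -> {fixed!r}  ({tweak.comment})"
            )
    return lines
```

Each output line names the location, the text found, and the proposed correction, which should be enough to locate the approximate problem area by hand.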

# Allen Wirfs-Brock (9 years ago)

On Aug 11, 2015, at 10:26 AM, Michael Dyck wrote:

On 15-08-11 11:47 AM, Allen Wirfs-Brock wrote:

On Aug 10, 2015, at 6:03 PM, Michael Dyck wrote:

While converting the HTML spec into ecmarkup, I found roughly 100 glitches (depending on what and how you count),

Have you checked them against ecma-international.org/ecma-262/6.0 ? It would be useful to know how many I have already fixed and also additional things I missed.

I just fetched what's currently at that URL, and it's identical to the version that I've been working with. It looks like I got that on June 29, so if you've fixed anything since then, I'm not seeing it for some reason.

No, that's still current. I misread your original post. I thought you had said that you were working against an HTML version you had generated using es-spec-html.

but I'm disinclined to report them [...]

We can make formatting corrections to ecma-international.org/ecma-262/6.0 if there are significant deviations from the PDF.

Ah, okay. In that case, I can see the use in reporting the glitches.

It isn't necessary to submit 100 bugzilla tickets. Any reasonable format of a list that can be checked against the document would be useful. If you want, you can just send it to me and Brian.

Right now, it's a sequence of pattern/replacement tweaks, with some comments, so it'd take a bit of work to get it into a reasonable format.

Anything that leads to an approximate problem area would be helpful.