Quoted map keys (was: Re: JSON5)

# Carsten Bormann (7 years ago)

On Jul 11, 2017, at 06:14, J Decker <d3ck0r at gmail.com> wrote:

Why does JSON have quoted field names anyway (which I could understand if they included spaces).

Douglas Crockford has explained this bit of history in talks about JSON:

Originally, they weren’t quoting map keys (names of object members) if they looked like identifiers in JavaScript. (In early JavaScript-based implementations, JSON data was directly fed as code into the JavaScript interpreter so there was no need to write a decoder.)

But then some application was using “do” as a map key. “do” happens to be a reserved word in JavaScript, breaking the decoding process. So they had to check if the map keys were reserved words and quote them in that case. That set then would have to become part of the JSON specification. Worse, the set of reserved words in JavaScript could (theoretically) change, so either the JSON specification would need to change, too, or the next version of JavaScript would no longer support direct use of JSON as JavaScript code.

So they decided to simply always quote, and that was that.
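The conditional-quoting rule that was abandoned can be sketched roughly as follows. The keyword list is deliberately partial, the identifier check is ASCII-only, and the names `mustQuoteKey` and `emitMember` are made up for this illustration; none of it is taken from an actual early JSON encoder.

```javascript
// Partial, illustrative subset of the ES3 keywords (assumption: not the full set).
const SOME_ES3_KEYWORDS = new Set([
  'do', 'if', 'in', 'for', 'new', 'var', 'while', 'function', 'return',
]);

// ASCII-only simplification of "looks like a JavaScript identifier".
const IDENTIFIER_LIKE = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// True when the key could not be written bare under the conditional rule.
function mustQuoteKey(key) {
  return !IDENTIFIER_LIKE.test(key) || SOME_ES3_KEYWORDS.has(key);
}

// Emit one object member, quoting the name only when required.
function emitMember(key, value) {
  const name = mustQuoteKey(key) ? JSON.stringify(key) : key;
  return `${name}:${JSON.stringify(value)}`;
}

console.log(emitMember('name', 1));      // name:1
console.log(emitMember('do', 1));        // "do":1
console.log(emitMember('two words', 1)); // "two words":1
```

Every encoder would have to carry the same keyword set, which is the coupling that always quoting removes.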

The approach to execute JSON as JavaScript code of course is history now, but JSON hasn’t changed back.

Actually, with RFC 7159 and ECMA 404 out and JSON very widely implemented, any proposal to change the JSON syntax is a complete non-starter.

(Most of these proposals come from people who notice that JSON is bad for conversing about data or for human input. Well, that is not what JSON is meant for. DO NOT USE JSON FOR CONVERSING ABOUT DATA OR FOR HUMAN INPUT. JSON is an interchange format. There are much better formats for humans inputting and conversing about JSON-modeled data, such as YAML, which is even a superset of JSON. No point in messing around with JSON if the problem has already been solved.)

Of course, if saving bytes is your objective, you might want to look at CBOR. I wonder when that will be picked up by the JavaScript spec (there are libraries, of course).
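To give a feel for the byte savings, here is a tiny hand-rolled encoder covering only a sliver of CBOR (integers 0 to 23, short text strings, small maps), following the initial-byte layout of the CBOR specification (RFC 7049). `encodeTinyCbor` is a hypothetical name for this sketch, and it assumes a runtime with a global `TextEncoder`; real code should use an existing CBOR library.

```javascript
// Minimal CBOR sketch: integers 0..23, text strings under 24 UTF-8 bytes,
// and maps with fewer than 24 entries. Everything else is rejected.
function encodeTinyCbor(value) {
  if (Number.isInteger(value) && value >= 0 && value <= 23) {
    return [value]; // major type 0: the value lives in the initial byte
  }
  if (typeof value === 'string') {
    const bytes = Array.from(new TextEncoder().encode(value));
    if (bytes.length > 23) throw new RangeError('string too long for this sketch');
    return [0x60 | bytes.length, ...bytes]; // major type 3: text string
  }
  if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
    const entries = Object.entries(value);
    if (entries.length > 23) throw new RangeError('map too large for this sketch');
    const out = [0xa0 | entries.length]; // major type 5: map
    for (const [k, v] of entries) {
      out.push(...encodeTinyCbor(k), ...encodeTinyCbor(v));
    }
    return out;
  }
  throw new TypeError('unsupported value in this sketch');
}

const sample = { do: 1 };
const jsonBytes = new TextEncoder().encode(JSON.stringify(sample)).length;
const cborBytes = encodeTinyCbor(sample).length; // bytes: a1 62 64 6f 01

console.log(jsonBytes, cborBytes); // 8 5
```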

Grüße, Carsten

# Don Griffin (7 years ago)

I don't think your "DO NOT USE JSON ... FOR HUMAN INPUT" notion was noticed by ... anyone really :)

Node.js has package.json, other package managers use component.json, editors use settings.json, and some even throw that file into a text editor as the configuration UI. AFAICT this whole notion of making JSON more like JavaScript object notation exists because humans do interact with JSON quite often and want some of the little creature comforts.

JSON5 may be overkill, but it seems unquoted keys and comments (and real dates) would be welcomed by us unfortunate humans. :)

Best, Don

Don Griffin Sr Director of Engineering Sencha, Inc. www.sencha.com

# J Decker (7 years ago)

I didn't get that mail at all; I assume it was a mistaken private response?

On Tue, Jul 11, 2017 at 10:23 PM, Don Griffin <don at sencha.com> wrote:

I don't think your "DO NOT USE JSON ... FOR HUMAN INPUT" notion was noticed by ... anyone really :)

Node.js has package.json, other package managers use component.json, editors use settings.json, and some even throw that file into a text editor as the configuration UI. AFAICT this whole notion of making JSON more like JavaScript object notation exists because humans do interact with JSON quite often and want some of the little creature comforts.

JSON5 may be overkill, but it seems unquoted keys and comments (and real dates) would be welcomed by us unfortunate humans. :)

Best, Don

Don Griffin Sr Director of Engineering Sencha, Inc. www.sencha.com

On Mon, Jul 10, 2017 at 11:33 PM, Carsten Bormann <cabo at tzi.org> wrote:

On Jul 11, 2017, at 06:14, J Decker <d3ck0r at gmail.com> wrote:

Why does JSON have quoted field names anyway (which I could understand if they included spaces).

Douglas Crockford has explained this bit of history in talks about JSON:

Originally, they weren’t quoting map keys (names of object members) if they looked like identifiers in JavaScript. (In early JavaScript-based implementations, JSON data was directly fed as code into the JavaScript interpreter so there was no need to write a decoder.)

But then some application was using “do” as a map key. “do” happens to be a reserved word in JavaScript, breaking the decoding process. So they had to check if the map keys were reserved words and quote them in that case. That set then would have to become part of the JSON specification. Worse, the set of reserved words in JavaScript could (theoretically) change, so either the JSON specification would need to change, too, or the next version of JavaScript would no longer support direct use of JSON as JavaScript code.

I see. Keywords in that direction.

So they decided to simply always quote, and that was that.

The approach to execute JSON as JavaScript code of course is history now, but JSON hasn’t changed back.

Actually, with RFC 7159 and ECMA 404 out and JSON very widely implemented, any proposal to change the JSON syntax is a complete non-starter.

I'm not proposing any change; I was proposing another facility, JSON5, in parallel to the existing JSON. JSON5 is a superset of JSON, so there is no immediate issue with parsing JSON using JSON5, except that it would be impossible to know whether a platform actually has JSON5 support in the 'JSON' namespace. It would be much clearer if it were in a separate namespace entirely. But all of that is about parse(). On the stringify side, if the original JSON were changed, it would start generating output that existing JSON readers could not handle, which sounds like a bad plan too.
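The stringify concern is easy to check against the built-in parser: relaxed-syntax text is rejected by `JSON.parse`, so output in a relaxed dialect would fail in existing readers. A small sketch, with `parsesAsStrictJson` as a made-up helper name:

```javascript
// True when `text` is accepted by the built-in strict JSON parser.
function parsesAsStrictJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

const samples = [
  '{"a": 1}',         // plain JSON
  '{a: 1}',           // unquoted key
  "{'a': 1}",         // single-quoted key
  '{"a": 1,}',        // trailing comma
  '{"a": 1} // note', // comment
];

// Only the first sample is accepted; the relaxed forms all fail.
for (const text of samples) {
  console.log(parsesAsStrictJson(text), text);
}
```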

Well, it appears JSON5 is trying to stay at the ES5 level (some sort of matching identifier there) and as such won't support back-tick (`) quoted strings.

So I don't care so much.

And I don't think trailing commas in arrays behave correctly.

# Carsten Bormann (7 years ago)

On Jul 12, 2017, at 15:01, J Decker <d3ck0r at gmail.com> wrote:

I didn't get that mail at all; I assume it was a mistaken private response?

Most likely, this is just a result of gmail not liking mail from people at universities.

But again, if people want a JSON superset for humans, that (*) has been around for more than a decade, so I don’t know why another one has to be invented every week (except that it is easy to invent one and it is a pleasant alternative to actually working on something useful).

Grüße, Carsten

(*) YAML, in case you wondered. No, it doesn’t matter that it is a different color of bikeshed than you might have imagined; other people don’t like JSON5/HJSON/your-JSON-hack-of-the-week either.

# Michał Wadas (7 years ago)

Having native language support for YAML would be great too...

# Reinis Ivanovs (7 years ago)

YAML is not a viable alternative to JSON for these reasons and more: arp242.net/weblog/yaml_probably_not_so_great_after_all.html

TOML is what's needed, and it would also avoid a JSON/JSON5 confusion.

# Carsten Bormann (7 years ago)

On Jul 12, 2017, at 22:20, Reinis Ivanovs <dabas at untu.ms> wrote:

YAML is not a viable alternative to JSON for these reasons and more: arp242.net/weblog/yaml_probably_not_so_great_after_all.html

YAML isn’t an alternative to JSON, it is a superset of JSON that is useful for humans to use. I’m not saying it is perfect; it is out there and well-implemented, and any new effort would take a long time to be as useful. It can also do a lot more than JSON can do, things you don’t have a need for. Right now.

(I’m not going to spend time commenting on the article you cite: lots of valid observations, mixed with a lack of knowledge of how to work with these traits. I wonder what would happen if the authors of these lines ever got hold of a JavaScript language spec :-)

TOML is what's needed, and it would also avoid a JSON/JSON5 confusion.

TOML is pretty much the only format in the general vicinity of JSON that is starting to get as widely used as YAML already has been for a decade or so. It plays well with the tastes of people who have been weaned on Windows .INI files. It also has some nice ways to flatten out overly complex structure. It is not a superset of JSON, though, so it is kind of off-topic for the current conversation.

(I wouldn’t mind getting native support for both YAML and TOML in a future JavaScript; everyone to their taste.)

Grüße, Carsten

# Isiah Meadows (7 years ago)

My one and only complaint about it is that it's still not officially stable (its version is 0.x), but that's the only reason why I'm not using it for everything.

As for putting it in JS core, I say no. TOML is not a good interchange format, and human-readable data formats (which see endless bikeshedding now) belong as libraries, not in core.

# T.J. Crowder (7 years ago)

On Fri, Jul 14, 2017 at 8:38 AM, Isiah Meadows <isiahmeadows at gmail.com> wrote:

...As for putting it in JS core, I say no. TOML is not a good interchange format, and human-readable data formats...belong as libraries, not in core.

^^ this. Too many conflicting, usage-specific requirements and design constraints.

-- T.J. Crowder

# Reinis Ivanovs (7 years ago)

YAML is an alternative to JSON in that they're both data serialization formats. YAML lost out in this role to XML in the past, to JSON more recently, and will most likely lose to TOML as a config format (it being a special case of data serialization). YAML simply has a bloated spec with too many ways to do things and some misfeatures. It's a terse format, but at the cost of readability. Basically, there's a reason why YAML has been around for so long and has seen relatively limited adoption. Adding YAML to JS is a poor idea.

# J Decker (7 years ago)

TOML is just the INI format. It doesn't even support nested objects very well. If that were so great, it wouldn't have been replaced since the '90s. YAML, with its required spaces and newlines, doesn't make a very good format for protocols and interchange. Neither is better than JSON.

But JSON5 is ES5 and will end up staying that way. Trailing commas in arrays end up making it function differently than JS.
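For reference, this is how JavaScript itself treats trailing and empty commas in array literals, and what strict JSON does with them. The snippet only shows built-in behavior; it makes no claim about what JSON5 or any other parser does.

```javascript
const trailing = [1, 2, ]; // the trailing comma adds no element
const elided = [1, , 3];   // an elision leaves a hole at index 1
const loneComma = [ , ];   // a single elision

console.log(trailing.length);        // 2
console.log(elided.length);          // 3
console.log(1 in elided);            // false (a hole, not a stored undefined)
console.log(loneComma.length);       // 1
console.log(JSON.stringify(elided)); // [1,null,3]

// Strict JSON accepts neither form.
let strictRejectsTrailing = false;
try {
  JSON.parse('[1,2,]');
} catch (err) {
  strictRejectsTrailing = err instanceof SyntaxError;
}
console.log(strictRejectsTrailing);  // true
```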

So I forked it and made JSON6 (npm json-6), not that anyone will use it or even know about it. I will probably end up adding a date format to handle new Date().toISOString(). As a JS implementation it's as fast as Crockford's reference code, with added features, instead of half the speed like JSON5; as a native plugin it is 2x as fast as that. I have to dig into the JSON code in Node to see why it's so fast (still 2.4x faster than my native code).

# Dirk Pranke (7 years ago)

On Fri, Jul 14, 2017 at 6:37 AM, J Decker <d3ck0r at gmail.com> wrote:

TOML is just the INI format. It doesn't even support nested objects very well. If that were so great, it wouldn't have been replaced since the '90s. YAML, with its required spaces and newlines, doesn't make a very good format for protocols and interchange. Neither is better than JSON.

But JSON5 is ES5 and will end up staying that way. Trailing commas in arrays end up making it function differently than JS.

So I forked it and made JSON6 (npm json-6), not that anyone will use it or even know about it. I will probably end up adding a date format to handle new Date().toISOString(). As a JS implementation it's as fast as Crockford's reference code, with added features, instead of half the speed like JSON5; as a native plugin it is 2x as fast as that. I have to dig into the JSON code in Node to see why it's so fast (still 2.4x faster than my native code).

What's the difference between JSON5 and JSON6, language-wise? The only thing that jumped out at me is support for empty commas ([1,,3])?

# Reinis Ivanovs (7 years ago)

TOML standardizes the INI format and adds features like nesting. The INI format is just a convention, so saying TOML is the same thing misses the point. The lack of standardization and of features like nesting also helps explain why XML and JSON have been more popular as config formats. XML has fallen out of favor due to its verbosity, and JSON was intended as a data interchange format, so its use as a config format is basically tech debt.

As mentioned, the problem with 'fixing' JSON is that it doesn't need fixing as a data interchange format, and having multiple versions of JSON will inevitably lead to confusion. Moreover, configs aren't particularly useful in contexts like browsers, so adding a config format to JS probably won't happen.

TOML keeps gaining adoption and hopefully will displace JSON's use as a config format.

# J Decker (7 years ago)

On Fri, Jul 14, 2017 at 9:39 AM, Dirk Pranke <dpranke at chromium.org> wrote:

On Fri, Jul 14, 2017 at 6:37 AM, J Decker <d3ck0r at gmail.com> wrote:

But JSON5 is ES5 and will end up staying that way. Trailing commas in arrays end up making it function differently than JS.

So I forked it and made JSON6 (npm json-6), not that anyone will use it or even know about it. I will probably end up adding a date format to handle new Date().toISOString(). As a JS implementation it's as fast as Crockford's reference code, with added features, instead of half the speed like JSON5; as a native plugin it is 2x as fast as that. I have to dig into the JSON code in Node to see why it's so fast (still 2.4x faster than my native code).

What's the difference between JSON5 and JSON6, language-wise? The only thing that jumped out at me is support for empty commas ([1,,3])?

Back-tick quoted strings (template-string format, without templating features since it would be static data), which allow direct multi-line strings. I extended that (basically removed one break) so any quoted string can contain a binary \n and be multiline, and added the undefined keyword. And it's 2x the speed of JSON5.
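As a point of comparison for the multi-line string extension, strict JSON accepts the two-character escape `\n` inside a string but rejects an actual line feed, while a JavaScript template literal takes the line feed directly. A short check using only built-ins:

```javascript
const escaped = '"line one\\nline two"'; // JSON text using the two-character escape \n
const raw = '"line one\nline two"';      // JSON text containing an actual line feed

// The escape decodes to the same string a multi-line template literal produces.
const fromTemplate = `line one
line two`;
console.log(JSON.parse(escaped) === fromTemplate); // true

// The raw line feed is a syntax error in strict JSON.
let rawRejected = false;
try {
  JSON.parse(raw);
} catch (err) {
  rawRejected = err instanceof SyntaxError;
}
console.log(rawRejected); // true
```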

In my C version I had originally tracked down the Unicode character definitions and got a list of all characters that are not ID_Start/ID_Continue, to cause errors on property names. I didn't implement that in the JavaScript version. Since any property can be referenced with a quoted name, it doesn't really matter if there are 'bad' identifier characters; the application just has to quote the name to reference it, and it doesn't matter on the object creation side. This then just requires quotes for identifiers that have whitespace, quotes, colon, comma, '{', '}', '[', or ']' in them. (Hmm, or '/' because of comments; I would entertain '#' for comments too.)
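The relaxed quoting rule in the paragraph above could look roughly like this on the output side. This is a sketch of the rule as described, not code from json-6; `formatKey` is a hypothetical name, and treating '/' and the empty key as needing quotes are assumptions made here.

```javascript
// Characters that force quoting: whitespace, the three quote characters,
// colon, comma, braces, brackets, and '/' (assumed, because of comments).
const NEEDS_QUOTES = /[\s"'`:,{}\[\]\/]/;

// Return the key as it would be written in the relaxed dialect.
function formatKey(key) {
  // Assumption: an empty key must be quoted to remain visible.
  if (key === '' || NEEDS_QUOTES.test(key)) {
    return JSON.stringify(key);
  }
  return key; // any other characters are left bare, identifier-like or not
}

console.log(formatKey('naïve'));     // naïve
console.log(formatKey('1st-place')); // 1st-place
console.log(formatKey('two words')); // "two words"
console.log(formatKey('a:b'));       // "a:b"
console.log(formatKey('path/to'));   // "path/to"
```

Compared with the reserved-word approach, nothing here depends on the JavaScript keyword list, so the rule cannot drift when the language changes.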

After posting this, a lot of insights about format exploits have come to me that may or may not matter. It definitely handles well-formed JSON and well-formed input in its own dialect; I'm just not sure whether there are edge cases that could require better error handling. It needs to be revisited to change logging errors into throwing them.

# Mike Samuel (7 years ago)

There seem to be two separable concerns that keep coming up in this thread.

  1. Increasing the value domain that can be serialized & deserialized easily to include e.g. undefined, NaNs, Infinities, dates, timestamps.
  2. More human friendly syntax that is easily deserialized.

Both can be done in library code. A library of composable JSON replacers & revivers would extend the value domain. There are many candidates for the second, but no consensus, though many do have an interface similar to revivers & replacers.
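A minimal sketch of what composable replacers and revivers might look like with the existing `JSON.stringify` and `JSON.parse` hooks. The `$date` and `$num` tags and all function names are invented for this example and do not come from any particular library. It leaves out `undefined`, because a reviver that returns `undefined` causes the property to be deleted.

```javascript
// Chain several replacers or revivers; each sees the previous one's result.
function compose(...fns) {
  return function (key, value) {
    return fns.reduce((current, fn) => fn.call(this, key, current), value);
  };
}

// Date.prototype.toJSON runs before the replacer, so read the holder instead.
function dateReplacer(key, value) {
  const original = this[key];
  if (original instanceof Date && !Number.isNaN(original.getTime())) {
    return { $date: original.toISOString() };
  }
  return value;
}

// NaN and the infinities would otherwise be written as null.
function nonFiniteReplacer(key, value) {
  if (typeof value === 'number' && !Number.isFinite(value)) {
    return { $num: String(value) };
  }
  return value;
}

// True for a plain object whose only key is the given tag.
function isTagged(value, tag) {
  if (value === null || typeof value !== 'object' || Array.isArray(value)) return false;
  const keys = Object.keys(value);
  return keys.length === 1 && keys[0] === tag;
}

const dateReviver = (key, value) =>
  isTagged(value, '$date') ? new Date(value.$date) : value;
const numberReviver = (key, value) =>
  isTagged(value, '$num') ? Number(value.$num) : value;

const replacer = compose(dateReplacer, nonFiniteReplacer);
const reviver = compose(dateReviver, numberReviver);

const text = JSON.stringify({ when: new Date(0), ratio: NaN, max: -Infinity }, replacer);
const back = JSON.parse(text, reviver);

console.log(text);
console.log(back.when instanceof Date, Number.isNaN(back.ratio), back.max === -Infinity);
```

The wire format stays plain JSON, so readers without the revivers still parse it; they just see the tagged objects instead of the richer values.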

How do people feel about this statement? "It's premature to standardize more than JSON as part of ECMAScript now, since there's no consensus, but both replacer/reviver libraries and a human-friendlier syntax would be worth considering as long as the two play nicely together and are so widely used that it's hard to imagine starting a large project without using either."

# J Decker (7 years ago)

On Sat, Jul 15, 2017 at 9:39 AM, Mike Samuel <mikesamuel at gmail.com> wrote:

There seem to be two separable concerns that keep coming up in this thread.

  1. Increasing the value domain that can be serialized & deserialized easily to include e.g. undefined, NaNs, Infinities, dates, timestamps.
  2. More human friendly syntax that is easily deserialized.

Both can be done in library code. A library of composable JSON replacers & revivers would extend the value domain. There are many candidates for the second, but no consensus, though many do have an interface similar to revivers & replacers.

How do people feel about this statement?

Then JSON should be removed from the standard if it's supposed to be external.

# Mike Samuel (7 years ago)

On Sat, Jul 15, 2017 at 5:02 PM, J Decker <d3ck0r at gmail.com> wrote:

On Sat, Jul 15, 2017 at 9:39 AM, Mike Samuel <mikesamuel at gmail.com> wrote:

There seem to be two separable concerns that keep coming up in this thread.

  1. Increasing the value domain that can be serialized & deserialized easily to include e.g. undefined, NaNs, Infinities, dates, timestamps.
  2. More human friendly syntax that is easily deserialized.

Both can be done in library code. A library of composable JSON replacers & revivers would extend the value domain. There are many candidates for the second, but no consensus, though many do have an interface similar to revivers & replacers.

How do people feel about this statement?

Then JSON should be removed from the standard if it's supposed to be external.

That was not my argument. JSON exists and provides support for serialization and deserialization. JSON was not standardized until it was well understood and widely deployed. There was a consensus that it was useful & usable. JSON5 has not achieved that level and does not offer much that is not already available via the existing builtins.