JSON text is not a subset of PrimaryExpression

# Mike Samuel (7 years ago)

tc39.github.io/ecma262/#sec-json.parse says

""" NOTE

Valid JSON text is a subset of the ECMAScript PrimaryExpression syntax as modified by Step 4 above. Step 2 verifies that JText conforms to that subset, and step 6 verifies that that parsing and evaluation returns a value of an appropriate type. """

IIUC, if JSON text were a subset of PrimaryExpression, then there should be no string that parses via JSON.parse but does not eval when wrapped in parentheses. On recent browsers (Chrome, Safari, Firefox),

var s = String.fromCharCode(0x22, 0x2028, 0x22)
JSON.parse(s);   // Passes
eval('(' + s + ')');  // raises SyntaxError

I believe the only reason it's not a subset is that both exclude line terminators from quoted string bodies, but JSON does not treat U+2028 and U+2029 as line terminator characters while ECMAScript does.
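To make the mismatch concrete, here is a minimal sketch of one way to sidestep it: escape the two code points before embedding JSON text in script source. The helper name `escapeLineSeparators` is hypothetical and is not part of the spec or of any library. It relies on the fact that, in valid JSON, a raw U+2028 or U+2029 can only occur inside a string, where the `\u2028` / `\u2029` escape denotes the same character.

```javascript
// Hypothetical helper (an illustration, not a spec-defined operation):
// replace raw U+2028 / U+2029 with their JSON escape sequences.
// Assumes the input is valid JSON text, so these code points can only
// appear inside string bodies.
function escapeLineSeparators(jsonText) {
  return jsonText.replace(/[\u2028\u2029]/g, (ch) =>
    ch === '\u2028' ? '\\u2028' : '\\u2029'
  );
}

// The JSON text: a quote, U+2028, U+2029, a quote.
const raw = String.fromCharCode(0x22, 0x2028, 0x2029, 0x22);
const safe = escapeLineSeparators(raw);

// Both texts denote the same JSON value.
const sameValue = JSON.parse(raw) === JSON.parse(safe);

// The escaped text contains no line terminators, so it is also a valid
// string literal when evaluated as script source.
const evaluated = eval('(' + safe + ')');
```

The escaped form is accepted by both grammars, so it can be pasted into script source without depending on how a given engine treats the raw code points.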

Could we change "is a subset of" to "is almost a subset of" or "is a subset (modulo LineTerminators) of"?

Some related discussion at esdiscuss.org/topic/json

# Richard Gibson (7 years ago)

Another relevant option: Feature Request: Make ECMA262 a superset of JSON esdiscuss.org/topic/feature-request-make-ecma262-a-superset-of-json#content-8

# Mark S. Miller (7 years ago)

I argued for that in the old es3.1 days. IIRC so did Mike Samuel and Crock. The pain of doing it now would be higher than it would have been then. Nevertheless, I would still be in favor.

Please draft a proposal. Thanks.

On Jul 18, 2017 8:16 PM, "Richard Gibson" <richard.gibson at gmail.com> wrote:

Another relevant option: Feature Request: Make ECMA262 a superset of JSON esdiscuss.org/topic/feature-request-make-ecma262-a-superset-of-json#content-8

# Allen Wirfs-Brock (7 years ago)

On Jul 18, 2017, at 9:36 PM, Mark S. Miller <erights at google.com> wrote:

I argued for that in the old es3.1 days. IIRC so did Mike Samuel and Crock. The pain of doing it now would be higher than it would have been then. Nevertheless, I would still be in favor.

Why do we care? ECMAScript and JSON are two distinct languages with their own distinct and standardized definitions. And we should be well past the time where JS eval is considered to be an acceptable way to process an alleged JSON text.

The “issue” raised is one word in a non-normative paragraph. It can be corrected with a one word change. Replace

"Valid JSON text is a subset of the ECMAScript PrimaryExpression syntax as modified by Step 4 above."

with

"Valid JSON text is a variant of the ECMAScript PrimaryExpression syntax as modified by Step 4 above."

(note that the second sentence of this paragraph is important in that it further clarifies that the validity of a JSON text is defined by ECMA-404)

I don’t see why such a trivial issue would motivate us to start making normative changes to either standard when we run the risk that such changes would open the door for other proposals to “fix JSON” in some manner or another.

# Mike Samuel (7 years ago)

On Wed, Jul 19, 2017 at 12:40 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

The “issue” raised is one word in a non-normative paragraph. It can be corrected with a one word change. Replace

"Valid JSON text is a subset of the ECMAScript PrimaryExpression syntax as modified by Step 4 above."

with

"Valid JSON text is a variant of the ECMAScript PrimaryExpression syntax as modified by Step 4 above."

+1. AFAICT, the only benefit "subset" gets us is that it's a bit easier to paste JSON into a .js file. I doubt a significant number of people are pasting JSON with the problematic code points, but if there are, that can be addressed by js-mode.el or your favorite analogue.

# Mark Miller (7 years ago)

That one-word change would correct the inconsistency. I agree that it is the most that is likely to happen. I would prefer that we actually fix JS so that it is a superset of JSON, as I preferred during our es3.1 days. However, I agree that this is a) even less likely to happen now than it was then, and b) matters less now than it would have then, since no one has any remaining excuse for using eval to parse JSON. Assuming that we are in fact stuck, I agree that we should make that one-word change.

Nevertheless, I would still prefer that we fix JS so that it is a superset of JSON. I can't say that I care a lot. But if someone does, it would be good if they could gather statistics on how much existing JS code would break. Simply scanning for the total number of *.js files in github that contain a literal \u2028 or \u2029 would give a nice upper bound on that. I suspect that even this upper bound would be shockingly tiny; but I do not actually know.
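The per-file check behind such a scan could be sketched roughly as follows. This assumes "literal" means the raw code point rather than the six-character escape sequence. The function name and the sample file contents are made up for illustration; no real corpus was scanned.

```javascript
// Hypothetical check (illustration only): count raw U+2028 / U+2029
// code points in a piece of source text. Escaped forms such as the
// six characters \u2028 are deliberately not counted.
function countRawLineSeparators(sourceText) {
  const matches = sourceText.match(/[\u2028\u2029]/g);
  return matches === null ? 0 : matches.length;
}

// Made-up file contents standing in for a corpus of *.js files.
const samples = {
  'raw.js': 'var s = "x\u2028y";',      // contains the raw code point
  'escaped.js': 'var t = "\\u2028";',   // contains only the escape sequence
  'plain.js': 'var u = "hello";'        // contains neither
};

// Files that would count toward the upper bound.
const affected = Object.keys(samples).filter(
  (name) => countRawLineSeparators(samples[name]) > 0
);
```

A count like this only bounds the breakage from above: a file with a raw separator between statements or inside a comment might or might not change meaning, depending on where the character falls.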

# Mark Miller (7 years ago)

The benefit that I care about more:

"The following regularity holds between A and B except for this tiny edge case."

is a much greater cognitive burden on everyone, forever, than

"The following regularity holds between A and B."

# Richard Gibson (7 years ago)