Allen Wirfs-Brock (2013-12-06T23:00:31.000Z)
On Dec 6, 2013, at 12:56 PM, Nico Williams wrote:

> On Fri, Dec 06, 2013 at 11:50:13AM -0800, Allen Wirfs-Brock wrote:
>> In practice, JSON is almost useless without schema level semantic
>> agreement between the producer and consumer of a JSON text. Most of
> 
> Yes.
> 
>> the issues we are discussing here are easily subsumed by such schema
>> level agreements.
> 
> Hmmm, well, there has to be some meta-schema.
> 
> That arrays preserve order is meta-schema for JSON, else we'd have no
> interop -- and this is critical for comparisons, so specifying this bit
> of meta-schema/ semantics enables very important semantics: arrays can
> be compared for equivalence without having to sort them (which would
> require further specification of collation for all JSON values!).

What "array" are you talking about? The 'array' symbol sequence in a JSON text? A language-specific array-like data structure generated from such a symbol sequence by the parser for a specific JSON language binding? A domain data structure generated by a schema-aware parser?

Why shouldn't a schema be allowed to consider the following to be semantically equivalent:
      {"unordered-list": [0,1]}
and
      {"unordered-list": [1,0]}

Besides, we already agreed above that if you don't have schema-level agreement then JSON is almost useless. So why not just let schema specifications or schema language specifications handle this?
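To make the distinction concrete, here is a minimal Python sketch of the two levels of comparison. The `UNORDERED_KEYS` set and the `schema_equal` helper are hypothetical names standing in for a schema-level agreement; they come from neither ECMA-404 nor RFC4627bis. The plain `==` reflects what one particular language binding (Python's `json` module) happens to do.

```python
import json
from collections import Counter

# Hypothetical schema-level agreement: arrays under these names are unordered.
UNORDERED_KEYS = {"unordered-list"}

def canonical(value):
    """Serialize a parsed JSON value to a string usable as a multiset key."""
    return json.dumps(value, sort_keys=True)

def schema_equal(a, b):
    """Compare two parsed JSON objects under the hypothetical schema."""
    if a.keys() != b.keys():
        return False
    for key in a:
        both_lists = isinstance(a[key], list) and isinstance(b[key], list)
        if key in UNORDERED_KEYS and both_lists:
            # Multiset comparison: element order is ignored, counts are not.
            if Counter(map(canonical, a[key])) != Counter(map(canonical, b[key])):
                return False
        elif a[key] != b[key]:
            return False
    return True

left = json.loads('{"unordered-list": [0,1]}')
right = json.loads('{"unordered-list": [1,0]}')

print(left == right)              # False: Python lists compare in order
print(schema_equal(left, right))  # True: the schema ignores order here
```

The two JSON texts are both well formed; whether they are "equivalent" depends entirely on which layer is asked.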

> 
> That whitespace (outside strings) is not significant may be expressed
> syntactically or semantically, but this has to be universally agreed if
> we'll have any chance of interoperating.

ECMA-404 states where insignificant whitespace is allowed. Is there any disagreement about this?

> 
> That object names (keys) are not dups is trickier: for on-line
> processing there may be no way to detect dups or do anything about them,
> but for many common implementations object name uniqueness is very much
> a requirement.  So here, finally, we have a bit of semantics that could
> be in the spec but could be left out (we spent a lot of time on the
> current consensus for RFC4627bis, and I think it's safe to say that
> we're all happy with it)

It should be left out, both because of legacy precedent and because it can be dealt with in a language binding or schema semantics specification.

But I think we are already in agreement on leaving this out at the static semantic level. 
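As an illustration of handling duplicate names at the language-binding level, the following Python sketch shows one binding's default behaviour and a stricter policy layered on top. The `reject_duplicates` function is a hypothetical helper; `object_pairs_hook` is the standard `json.loads` parameter that receives each object's name/value pairs in text order.

```python
import json

def reject_duplicates(pairs):
    """Build a dict from name/value pairs, refusing repeated names."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError(f"duplicate object name: {name!r}")
        result[name] = value
    return result

text = '{"a": 1, "a": 2}'

# Default behaviour of this binding: the last value for a name is kept.
lenient = json.loads(text)
print(lenient)

# Stricter policy chosen by the consumer, not by the JSON grammar.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

Both policies accept the same syntax; they differ only in the semantics the consumer chooses to impose.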

> 
> That objects name order is irrelevant and non-deterministic is widely
> assumed/accepted (though often users ask that JSON filters preserve
> object name order from input to output).  (And, of course, for on-line
> encoders and parsers name order could well be made significant, but
> building schemas that demand ordered names means vastly limiting the
> world of JSON tooling that can be used to interoperably implement such
> schemas.)

Object name ordering is significant to widely used JSON language bindings (e.g., the ECMA-262 JSON parser). But again, this is a semantic issue.

Because ECMA-404 is trying to restrict itself to describing the space of well-formed JSON texts, there really is nothing to say about object name ordering at that level. It's a semantic issue.
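A small Python sketch of how name order surfaces in a language binding. It uses Python's `json` module rather than the ECMA-262 parser mentioned above, so it only illustrates the general point: the binding, not the grammar, decides whether order is observable.

```python
import json

a = '{"x": 1, "y": 2}'
b = '{"y": 2, "x": 1}'

# Parsed into dicts, the two texts compare equal: dict equality ignores order.
same_as_dicts = json.loads(a) == json.loads(b)

# Yet the binding still exposes the order found in the text when iterating.
names_a = list(json.loads(a))
names_b = list(json.loads(b))

# A consumer that cares about order can ask for the raw name/value pairs.
pairs_a = json.loads(a, object_pairs_hook=list)
pairs_b = json.loads(b, object_pairs_hook=list)

print(same_as_dicts)       # True
print(names_a, names_b)    # ['x', 'y'] ['y', 'x']
print(pairs_a == pairs_b)  # False
```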

> 
> Similarly for numbers, the *interoperable* number ranges and precisions
> are not really syntactic (they could be expressed via syntax, but it'd
> be a pain to do it that way).
> 
> I think it's clear that we have consensus in the IETF JSON WG for:
> 
> - whitespace semantics (not significant outside strings)

This is a syntactic issue that is covered by ECMA-404.

> - array element order semantics (elements are "ordered")
> - object name dups/order semantics (names SHOULD be unique, but interop
>   considerations described; name/value pairs are "unordered")
> - no real constraints on numeric values but interoperable
>   range/precision described

The rest are semantic issues that ECMA-404 does not want to address. The one place it arguably oversteps, by saying that "the order of array values is significant", really has no associated semantics. This is one place where I prefer the current draft language in RFC4627bis clause 5 over the corresponding language in ECMA-404. The Introduction to 4627bis (is the intro normative?) says "an array is an ordered sequence" and "an object is an unordered collection", but I don't see any actual contextual meaning given to either "ordered" or "unordered" within the document.
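The numeric range/precision item is another case where the text is syntactically fine and the outcome depends on the binding. A minimal Python sketch, assuming a consumer that maps every JSON number to an IEEE 754 double (simulated here with the standard `parse_int` parameter of `json.loads`):

```python
import json

text = '{"id": 9007199254740993}'  # 2**53 + 1, a well-formed JSON number

# Python's default binding keeps integers exact.
exact = json.loads(text)

# Simulate a binding that stores every number as a double.
as_double = json.loads(text, parse_int=float)

print(exact["id"])                      # 9007199254740993
print(int(as_double["id"]))             # 9007199254740992
print(exact["id"] == as_double["id"])   # False: precision was lost
```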

> 
> If ECMA-404 differs in any way that does not impose more/different
> semantics, then maybe we don't care as far as RFC4627bis goes. If
> ECMA-404 does impose more/different semantics then we'll care a great
> deal.

If it does, that's unintended and a correctable bug in ECMA-404.

> Since ECMA-404 targets just the syntax and minimal semantics,
> it's probably just fine for RFC4627bis to reference ECMA-404, but since
> RFC4627bis would be specifying a bit more semantics, we'd probably not
> want to make that reference be normative, at least not with some text
> explaining that it's normative only because we believe that the JSON
> syntax given in both docs are equivalent.

The semantics you want to specify can be layered upon a normative reference to ECMA-404. Rather than have competing and potentially divergent specifications, we should be looking for a clean separation of concerns.

The position stated by TC39 is that ECMA-404 already exists as a normative specification of the JSON syntax, and we have requested that RFC4627bis normatively reference it as such and that any restatement of ECMA-404 subject matter be marked as informative. We think that dueling normative specifications would be a bad thing. Seeing that the form of expression used by ECMA-404 seems to be an issue for some JSON WG participants, I have suggested that TC39 could probably be convinced to revise ECMA-404 to include a BNF-style formalism for the syntax. If there is interest in this alternative, I'd be happy to champion it within TC39.

Allen