Summary of Input. Re: JSON.canonicalize()

# Anders Rundgren (6 years ago)

Scott A: en.wikipedia.org/wiki/Security_level "For example, SHA-256 offers 128-bit collision resistance" That is, the claims that there are cryptographic issues w.r.t. Unicode Normalization are (fortunately) incorrect. Well, if you actually do normalize Unicode, signatures would indeed break, so you don't.

Richard G: Is the [highly involuntary] "inspiration" to the JSON.canonicalize() proposal: www.ietf.org/mail-archive/web/json/current/msg04257.html Why not fork your go library? Then there would be three implementations!

Mike S: Wants to build a 2000+ line standalone JSON canonicalizer working on string data. That's great but I think that it will be a hard sell getting these guys to accept the Pull Request: developers.google.com/v8 JSON.canonicalize(JSON.parse("json string data to be canonicalized")) would IMHO do the same job. My (working) code example was only provided to show the principle as well as being able to test/verify.

On my part I added canonicalization to my ES6.JSON compliant Java-based JSON tools. A single line did 99% of the job: cyberphone/openkeystore/blob/jose-compatible/library/src/org/webpki/json/JSONObjectWriter.java#L928

for (String property : canonicalized ? new TreeSet<String>(object.properties.keySet()) : object.properties.keySet()) {

Other mentioned issues like HTML safety, embedded nulls etc. would apply to JSON.stringify() as well. JSON.canonicalize() would inherit all the features (and weaknesses) of JSON.stringify().

thanx, Anders

# Mike Samuel (6 years ago)

On Fri, Mar 16, 2018 at 9:42 PM, Anders Rundgren < anders.rundgren.net at gmail.com> wrote:

Scott A: en.wikipedia.org/wiki/Security_level "For example, SHA-256 offers 128-bit collision resistance" That is, the claims that there are cryptographic issues w.r.t. Unicode Normalization are (fortunately) incorrect. Well, if you actually do normalize Unicode, signatures would indeed break, so you don't.

Richard G: Is the [highly involuntary] "inspiration" to the JSON.canonicalize() proposal: www.ietf.org/mail-archive/web/json/current/msg04257.html Why not fork your go library? Then there would be three implementations!

Mike S: Wants to build a 2000+ line standalone JSON canonicalizer working on string data. That's great but I think that it will be a hard sell getting these guys to accept the Pull Request: developers.google.com/v8 JSON.canonicalize(JSON.parse("json string data to be canonicalized")) would IMHO do the same job. My (working) code example was only provided to show the principle as well as being able to test/verify.

I don't know where you get the 2000+ line number. gist.github.com/mikesamuel/20710f94a53e440691f04bf79bc3d756 comes in at 80 lines. That's roughly twice as long as your demonstrably broken example code, but far shorter than the number you provided.

If you're being hyperbolic, please stop. If that was a genuine guesstimate, but you just happened to be off by a factor of 25, then I have less confidence that you can weigh the design complexity tradeoffs when comparing yours to other proposals.

On my part I added canonicalization to my ES6.JSON compliant Java-based JSON tools. A single line did 99% of the job: cyberphone/openkeystore/blob/jose-compatible/library/src/org/webpki/json/JSONObjectWriter.java#L928

for (String property : canonicalized ? new TreeSet<String>(object.properties.keySet()) : object.properties.keySet()) {

Other mentioned issues like HTML safety, embedded nulls etc. would apply to JSON.stringify() as well. JSON.canonicalize() would inherit all the features (and weaknesses) of JSON.stringify().

Please, when you attribute a summary to me, don't ignore the summary that I myself wrote of my arguments.

You're ignoring the context. JSON.canonicalize is not generally useful because it undoes safety precautions. That tied into one argument of mine that you left out: JSON.canonicalize is not generally useful. It should probably not be used as a wire or storage format, and is entirely unsuitable for embedding into other commonly used web application languages.

You also make no mention of backwards compatibility concerns when this depends on things like toJSON, which is hugely important when dealing with long lived hashes.

When I see that you've summarized my own thoughts incorrectly, even though I provided you with a summary of my own arguments, I lose confidence that you've correctly summarized others' positions.

# Mike Samuel (6 years ago)

On Fri, Mar 16, 2018 at 9:42 PM, Anders Rundgren < anders.rundgren.net at gmail.com> wrote:

On my part I added canonicalization to my ES6.JSON compliant Java-based JSON tools. A single line did 99% of the job: cyberphone/openkeystore/blob/jose-compatible/library/src/org/webpki/json/JSONObjectWriter.java#L928

for (String property : canonicalized ? new TreeSet<String>(object.properties.keySet()) : object.properties.keySet()) {

If this is what you want then can't you just use a replacer to substitute a record with sorted keys?

    JSON.canonicalize = (value) => JSON.stringify(value, (_, value) => {
      if (value && typeof value === 'object' && !Array.isArray(value)) {
        const withSortedKeys = {}
        const keys = Object.getOwnPropertyNames(value)
        keys.sort()
        keys.forEach(key => withSortedKeys[key] = value[key])
        value = withSortedKeys
      }
      return value
    })
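For readers who want to try the replacer idea, here is a minimal standalone sketch of it. The function name `canonicalize` is an assumption made for illustration (it avoids patching the built-in JSON object), and it uses Object.keys() rather than Object.getOwnPropertyNames(), so only enumerable properties are copied. It is not a complete canonicalization scheme.

```javascript
// Hypothetical helper: serializes with object keys in sorted order by
// substituting a key-sorted copy for every plain object the replacer sees.
function canonicalize(value) {
  return JSON.stringify(value, (_key, val) => {
    if (val && typeof val === 'object' && !Array.isArray(val)) {
      const sorted = {};
      for (const key of Object.keys(val).sort()) {
        sorted[key] = val[key];
      }
      return sorted;
    }
    return val; // primitives and arrays pass through unchanged
  });
}

// Two objects that differ only in property order serialize identically.
const first = canonicalize({ b: 1, a: { d: [3, 2], c: null } });
const second = canonicalize({ a: { c: null, d: [3, 2] }, b: 1 });
console.log(first);            // {"a":{"c":null,"d":[3,2]},"b":1}
console.log(first === second); // true
```

One limitation of this approach: ECMAScript objects always enumerate integer-like keys such as "9" and "10" first, in ascending numeric order, so those keys cannot be placed in string-sorted order by copying them into a new object.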

# Anders Rundgren (6 years ago)

Pardon me if you think I was hyperbolic. The discussion got derailed by the bogus claims about hash functions' vulnerability.

F.Y.I: Using ES6 serialization methods for JSON primitive types is headed for standardization in the IETF. www.ietf.org/mail-archive/web/jose/current/msg05716.html

This effort is backed by one of the main authors behind the current de-facto standard for Signed and Encrypted JSON, aka JOSE. If this is in your opinion a bad idea, now is the right time to shoot it down :-)

This effort also exploits the ability of JSON.parse() and JSON.stringify() to honor object "Creation Order".

JSON.canonicalize() would be a "Sorting" alternative to "Creation Order" offering certain advantages with limiting deployment impact to JSON serializers as the most important one.
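To make the "Creation Order" point concrete, here is a small sketch (not taken from the referenced draft) showing that JSON.parse() followed by JSON.stringify() reproduces the member order of the input text. One caveat: ECMAScript objects list integer-like keys first, in ascending numeric order, so such keys do not round-trip in creation order.

```javascript
// String keys come back out in the order they appeared in the input.
const text = '{"b":1,"a":2}';
const roundTripped = JSON.stringify(JSON.parse(text));
console.log(roundTripped === text); // true

// Integer-like keys are the exception: they are emitted in numeric order.
const numericKeys = JSON.stringify(JSON.parse('{"2":"x","1":"y"}'));
console.log(numericKeys); // {"1":"y","2":"x"}
```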

The ["completely broken"] sample code was only submitted as a proof-of-concept. I'm sure you JS gurus can do this way better than I :-)

Creating an alternative based on [1,2,3] seems like a rather daunting task.

Thanx, Anders cyberphone/json-canonicalization

1] wiki.laptop.org/go/Canonical_JSON

2] gibson042.github.io/canonicaljson-spec

3] gist.github.com/mikesamuel/20710f94a53e440691f04bf79bc3d756

# Mike Samuel (6 years ago)

On Sun, Mar 18, 2018 at 2:14 AM, Anders Rundgren < anders.rundgren.net at gmail.com> wrote:

Hi Guys,

Pardon me if you think I was hyperbolic. The discussion got derailed by the bogus claims about hash functions' vulnerability.

I didn't say I "think" you were being hyperbolic. I asked whether you were.

You asserted a number that seemed high to me. I demonstrated it was high by a factor of at least 25 by showing an implementation that used 80 lines instead of the 2000 you said was required.

If you're going to put out a number as a reason to dismiss an argument, you should own it or retract it. Were you being hyperbolic? (Y/N)

Your claim and my counterclaim are in no way linked to hash function vulnerability. I never weighed in on that claim and have already granted that hashable JSON is a worthwhile use case.

F.Y.I: Using ES6 serialization methods for JSON primitive types is headed for standardization in the IETF. www.ietf.org/mail-archive/web/jose/current/msg05716.html

This effort is backed by one of the main authors behind the current de-facto standard for Signed and Encrypted JSON, aka JOSE. If this is in your opinion a bad idea, now is the right time to shoot it down :-)

Does this main author prefer your particular JSON canonicalization scheme to others? Is this an informed opinion based on flaws in the others that make them less suitable for JOSE's needs that are not present in the scheme you back?

If so, please provide links to their reasoning. If not, how is their backing relevant?

This effort also exploits the ability of JSON.parse() and JSON.stringify() to honor object "Creation Order".

JSON.canonicalize() would be a "Sorting" alternative to "Creation Order" offering certain advantages with limiting deployment impact to JSON serializers as the most important one.

The ["completely broken"] sample code was only submitted as a proof-of-concept. I'm sure you JS gurus can do this way better than I :-)

This is a misquote. No-one has said your sample code was completely broken. Neither your sample code nor the spec deals with toJSON. At some point you're going to have to address that if you want to keep your proposal moving forward. No amount of JS guru-ry is going to save your sample code from a specification bug.

Creating an alternative based on [1,2,3] seems like a rather daunting task.

Maybe if you spend more time laying out the criteria on which a successful proposal should be judged, we could move towards consensus on this claim.

As it is, I have only your say so but I have reason to doubt your evaluation of task complexity unless you were being hyperbolic before.

# Anders Rundgren (6 years ago)

On 2018-03-18 15:13, Mike Samuel wrote:

On Sun, Mar 18, 2018 at 2:14 AM, Anders Rundgren <anders.rundgren.net at gmail.com> wrote:

Hi Guys,

Pardon me if you think I was hyperbolic,
The discussion got derailed by the bogus claims about hash functions' vulnerability.

I didn't say I "think" you were being hyperbolic.  I asked whether you were.

You asserted a number that seemed high to me. I demonstrated it was high by a factor of at least 25 by showing an implementation that used 80 lines instead of the 2000 you said was required.

If you're going to put out a number as a reason to dismiss an argument, you should own it or retract it. Were you being hyperbolic?  (Y/N)

N. To be completely honest I have only considered full-blown serializers and they typically come in the mentioned size.

Your solution has existed for a couple of days; we may need a little bit more time thinking about it :-)

Your claim and my counterclaim are in no way linked to hash function vulnerability. I never weighed in on that claim and have already granted that hashable JSON is a worthwhile use case.

Great! So we can finally put that argument to rest.

F.Y.I: Using ES6 serialization methods for JSON primitive types is headed for standardization in the IETF.
https://www.ietf.org/mail-archive/web/jose/current/msg05716.html

This effort is backed by one of the main authors behind the current de-facto standard for Signed and Encrypted JSON, aka JOSE.
If this is in your opinion a bad idea, now is the right time to shoot it down :-)

Does this main author prefer your particular JSON canonicalization scheme to others?

This proposal does [currently] not rely on canonicalization but on ES6 "predictive parsing and serialization".

Is this an informed opinion based on flaws in the others that make them less suitable for JOSE's needs that are not present in the scheme you back?

A JSON canonicalization scheme has AFAIK never been considered in the relevant IETF groups (JOSE+JSON). On the contrary, it has been dismissed as a daft idea.

I haven't yet submitted my [private] I-D. I'm basically here for collecting input and finding possible collaborators.

If so, please provide links to their reasoning. If not, how is their backing relevant?

If the ES6/JSON.stringify() way of serializing JSON primitives becomes an IETF standard backed by Microsoft, it may have an impact on the "market".

This effort also exploits the ability of JSON.parse() and JSON.stringify() to honor object "Creation Order".

JSON.canonicalize() would be a "Sorting" alternative to "Creation Order" offering certain advantages with limiting deployment impact to JSON serializers as the most important one.

The ["completely broken"] sample code was only submitted as a proof-of-concept. I'm sure you JS gurus can do this way better than I :-)

This is a misquote.  No-one has said your sample code was completely broken. Neither your sample code nor the spec deals with toJSON.  At some point you're going to have to address that if you want to keep your proposal moving forward.

It is possible that I don't understand what you are asking for here since I have no experience with toJSON.

Based on this documentation developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify JSON.canonicalize() would though work out of the box (when integrated in the JSON object NB...) since it would inherit all the functionality (and 99% of the code) of JSON.stringify()

No amount of JS guru-ry is going to save your sample code from a specification bug.

Creating an alternative based on [1,2,3] seems like a rather daunting task.

Maybe if you spend more time laying out the criteria on which a successful proposal should be judged, we could move towards consensus on this claim.

Since you have already slashed my proposal there is probably not so much consensus to find...

Anders

As it is, I have only your say so but I have reason to doubt your evaluation of task complexity unless you were being hyperbolic before.

It is a free world, you may doubt my competence, motives, whatever.

# C. Scott Ananian (6 years ago)

On Fri, Mar 16, 2018 at 9:42 PM, Anders Rundgren < anders.rundgren.net at gmail.com> wrote:

Scott A: en.wikipedia.org/wiki/Security_level "For example, SHA-256 offers 128-bit collision resistance" That is, the claims that there are cryptographic issues w.r.t. Unicode Normalization are (fortunately) incorrect. Well, if you actually do normalize Unicode, signatures would indeed break, so you don't.

Where do you specify SHA-256 signatures in your standard?

If one were to use MD5 signatures, they would indeed break in the way I describe.

It is good security practice to assume that currently-unbroken algorithms may eventually break in similar ways to discovered flaws in older algorithms. But in any case, it is simply not good practice to allow multiple valid representations of content, if your aim is for a "canonical" representation.

# Anders Rundgren (6 years ago)

On 2018-03-18 19:08, C. Scott Ananian wrote:

On Fri, Mar 16, 2018 at 9:42 PM, Anders Rundgren <anders.rundgren.net at gmail.com> wrote:

Scott A:
https://en.wikipedia.org/wiki/Security_level
"For example, SHA-256 offers 128-bit collision resistance"
That is, the claims that there are cryptographic issues w.r.t. Unicode Normalization are (fortunately) incorrect.
Well, if you actually do normalize Unicode, signatures would indeed break, so you don't.

Where do you specify SHA-256 signatures in your standard?

If one were to use MD5 signatures, they would indeed break in the way I describe.

It is good security practice to assume that currently-unbroken algorithms may eventually break in similar ways to discovered flaws in older algorithms. But in any case, it is simply not good practice to allow multiple valid representations of content, if your aim is for a "canonical" representation.

Other people could chime in on this since I have already declared my position on this topic. BTW, my proposal comes without cryptographic algorithms.

Does Unicode Normalization [naturally] belong to the canonicalization issue we are currently discussing? I didn't see any of that in Richard's and Mike's specs, at least.

Anders

# C. Scott Ananian (6 years ago)

IMO it belongs, at the level of a SHOULD recommendation when the data represented is intended to be a Unicode string. (But not a MUST because neither Javascript's 16-bit strings nor the 8-bit JSON representation necessarily represent Unicode strings.)

But I've said this already.
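As an illustration of why normalization comes up at all, the sketch below shows two JavaScript strings that render as the same character but are different UTF-16 code unit sequences, and therefore serialize (and would hash) differently unless they are normalized first. The variable names are illustrative only.

```javascript
const precomposed = '\u00e9'; // "é" as a single code point
const decomposed = 'e\u0301'; // "e" followed by a combining acute accent

// The two strings look alike on screen but are not equal as code unit sequences.
console.log(precomposed === decomposed); // false

// Their serialized forms differ too, so a hash over the JSON text would differ.
console.log(JSON.stringify({ k: precomposed }) === JSON.stringify({ k: decomposed })); // false

// Applying NFC normalization makes them identical.
console.log(decomposed.normalize('NFC') === precomposed); // true
```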

# Mike Samuel (6 years ago)

On Sun, Mar 18, 2018 at 12:50 PM, Anders Rundgren < anders.rundgren.net at gmail.com> wrote:

On 2018-03-18 15:13, Mike Samuel wrote:

On Sun, Mar 18, 2018 at 2:14 AM, Anders Rundgren <anders.rundgren.net at gmail.com> wrote:

Hi Guys,

Pardon me if you think I was hyperbolic,
The discussion got derailed by the bogus claims about hash functions' vulnerability.

I didn't say I "think" you were being hyperbolic. I asked whether you were.

You asserted a number that seemed high to me. I demonstrated it was high by a factor of at least 25 by showing an implementation that used 80 lines instead of the 2000 you said was required.

If you're going to put out a number as a reason to dismiss an argument, you should own it or retract it. Were you being hyperbolic? (Y/N)

N. To be completely honest I have only considered full-blown serializers and they typically come in the mentioned size.

Your solution has existed for a couple of days; we may need a little bit more time thinking about it :-)

Fair enough.

Your claim and my counterclaim are in no way linked to hash function vulnerability. I never weighed in on that claim and have already granted that hashable JSON is a worthwhile use case.

Great! So we can finally put that argument to rest.

No. I don't disagree with you, but I don't speak for whoever did.

F.Y.I: Using ES6 serialization methods for JSON primitive types is headed for standardization in the IETF. www.ietf.org/mail-archive/web/jose/current/msg05716.html

This effort is backed by one of the main authors behind the current de-facto standard for Signed and Encrypted JSON, aka JOSE. If this is in your opinion a bad idea, now is the right time to shoot it down :-)

Does this main author prefer your particular JSON canonicalization scheme to others?

This proposal does [currently] not rely on canonicalization but on ES6 "predictive parsing and serialization".

Is this an informed opinion based on flaws in the others that make them less suitable for JOSE's needs that are not present in the scheme you back?

A JSON canonicalization scheme has AFAIK never been considered in the relevant IETF groups (JOSE+JSON). On the contrary, it has been dismissed as a daft idea.

I haven't yet submitted my [private] I-D. I'm basically here for collecting input and finding possible collaborators.

If so, please provide links to their reasoning. If not, how is their backing relevant?

If the ES6/JSON.stringify() way of serializing JSON primitives becomes an IETF standard backed by Microsoft, it may have an impact on the "market".

If you can't tell us anything concrete about your backers, what they back, or why they back it, then why bring it up?

This effort also exploits the ability of JSON.parse() and JSON.stringify() to honor object "Creation Order".

JSON.canonicalize() would be a "Sorting" alternative to "Creation Order" offering certain advantages with limiting deployment impact to JSON serializers as the most important one.

The ["completely broken"] sample code was only submitted as a proof-of-concept. I'm sure you JS gurus can do this way better than I :-)

This is a misquote. No-one has said your sample code was completely broken. Neither your sample code nor the spec deals with toJSON. At some point you're going to have to address that if you want to keep your proposal moving forward.

It is possible that I don't understand what you are asking for here since I have no experience with toJSON.

Based on this documentation developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify JSON.canonicalize() would though work out of the box (when integrated in the JSON object NB...) since it would inherit all the functionality (and 99% of the code) of JSON.stringify()

JSON.stringify(new Date()) has specific semantics because Date.prototype.toJSON has specific semantics. As currently written, JSON.canonicalize(new Date()) === JSON.canonicalize({})
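The toJSON concern can be reproduced with a small sketch. The serializer below is hypothetical and written only for this illustration; it is not the sample code under discussion. It walks own enumerable keys in sorted order, and the `honorToJSON` flag (an assumed parameter) controls whether a value's toJSON() method is consulted first, as JSON.stringify() does.

```javascript
// Hypothetical key-sorting serializer for JSON-compatible values only.
function sketchCanonicalize(value, honorToJSON) {
  if (honorToJSON && value !== null && typeof value === 'object' &&
      typeof value.toJSON === 'function') {
    value = value.toJSON(); // mirror the toJSON step of JSON.stringify()
  }
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value); // primitives use the built-in rules
  }
  if (Array.isArray(value)) {
    return '[' + value.map((item) => sketchCanonicalize(item, honorToJSON)).join(',') + ']';
  }
  const members = Object.keys(value).sort().map(
    (key) => JSON.stringify(key) + ':' + sketchCanonicalize(value[key], honorToJSON)
  );
  return '{' + members.join(',') + '}';
}

const date = new Date(0);
console.log(JSON.stringify(date));            // "1970-01-01T00:00:00.000Z"
console.log(sketchCanonicalize(date, false)); // {} (a Date has no own enumerable keys)
console.log(sketchCanonicalize(date, true) === JSON.stringify(date)); // true
```

Without the toJSON step, a Date is indistinguishable from an empty object in the output, which is the behavior described above.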

No amount of JS guru-ry is going to save your sample code from a specification bug.

Creating an alternative based on [1,2,3] seems like a rather daunting task.

Maybe if you spend more time laying out the criteria on which a successful proposal should be judged, we could move towards consensus on this claim.

Since you have already slashed my proposal there is probably not so much consensus to find...

I didn't mean to slash anything.

I like parts of your proposal and dislike others. I talk more about the bits that I don't like because that's the purpose of this list. For example, I like that it treats strings as sequences of UTF-16 code units instead of trying to normalize strings that may not encode human readable text.
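One place where the code-unit view of strings has a visible effect is key ordering. The sketch below assumes keys are compared the way JavaScript's default Array.prototype.sort() compares strings, that is, by UTF-16 code units; the code-point comparator is a simplified illustration that looks only at the first code point of each key.

```javascript
const keys = ['\uff5e', '\u{1f600}']; // U+FF5E and U+1F600

// Default sort compares UTF-16 code units. U+1F600 is stored as the
// surrogate pair D83D DE00, and 0xD83D < 0xFF5E, so it sorts first.
const byCodeUnit = [...keys].sort();

// Ordering by code point instead puts U+FF5E first (0xFF5E < 0x1F600).
const byCodePoint = [...keys].sort((x, y) => x.codePointAt(0) - y.codePointAt(0));

console.log(byCodeUnit[0] === '\u{1f600}'); // true
console.log(byCodePoint[0] === '\uff5e');   // true
```

Whichever ordering a canonicalization scheme picks, every implementation has to use the same one, or two serializers will emit different output for the same object.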

# Anders Rundgren (6 years ago)

On 2018-03-18 20:23, Mike Samuel wrote:

It is possible that I don't understand what you are asking for here since I have no experience with toJSON.

Based on this documentation
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify
JSON.canonicalize() would though work out of the box (when integrated in the JSON object NB...) since it would inherit all the functionality (and 99% of the code) of JSON.stringify()

JSON.stringify(new Date()) has specific semantics because Date.prototype.toJSON has specific semantics. As currently written, JSON.canonicalize(new Date()) === JSON.canonicalize({})

It seems that you (deliberately?) misunderstand what I'm writing above.

JSON.canonicalize(new Date()) would do exactly the same thing as JSON.stringify(new Date()) since it apparently only returns a string.

Again, the sample code I provided is a bare bones solution with the sole purpose of showing the proposed canonicalization algorithm in code as a complement to the written specification.

Anders

# Mike Samuel (6 years ago)

On Sun, Mar 18, 2018, 4:00 PM Anders Rundgren <anders.rundgren.net at gmail.com> wrote:

On 2018-03-18 20:23, Mike Samuel wrote:

It is possible that I don't understand what you are asking for here since I have no experience with toJSON.

Based on this documentation developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify JSON.canonicalize() would though work out of the box (when integrated in the JSON object NB...) since it would inherit all the functionality (and 99% of the code) of JSON.stringify()

JSON.stringify(new Date()) has specific semantics because Date.prototype.toJSON has specific semantics. As currently written, JSON.canonicalize(new Date()) === JSON.canonicalize({})

It seems that you (deliberately?) misunderstand what I'm writing above.

JSON.canonicalize(new Date()) would do exactly the same thing as JSON.stringify(new Date()) since it apparently only returns a string.

Where in the spec do you handle this case?

Again, the sample code I provided is a bare bones solution with the sole purpose of showing the proposed canonicalization algorithm in code as a complement to the written specification.

Understood. AFAICT neither the text nor the instructional code treat Dates differently from an empty object.

# Anders Rundgren (6 years ago)

On 2018-03-18 21:06, Mike Samuel wrote:

On Sun, Mar 18, 2018, 4:00 PM Anders Rundgren <anders.rundgren.net at gmail.com> wrote:

On 2018-03-18 20:23, Mike Samuel wrote:

It is possible that I don't understand what you are asking for here since I have no experience with toJSON.

Based on this documentation https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify JSON.canonicalize() would though work out of the box (when integrated in the JSON object NB...) since it would inherit all the functionality (and 99% of the code) of JSON.stringify()

JSON.stringify(new Date()) has specific semantics because Date.prototype.toJSON has specific semantics. As currently written, JSON.canonicalize(new Date()) === JSON.canonicalize({})

It seems that you (deliberately?) misunderstand what I'm writing above.

JSON.canonicalize(new Date()) would do exactly the same thing as JSON.stringify(new Date()) since it apparently only returns a string.

Where in the spec do you handle this case?

It doesn't, it only describes a canonicalization algorithm.

Integration of the canonicalization algorithm in the ES JSON object might cost as much as 5 lines of code + some refactoring.

Anders

# Anders Rundgren (6 years ago)

On 2018-03-18 20:23, Mike Samuel wrote:

F.Y.I: Using ES6 serialization methods for JSON primitive types is headed for standardization in the IETF. https://www.ietf.org/mail-archive/web/jose/current/msg05716.html

This effort is backed by one of the main authors behind the current de-facto standard for Signed and Encrypted JSON, aka JOSE. If this is in your opinion a bad idea, now is the right time to shoot it down :-)

Does this main author prefer your particular JSON canonicalization scheme to others?

This proposal does [currently] not rely on canonicalization but on ES6 "predictive parsing and serialization".

Is this an informed opinion based on flaws in the others that make them less suitable for JOSE's needs that are not present in the scheme you back?

A JSON canonicalization scheme has AFAIK never been considered in the relevant IETF groups (JOSE+JSON). On the contrary, it has been dismissed as a daft idea.

I haven't yet submitted my [private] I-D. I'm basically here for collecting input and finding possible collaborators.

If so, please provide links to their reasoning. If not, how is their backing relevant?

If the ES6/JSON.stringify() way of serializing JSON primitives becomes an IETF standard backed by Microsoft, it may have an impact on the "market".

If you can't tell us anything concrete about your backers, what they back, or why they back it, then why bring it up?

Who they are, What they back, and Why they back it (Rationale), is in the referred document above. Here is a nicer HTML variant of the I-D: tools.ietf.org/id/draft-erdtman-jose-cleartext-jws-00.html

Anders