JSON numbers (was: Revisiting Decimal)
On Jan 15, 2009, at 6:07 PM, David-Sarah Hopwood wrote:
That approximation might be good enough for a particular use of JSON; if it isn't, then the application programmer should use a different binding, or different options to the binding.
To cut past all the preaching to the choir: the argument we were
having, apart from any confusions, is about what the default options
to the ES Harmony binding should be.
None of this dictates a specific answer to what 'typeof aDecimal' should return, but it needs to be clarified lest we do violence to JSON's intended semantics by making incorrect assumptions about it.
JSON's intended semantics may be arbitrary precision decimal (the RFC
is neither explicit nor specific enough in my opinion; it mentions
only "range", not "precision"), but not all real-world JSON codecs use
arbitrary precision decimal, and in particular today's JS codecs use
IEEE double binary floating point. This "approximates" by default and
creates a de-facto standard that can't be compatibly extended without
opt-in.
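[To make the de-facto behavior concrete, a minimal sketch, using the ES3.1 JSON.parse/JSON.stringify names; any current double-based JS codec behaves the same way:

  // 2^53 + 1 is a perfectly good JSON number, but not a representable double.
  var obj = JSON.parse('{"id": 9007199254740993}');
  obj.id === 9007199254740992;  // true: silently rounded to the nearest double
  JSON.stringify(obj);          // '{"id":9007199254740992}' -- a different document
]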
Brendan Eich wrote:
Long ago in the '80s there was an RPC competition between Sun and Apollo (defunct Massachusetts-based company, but the RPC approach ended up in DCE), with both sides attempting to use open specs and even open source to build alliances. Bob Lyon of Sun argued eloquently for one standard type system for senders and receivers. Paul (I forget his last name) of Apollo argued for "receiver makes it right" to allow the funky non-IEEE-754 floating point formats of the day to be used at the convenience of the sender. E.g. Cray and DEC VAX senders would not have to transcode to the IEEE-754 lingua franca, they could just blat out the bytes in a number in some standard byte order.
Bob Lyon's rebuttal as I recall it was two-fold: 1. "receiver makes it right" is really "receiver makes it wrong", because the many types and conversions are a fertile field for bugs and versionitis problems among peers. 2. There should indeed be a lingua franca -- you should be able to put serialized data in a data vault for fifty years and hope to read it, so long as we're still around, without having to know which sender sent it, what floating point format that obsolete sender used, etc.
The two points are distinct. Complexity makes for versionitis and bug habitat, but lack of a single on-the-wire semantic standard makes for a Tower of Babel scenario, which means data loss. Fat, buggy, and lossy are no way to go through college!
See www.kohala.com/start/papers.others/rpc.comments.txt for more.
This analogy fails to reflect the design of JSON.
JSON does specify a common encoding and type of numbers: arbitrary-precision decimal, encoded as a decimal string. This is very different from "blat[ting] out the bytes in a number in some standard byte order". (The difference is not text vs binary; it's using a standardized encoding rather than the sender's encoding.)
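(For reference, RFC 4627's number production is simply

  number = [ minus ] int [ frac ] [ exp ]

with no bound anywhere on the number of digits in int or frac -- an unbounded decimal string, exactly as described above.)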
You can argue, if you want, that the common encoding and type shouldn't have been arbitrary-precision decimal. But the JSON RFC says what it says, and for better or worse, it does not mention any binary floating point types, or have any way to explicitly specify multiple number representations. ES-Harmony must support JSON as it is designed; changing JSON is not in scope for TC39, and is quite unlikely to happen independently.
It is true that if the designer of a JSON-based format wants to simplify its use across languages, then it would be helpful to implementors of the format, all else being equal, if it did not assume that numbers will be preserved with greater precision or range than an IEEE double.
However, if a JSON-based format needs numbers that are preserved to greater precision or range than IEEE double, then JSON numbers are still a perfectly suitable type to encode them (the only vaguely reasonable alternative would be to encode the numbers as strings, which doesn't solve anything; it just makes for a more obscure use of JSON).
In that case, implementors of the format must find a JSON language binding that supports a type with the required precision and range -- but such bindings are widely available for languages that have such types (decimal or otherwise) built-in or supported by a standard library. Presumably, ES-Harmony will be such a language, and its JSON library will be extended to have an option to decode JSON numbers to decimals.
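[A minimal sketch of what such an opt-in might look like -- purely hypothetical: neither the third options argument nor the "decimal" flag appears in any draft, and ES3.1's JSON.parse takes only a text and a reviver:

  // Hypothetical: ask the codec for full-precision decimal decoding.
  var obj = JSON.parse('{"balance": 123.45}', null, { numbers: "decimal" });
  // obj.balance would then be the decimal 123.45, with no binary rounding step
]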
(ES-Harmony's proposed decimal type still isn't arbitrary-precision, of course, but it will have sufficient precision and range to support a larger set of uses than IEEE double.)
We are not condemned to repeat history if we pay attention to what went before. JSON implementations in future ES specs cannot by default switch either encoding or decoding to use decimal instead of number.
Of course not, but they can easily provide a non-default switch to do so. They can also encode both ES numbers and ES decimals to JSON numbers, as Kris has already indicated that Dojo plans to do. (This encoding is lossy in general, but not necessarily lossy for a given use of JSON.)
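[A sketch of that encoding; the 1.10m decimal literal is from the ES3.1 decimal proposal, and stringify preserving the decimal's trailing zero is an assumption here:

  JSON.stringify({ price: 1.10m });  // '{"price":1.10}' -- scale preserved
  JSON.stringify({ price: 1.10 });   // '{"price":1.1}'  -- the double never had it
  // Both outputs are valid JSON numbers; a double-based receiver decodes them
  // identically, which is the precise sense in which the encoding is lossy.
]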
JSON does not follow the path of other formats that attempt to dictate tight language-type couplings. In all cases, peers can ultimately choose how they want to internally handle the data provided by JSON. JSON is a pure data format, not a computation prescription, and it won't dictate how computations are performed.
Yeah, yeah -- I know that and said as much. The argument is not about JSON as a syntax spec; it's about what we do in the implementation in ES3.1, where we have to make semantic choices. Including types and typeof, including future-proofing.
AFAICS the most helpful thing in that respect would be to give programmers some expectation about how decimal is likely to be supported, so that they will have the maximum amount of time to future-proof their code, and it will be more likely that it Just Works when decimal starts to be supported by implementations. (Obviously they will still need to test on the implementations that support decimal first.)
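[One defensive pattern along those lines -- a sketch only: the "decimal" typeof result below is one candidate answer to the open question, not a settled one, and Number(aDecimal) rounding to the nearest double is an assumption:

  // Accept a numeric field whether it arrives as a double or as a decimal.
  function asDouble(v) {
    if (typeof v === "number") return v;
    if (typeof v === "decimal") return Number(v); // hypothetical typeof result
    throw new TypeError("expected a numeric value");
  }
  asDouble(JSON.parse('{"n": 2}').n); // 2, today and (one hopes) under Harmony
]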
On Thu, Jan 15, 2009 at 9:24 PM, Brendan Eich <brendan at mozilla.com> wrote:
JSON's intended semantics may be arbitrary precision decimal (the RFC is neither explicit nor specific enough in my opinion; it mentions only "range", not "precision"), but not all real-world JSON codecs use arbitrary precision decimal, and in particular today's JS codecs use IEEE double binary floating point. This "approximates" by default and creates a de-facto standard that can't be compatibly extended without opt-in.
You might find the next link enlightening or perhaps even a pleasant diversion:
www.intertwingly.net/stories/2002/02/01/toInfinityAndBeyondTheQuestForSoapInteroperability.html
Quick summary as it applies to this discussion: perfection is unattainable (duh!) and an implementation which implements JSON numbers as quad decimal will retain more precision than one that implements JSON numbers as double binary (duh!).
- Sam Ruby
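[To put rough numbers behind Sam's point, one JSON text decoded both ways -- a sketch, with values as computed by hand:

  JSON text:   {"n": 123456789012345678.90}
  decimal128:  123456789012345678.90   (34 significant digits: exact here)
  binary64:    123456789012345680      (nearest representable double)
]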
David-Sarah Hopwood wrote:
Brendan Eich wrote:
We are not condemned to repeat history if we pay attention to what went before. JSON implementations in future ES specs cannot by default switch either encoding or decoding to use decimal instead of number.
Of course not, but they can easily provide a non-default switch to do so.
I meant the preceding sentence to apply to decoding. It is simply incorrect to say that JSON implementations in future ES specs could not encode ES decimals as JSON numbers.
On Jan 15, 2009, at 7:16 PM, David-Sarah Hopwood wrote:
See www.kohala.com/start/papers.others/rpc.comments.txt for more.
This analogy fails to reflect the design of JSON.
The design of JSON is fine in isolation from the "facts on the ground".
You seem to be mistaking my argument against Kris Zyp's position (that we can add decimal and encode it by default) for an attack on JSON. But since, as you note here...
JSON does specify a common encoding and type of numbers: arbitrary-precision decimal, encoded as a decimal string.
... JSON does have a single format -- arbitrary-precision decimal (underspecified, in my opinion, and clearly so given how many codecs ignore it) -- so it does avoid "receiver makes it wrong" -- in theory. The lingua franca is BigDecimal or equivalent (not IEEE 754r).
Too bad no JS-based JSON codec, and no ES3.1 draft spec, requires arbitrary-precision decimal as the number type used when encoding and decoding. This makes it impossible to evolve implementations without opt-in.
This is very different from "blat[ting] out the bytes in a number in some
standard byte order". (The difference is not text vs binary; it's using a
standardized encoding rather than the sender's encoding.)
You're preaching to the choir. My argument was against implicit sender-
type-based encoding, which is what we have today for all JS codecs I
know of.
You can argue, if you want, that the common encoding and type
shouldn't have been arbitrary-precision decimal.
I never argued that.
But the JSON RFC says what it says, and for better or worse, it does not mention any binary floating point types, or have any way to explicitly specify multiple number representations.
Yes, I know.
ES-Harmony must support JSON as it is designed; changing JSON is not in scope for TC39, and is quite unlikely to
happen independently.
I never proposed to change JSON. Why are you arguing against that here?
We are not condemned to repeat history if we pay attention to what went before. JSON implementations in future ES specs cannot by default switch either encoding or decoding to use decimal instead of number.
Of course not, but they can easily provide a non-default switch to do so.
That's necessary, and what I clearly advocated. I feel like tapping
the mic ("is this thing on..." tap tap), since somehow you've
managed to spend most of your reply arguing against positions I've not
taken.
They can also encode both ES numbers and ES decimals to JSON numbers, as Kris has already indicated that Dojo plans to do.
Dojo can do whatever it likes, but Harmony should not specify this by
default.
On Jan 15, 2009, at 7:46 PM, David-Sarah Hopwood wrote:
David-Sarah Hopwood wrote:
Brendan Eich wrote:
We are not condemned to repeat history if we pay attention to what went before. JSON implementations in future ES specs cannot by default switch either encoding or decoding to use decimal instead of number.
Of course not, but they can easily provide a non-default switch to do so.
I meant the preceding sentence to apply to decoding. It is simply incorrect to say that JSON implementations in future ES specs could not encode ES decimals as JSON numbers.
It is simply bad manners to assert without proof.
What happens when future codecs send decimals that round badly or lose precision fatally to older codecs? You can say those are just bugs to be fixed by someone, but that dodges responsibility for avoiding the situation in the first place.
You are assuming that "approximating" decimals encoded in JSON but
decoded into doubles is acceptable. I don't think that's a safe
assumption.
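[The everyday face of that unsafe assumption, in today's double-only codecs -- a minimal runnable sketch:

  var cents = JSON.parse('{"a": 0.10, "b": 0.20}');
  cents.a + cents.b === 0.3;   // false: the sum is 0.30000000000000004
  // A decimal sender means exact tenths; a double receiver silently doesn't get them.
]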
On Jan 15, 2009, at 7:28 PM, Sam Ruby wrote:
Quick summary as it applies to this discussion: perfection is unattainable (duh!) and an implementation which implements JSON numbers as quad decimal will retain more precision than one that implements JSON numbers as double binary (duh!).
Duh^2 ;-).
But more than that: discounting the plain fact that on the web at
least, SOAP lost to JSON (Google dropped its SOAP APIs a while ago),
do you draw any conclusions?
My conclusion, crustier and ornerier as I age, is that mixed-mode arithmetic with implicit conversions and "best effort" "approximation" is a botch and a blight. That's why I won't have it in JSON, encoding and decoding.
Brendan Eich wrote:
My conclusion, crustier and ornerier as I age, is that mixed-mode arithmetic with implicit conversions and "best effort" "approximation" is a botch and a blight. That's why I won't have it in JSON, encoding and decoding.
My age differs from yours by a mere few months.
My point was not SOAP specific, but dealt with interop of such things as dates and dollars in a cross-platform setting.
My conclusion is that precision is perceived as a quality of implementation issue. The implementations that preserve the most precision are perceived to be of higher quality than those that don't.
I view any choice that treats binary64 as preferable to decimal128 as choosing both the botch and the blight.
Put another way, if somebody sends you a quantity and you send back the same quantity (i.e., merely round-trip the data), the originator will see it as being unchanged if their (the originator's) precision is less than or equal to that of the partner in this exchange. This leads to a natural ordering of implementations from most-compatible to least.
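[A sketch of that ordering with one echo round-trip, assuming a binary64 peer in the middle with today's codec behavior:

  // decimal128 -> binary64 -> decimal128: the originator sees digits vanish.
  JSON.stringify(JSON.parse('{"n": 2.718281828459045235360287471352662}'));
  // '{"n":2.718281828459045}'
  // binary64 -> decimal128 -> binary64: lossless, since any double printed with
  // 17 significant digits fits exactly in decimal128 and parses back unchanged.
]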
A tangible analogy that might make sense to you, and might not. Ever try rsync'ing to a Windows box? Rsync from Windows to Windows works just fine. Unix to Unix also. As does Windows->Unix->Windows. But Unix->Windows->Unix needs fudge parameters. Do you really want to be the Windows in this equation? :-)
- Sam Ruby
P.S. You asked my opinion, and I've given it. This is something I have an opinion on, but not something I view as an egregious error if the decision goes the other way.
Brendan Eich wrote:
There seems to be a misconception here.
JSON is a general-purpose data description language. It happens to have a syntax that is almost (see below) a subset of ECMAScript's syntax, but it was explicitly designed to be useful across languages, and for data formats that are specified independently of programming language. The de facto bindings of JSON to other languages are therefore just as relevant as its bindings to ECMAScript, in determining what its de facto semantics are.
One of the ways in which JSON syntax is not a subset of ECMAScript syntax, is its definition of numbers. JSON numbers are effectively arbitrary-precision decimals: if you change a JSON number in a given document at any decimal place, then you are changing the meaning of the document, even if the number would round to the same IEEE double value.
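[A minimal illustration of that point with today's JSON.parse:

  var a = JSON.parse('{"x": 0.1}');
  var b = JSON.parse('{"x": 0.10000000000000000001}');
  a.x === b.x;   // true -- two different JSON documents, one IEEE double
]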
The fact that some language bindings may (with specified options) implicitly round numbers to the language's IEEE double type when parsing a JSON document, does not contradict this. In doing so, the binding is giving an approximation of the meaning of the document. That approximation might be good enough for a particular use of JSON; if it isn't, then the application programmer should use a different binding, or different options to the binding.
None of this dictates a specific answer to what 'typeof aDecimal' should return, but it needs to be clarified lest we do violence to JSON's intended semantics by making incorrect assumptions about it.