Arrays with non-integer properties

# Serge LE HUITOUZE (18 years ago)

Hello there,

I have a question that I tried to solve on a more applicative mailing-list, but which seems not to interest anybody there ;-)

That's why I'll give it a try on this forum.

Sorry if it's not exactly the right forum, I'm not familiar with EcmaScript and its various brands, not to mention associated discussion forums/lists. I'm confident, however, that I will find knowledgeable advice here (or, at least, redirection to appropriate discussion).

My questions arise from the last version of the W3C SISR standard (www.w3.org/TR/semantic-interpretation). It is used to normalize how one writes (ECMA-327) code to build semantic information from a voice grammar.

More precisely, my question is about serialization. A chapter of the aforementioned standard is devoted to serializing an ECMA object in an XML form (a.k.a. "EMMA form" in SISR wording). Throughout this chapter, questions arise as to how some ECMA objects are serialized in "ECMA" format: it seems to be such an obvious (sub-)problem in the context of the SISR standard that what governs this serialisation is not even mentioned. I make the hypothesis that this format is the same as the way one writes a literal value in an ECMA program, though I can't get any confirmation on the w3.voice forum...

[BTW, I don't know if ECMA's literal syntax is exactly the same as JSON, I'm interested in any clarification from you on this particular point.]

At some point of SISR chapter 7, more precisely item 5 in section 7.1, I find the following wording concerning EMMA (i.e. XML) serialization of ECMA arrays: "Any other properties of an Array object, for instance the keys of an associative array (e.g. a["prop"]), are subject to the same transformation rules as the regular properties of an object. In a sparse array, only those elements which hold defined values will be serialized."

Though this text describes the XML serialization, it clearly has implications for ECMA serialization as well: indeed, it asserts as perfectly acceptable that an array object can have non-integer properties in addition to "regular" index properties. The question is then: How do you ECMA-serialize an array object having, in addition to index properties, non-integer properties?

From what I understand of ECMA-262 (which is little...) the following SISR fragment:

    v1=new Array; v1.push("A"); v1[3]="B"; $.v1=v1;

should yield the following ECMA serialization:

    {v1:["A",,,"B"]}

So would the equivalent ECMA fragment using the literal syntax:

    v1=["A",,,"B"]; $.v1=v1;

This is (hopefully) straightforward.

But then, what would the following SISR fragment yield?

    v1=new Array; v1.push("A"); v1[3]="B"; v1["prop"]="C"; $.v1=v1;

From my (short, but nevertheless painful) search in the ECMA reference document, I was not able to find how you can specify literally an ECMA array with additional non-integer properties.

The only answer I got on "w3.voice" was one suggesting to use the object notation to create "something" that mixes integer and non-integer properties. However, this "something" is nothing more than a regular object (though slightly unusual, since it has integer properties), not an array object with non-integer properties. It is thus obviously (at least, that's how I analyse it) not the way one should serialize such an array.
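To make the distinction concrete, here is a minimal sketch comparing the two constructions. The SISR-specific `$` is left out, and the names `o1`, `v1IsArray` and so on are only illustrative; this is plain ECMAScript and is not taken from the SISR text.

```javascript
// Array built as in the SISR fragment above (without the SISR-specific "$").
var v1 = new Array();
v1.push("A");
v1[3] = "B";
v1["prop"] = "C";

// The object-notation alternative: same property names, but a plain object.
var o1 = { 0: "A", 3: "B", prop: "C" };

var v1IsArray = v1 instanceof Array; // true
var o1IsArray = o1 instanceof Array; // false

var v1Length = v1.length; // 4: arrays track (highest index + 1)
var o1Length = o1.length; // undefined: plain objects have no automatic length
```

Both values answer identically to `x[0]`, `x[3]` and `x.prop`; what the object form loses is the array-specific behaviour (`length`, `push`, `instanceof Array`).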

Comments anyone?

Thanks in advance.

--Serge Le Huitouze

No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.488 / Virus Database: 269.14.10/1070 - Release Date: 14/10/2007 09:22

# liorean (18 years ago)

On 15/10/2007, Serge LE HUITOUZE <slehuitouze at telisma.com> wrote:

More precisely, my question is about serialization. A chapter of the aforementioned standard is devoted to serializing an ECMA object in an XML form (a.k.a. "EMMA form" in SISR wording). Throughout this chapter, questions arise as to how some ECMA objects are serialized in "ECMA" format: it seems to be such an obvious (sub-)problem in the context of the SISR standard that what governs this serialisation is not even mentioned. I make the hypothesis that this format is the same as the way one writes a literal value in an ECMA program, though I can't get any confirmation on the w3.voice forum...

The syntax of literals in ECMAScript 3 is far from capable of representing the full set of object structures ECMAScript can contain. Some examples would include:

  • Function literals cannot represent the closure of a function object: the scope chain goes missing when serialising.
  • Host methods, objects and constructors may be serialisable but in practice are not.
  • Wrappers for primitives have no literal syntax at all.
  • Circular references on objects are not allowed in literals. (But moz has sharp variables for this.)
  • Getter and setter functionality is not serialisable.
  • Enumerability, removability, writability for properties are not serialisable.
  • Prototype links are not serialisable.
  • Array literals don't allow for properties with arbitrary non-uint32 names.
  • Arrays may contain properties whose name is so high (e.g. 0xfffffffe) that serialising the entire array as a string with comma separated elisions would require all of physical memory and then some.

Illustrating this last point - try this example (displaying the result in a text node, which forces the array to be serialised as a string):

var array=[];
array[0xfffffffe]=0xfffffffe;
array;

On my system it leads to:

  • Safari eating all RAM, then paging to disk, and then crashing.
  • Opera eating memory up to a point (~1GiB in my test), then silently failing without error.
  • IE stringifying it as "undefined" but otherwise behaving reasonably.
  • Mozilla... very slowly eating successively more and more RAM, pegging the CPU, and otherwise being entirely unresponsive. (Didn't want to wait it out - in a quarter of an hour it had slowly grown from ~100MiB to ~300MiB without showing any tendency to stop growing. But the growth curve is markedly jagged, one second consuming 50MiB more than the next second.)
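For contrast, here is a small sketch of inspecting the same array without ever converting it to a string. It only illustrates that the array holds a single element; the variable names are made up for the example, and it says nothing about how any of the browsers above implement stringification.

```javascript
var array = [];
array[0xfffffffe] = 0xfffffffe;

// length is (highest index + 1), even though only one element exists.
var length = array.length; // 4294967295, i.e. 0xffffffff

// for...in visits only the properties that actually exist,
// so it does not walk the ~4 billion missing elements.
var names = [];
for (var name in array) {
  if (Object.prototype.hasOwnProperty.call(array, name)) {
    names.push(name);
  }
}
// names now holds the single property name "4294967294"
```

A serialiser that walks existing properties this way has work proportional to the number of elements rather than to `length`, which is exactly what the comma-separated elision form cannot offer.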

[BTW, I don't know if ECMA's literal syntax is exactly the same as JSON, I'm interested in any clarification from you on this particular point.]

JSON is a subset of ECMAScript literals. It doesn't deal with identifiers, reserved names, hexadecimal literals, optionally octal literals, function literals, regexp literals etc.
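A rough way to see the subset relation is to feed the same texts to a JSON parser and to the ECMAScript evaluator. This sketch assumes an engine that provides the built-in `JSON.parse`, which postdates this thread; `parsesAsJson` is a hypothetical helper written for the example.

```javascript
// Hypothetical helper: true if `text` is accepted by the JSON grammar.
function parsesAsJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

var results = {
  quotedKey: parsesAsJson('{"v1":["A","B"]}'), // true: valid JSON
  bareKey: parsesAsJson('{v1:["A","B"]}'),     // false: identifier as property name
  hexNumber: parsesAsJson('[0xff]'),           // false: hexadecimal literal
  elision: parsesAsJson('["A",,,"B"]')         // false: elided array elements
};

// The same texts are fine as ECMAScript literals
// (parenthesised so "{" is not read as a block statement).
var viaEval = eval('({v1:["A",,,"B"], n:0xff})');
// viaEval.v1.length is 4 and viaEval.n is 255
```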

At some point of SISR chapter 7, more precisely item 5 in section 7.1, I find the following wording concerning EMMA (i.e. XML) serialization of ECMA arrays: "Any other properties of an Array object, for instance the keys of an associative array (e.g. a["prop"]), are subject to the same transformation rules as the regular properties of an object. In a sparse array, only those elements which hold defined values will be serialized."

That text is a bit wrong. It's the element and not the value that is defined or not. A defined element can have undefined as its value. Serialisation should differentiate the two, because the elements that are not defined can be elided in the serialisation while the elements that are defined cannot be elided.
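The difference between a missing element and an element whose value is undefined can be observed with the `in` operator. A minimal sketch, with variable names chosen only for illustration:

```javascript
// Index 1 is a defined element holding undefined; index 2 is elided (a hole).
var a = ["A", undefined, , "B"];

var readDefined = a[1]; // undefined
var readHole = a[2];    // undefined as well: reading cannot tell them apart

var hasDefined = 1 in a; // true: the element exists
var hasHole = 2 in a;    // false: the element does not exist

var length = a.length;   // 4 in both cases
```

Only the element at index 2 could be written as an elision without changing what `in` reports after a round trip.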

Though this text describes the XML serialization, it clearly has implications for ECMA serialization as well: indeed, it asserts as perfectly acceptable that an array object can have non-integer properties in addition to "regular" index properties.

Nothing surprising there. In fact, ECMAScript 3 itself does this: the return arrays from regex matches have named properties.
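As a quick illustration of that point, the result of `RegExp.prototype.exec` is a genuine array that also carries the named properties `index` and `input`. The pattern and input string here are arbitrary:

```javascript
var match = /b(c)/.exec("abcd");

var isArray = match instanceof Array; // true
var length = match.length;            // 2: the full match and one capture group
var whole = match[0];                 // "bc"
var group = match[1];                 // "c"

// Named (non-index) properties living on the same array object:
var index = match.index;              // 1: position of the match in the input
var input = match.input;              // "abcd"
```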

The question is then: How do you ECMA-serialize an array object having, in addition to index properties, non-integer properties?

There's no standard ECMAScript serialisation for this. It's not covered by any literal syntax, you need a program snippet to represent it.
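One possible shape for such a program snippet is sketched below. `arrayToSnippet` is a hypothetical helper, not part of any standard; it assumes the property values are strings, numbers, booleans or null, and it relies on the built-in `JSON.stringify` (which postdates this thread) purely to quote names and values.

```javascript
// Hypothetical helper: emits ECMAScript source text that rebuilds `arr`,
// preserving holes, length and non-index properties.
function arrayToSnippet(arr) {
  var lines = ["(function () {", "  var a = [];"];
  // Restore length first so trailing holes survive.
  lines.push("  a.length = " + arr.length + ";");
  for (var name in arr) {
    if (Object.prototype.hasOwnProperty.call(arr, name)) {
      lines.push(
        "  a[" + JSON.stringify(name) + "] = " + JSON.stringify(arr[name]) + ";"
      );
    }
  }
  lines.push("  return a;", "})()");
  return lines.join("\n");
}

var v1 = [];
v1.push("A");
v1[3] = "B";
v1["prop"] = "C";

var source = arrayToSnippet(v1); // a function expression, not a literal
var copy = eval(source);         // evaluating it yields an equivalent array
```

The result is still an `Array` with `length` 4, holes at indices 1 and 2, and `prop` set to "C", but the serialised form is executable code rather than a literal, which is the trade-off described above.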

From my (short, but nevertheless painful) search in ECMA reference document, I was not able to find how you can specify literally an ECMA array with additional non-integer properties.

You cannot.

The only answer I got on "w3.voice" was one suggesting to use the object notation to create "something" that mixes integer and non-integer properties. However, this "something" is nothing more than a regular object (though slightly unusual, since it has integer properties), not an array object with non-integer properties. It is thus obviously (at least, that's how I analyse it) not the way one should serialize such an array.

It cannot be serialised using JSON/ES3 literal syntax. Not if you want to retain both the array-ness and the properties with non-array-ish names.
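For what it is worth, this is what the JSON route does to such an array in an engine with the built-in `JSON` object (an assumption, since that object postdates this thread). The sketch only demonstrates the loss; variable names are illustrative.

```javascript
var v1 = ["A", , , "B"];
v1.prop = "C";

// JSON keeps the array-ness but only looks at indices 0..length-1.
var json = JSON.stringify(v1); // '["A",null,null,"B"]'

var roundTripped = JSON.parse(json);
var keptArrayness = roundTripped instanceof Array; // true
var keptProp = "prop" in roundTripped;             // false: "prop" was dropped
var holeBecameElement = 1 in roundTripped;         // true: the hole is now null
```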

# Serge LE HUITOUZE (18 years ago)

David "liorean" Andersson wrote:

At some point of SISR chapter 7, more precisely item 5 in section 7.1, I find the following wording concerning EMMA (i.e. XML) serialization of ECMA arrays: "Any other properties of an Array object, for instance the keys of an associative array (e.g. a["prop"]), are subject to the same transformation rules as the regular properties of an object. In a sparse array, only those elements which hold defined values will be serialized."

That text is a bit wrong. It's the element and not the value that is defined or not. A defined element can have undefined as its value. Serialisation should differentiate the two, because the elements that are not defined can be elided in the serialisation while the elements that are defined cannot be elided.

It's yet another example of the sloppiness of the SISR standard when it comes to its connections with ECMA...

Upon reading, it is indeed unclear whether the spec is simply nonsensical, or whether it's dealing with properties/elements having the special value "undefined"...

[...] It is thus obviously (at least, that's how I analyse it) not the way one should serialize such an array.

It cannot be serialised using JSON/ES3 literal syntax. Not if you want to retain both the array-ness and the properties with non-array-ish names.

I find this pretty embarrassing: the standard discusses how to serialize in XML format the "non-array-ish" properties of an array, but it so happens that it's impossible to render them in ECMA format!

--Serge
