More string indexing semantics issues

# Allen Wirfs-Brock (17 years ago)

Assuming the string index semantics I defined in my previous message, what should be the effect of setting a numeric property on a string when the property name is a valid character position in the string?

For example:

var s = new String("abc");

s[1]; /* not valid in ES3, should yield "b" under extended semantics */

s[1] = "x"; /* valid in ES3, but has nothing to do with the string value */

s[1]; /* both in ES3 and with extended semantics would yield "x" */

Should the assignment, s[1] = "x", still be valid in the present if we support string indexing?

Argument for allowing: It's backwards compatible. There may be existing ES3 programs that depend upon it.

Argument for disallowing: Allowing characters of a string to be accessed using property access syntax makes the string elements appear as if they are actually properties. There appears to be a "joining" of s[1] and s.charAt(1). Since the value of a string is immutable, any property that is "joined" to an element of a string value should be a read-only property.

My inclination would be to disallow, but preserving existing code is one of our top priorities. What do the existing implementations that support indexed access to string elements do? If some of them disallow defining such properties then there is probably enough existing divergence that the compatibility issue doesn't really apply.

Second issue: I defined string index property access in such a way that such properties appear to be non-enumerable. Is this reasonable? It's inconsistent with arrays. However, saying that they are enumerable implies that for-in should iterate over string element indexes, which it now doesn't do. Also, what about meta operations like Object.prototype.hasOwnProperty? My current semantics would cause it to report true for string element indexes. Is this reasonable? It's another set of incompatibilities.
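The questions in this message can be checked against whatever engine is at hand. What follows is a minimal probe, not part of the original message: `probeStringIndex` is a hypothetical helper name, and the sketch assumes an engine that provides `Reflect.set`, which postdates this thread. The values it reports depend on the engine.

```javascript
// Hypothetical helper (not from the thread): reports how the running engine
// treats an index property on a String wrapper object.
function probeStringIndex(text, index) {
  var s = new String(text);
  var before = s[index];

  // Reflect.set returns false on a rejected assignment instead of throwing,
  // so the probe behaves the same in strict and non-strict code.
  var accepted = Reflect.set(s, index, "x");

  // Collect whatever for-in visits, to see whether indexes are enumerable.
  var enumerated = [];
  for (var key in s) {
    enumerated.push(key);
  }

  return {
    before: before,
    after: s[index],
    assignmentAccepted: accepted,
    hasOwn: Object.prototype.hasOwnProperty.call(s, index),
    enumerated: enumerated
  };
}

console.log(probeStringIndex("abc", 1));
```

Comparing `before` with `after` answers the first question (does the assignment stick?), while `hasOwn` and `enumerated` answer the second.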

# Garrett Smith (17 years ago)

2008/6/24 Allen Wirfs-Brock <Allen.Wirfs-Brock at microsoft.com>:

Assuming the string index semantics I defined in my previous message, what should be the effect of setting a numeric property on a string when the property name is a valid character position in the string?

For example:

var s = new String("abc")

s[1];   /* not valid in ES3, should yield "b" under extended semantics*/

s[1] = "x";   /* valid in ES3, but has nothing to do with the string value */

s[1];   /* both in ES3 and with extended semantics would yield "x" */

Should the assignment, s[1] = "x", still be valid in the present if we support string indexing?

Argument for allowing: It's backwards compatible. There may be existing ES3 programs that depend upon it.

Argument for disallowing: Allowing characters of a string to be accessed using property access syntax makes the string elements appear as if they are actually properties. There appears to be a "joining" of s[1] and s.charAt(1). Since the value of a string is immutable, any property that is "joined" to an element of a string value should be a read-only property.

My inclination would be to disallow, but preserving existing code is one of our top priorities. What do the existing implementations that support indexed access to string elements do? If some of them disallow defining such properties then there is probably enough existing divergence that the compatibility issue doesn't really apply.

Webkit seems to have implemented a specialized [[Put]] for strings. When the property is numeric, the property setting fails silently. This can be observed by an example:-

javascript:(function(){var a = new String("Alan"); a[0] = "E"; alert(a[0]); })();

FF2: "E" Sf3: "A" OP9: "E" IE7: "E"

It seems to me that Webkit's behavior is a bug.

This is something that I blogged about last year. dhtmlkitchen.com/?category=/JavaScript/&date=2007/10/21/&entry=Iteration-Enumeration-Primitives-and-Objects#magic-string-source

Second issue:

I defined string index property access in such a way that such properties appear to be non-enumerable. Is this reasonable? It's inconsistent with arrays. However, saying that they are enumerable implies that for-in should iterate over string element indexes, which it now doesn't do. Also, what about meta operations like Object.prototype.hasOwnProperty? My current semantics would cause it to report true for string element indexes. Is this reasonable? It's another set of incompatibilities.

It seems that nobody should be using a for in loop on a String or SV.

String.prototype.trim = ...

Would be an obvious reason. Using hasOwnProperty might be a consideration, but why would anyone want to?

Garrett
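Garrett's terse point about `String.prototype.trim` appears to be that any method a script assigns to `String.prototype` is enumerable, and so shows up in a for-in loop over a string. A small sketch of that effect follows; `shout` and `forInKeys` are hypothetical names chosen for the illustration and do not come from the thread.

```javascript
// Plain assignment creates an enumerable property on String.prototype.
// `shout` is a made-up name used only for this illustration.
String.prototype.shout = function () {
  return this.toUpperCase() + "!";
};

// Collects every key that a for-in loop visits on obj.
function forInKeys(obj) {
  var keys = [];
  for (var key in obj) {
    keys.push(key);
  }
  return keys;
}

var keysSeen = forInKeys(new String("ab"));
console.log(keysSeen); // includes "shout" alongside any enumerable indexes

// Remove the illustration so later code is unaffected.
delete String.prototype.shout;
```

This is why a for-in loop over a string is fragile regardless of how the index properties themselves are defined.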

# Maciej Stachowiak (17 years ago)

On Jun 24, 2008, at 11:45 PM, Garrett Smith wrote:

Webkit seems to have implemented a specialized [[Put]] for strings. When the property is numeric, the property setting fails silently. This can be observed by an example:-

javascript:(function(){var a = new String("Alan"); a[0] = "E"; alert(a[0]); })();

FF2: "E" Sf3: "A" OP9: "E" IE7: "E"

It seems to me that Webkit's behavior is a bug.

We could easily change this, but why do you think the behavior is a
bug (other than not matching Firefox)? Your blog post did not state a
reason either.

My expectation would be that numeric index properties of a String
object are read-only, much like the length property. For example, the
following gives 3 in both Safari and Firefox:

<script>

var s = new String("abc");

s.length = 55;

alert(s.length);

</script>

, Maciej
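The analogy between `length` and the index properties can be inspected directly in engines that provide `Object.getOwnPropertyDescriptor`, an ES5 addition that postdates this message. A minimal sketch, in which `describe` is a hypothetical helper and the printed attributes depend on the engine:

```javascript
// Hypothetical helper: summarizes the attributes of one own property.
function describe(obj, name) {
  var d = Object.getOwnPropertyDescriptor(obj, name);
  if (!d) {
    return name + ": no such own property";
  }
  return name + ": value=" + JSON.stringify(d.value) +
    " writable=" + d.writable +
    " enumerable=" + d.enumerable +
    " configurable=" + d.configurable;
}

var wrapped = new String("abc");
console.log(describe(wrapped, "length"));
console.log(describe(wrapped, "0"));
console.log(describe(wrapped, "3")); // one past the last character
```

If `length` and `0` both report `writable=false`, the engine treats index properties as read-only in the same way as `length`.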

# Allen Wirfs-Brock (17 years ago)

Garrent: Thanks for the pointer to your analysis. Do you have any others that identify issues that could potentially be fixed in ES3.1?

I think in this case I have to agree with Maciej...Webkit appears to be doing the "right thing" by making a string appear to consistently have a set of numerically named readonly properties that exactly correspond to the elements of the string value.

In a clean-slate world, I think that should be the end of the discussion. However, we have backwards compatibility issues to consider. By the book ES3 allows numerically named properties to be added to String objects that are unrelated to the string value, and 2 out of the 3 widely used browser-based implementations that support property style access to the string value also allow such properties to be added. Only Webkit deviates from this. Right or wrong, from a pure compatibility perspective preserving that capability would be important if we think that there is any significant usage of it. The fact that Safari seems to be getting away with its implementation without being badgered into conformance suggests that there probably isn't any such significant usage.

So, unless someone has some evidence that it is going to "break the web" I'm going to leave my ES3.1 specification the way it currently is written, which implements the observed behavior of Webkit.

Maciej: I assume you haven't heard of any significant web content being broken by this behavior. Garrett: Do you know of anything other than your test case that would be impacted if the standard adopted the Webkit behavior?

# liorean (17 years ago)

On 25/06/2008, Allen Wirfs-Brock <Allen.Wirfs-Brock at microsoft.com> wrote:

I think in this case I have to agree with Maciej...Webkit appears to be doing the "right thing" by making a string appear to consistently have a set of numerically named readonly properties that exactly correspond to the elements of the string value.

Well, the way I think about String objects and strings:

  • A string can't have properties of its own at all, i.e. can be thought of as having a noop setter and a custom getter for indices and length. Any property not found by the getter should get delegated to String.prototype.
  • A String object is just a normal object delegating to a string. (Could probably reuse the [[Prototype]] internal property if implemented that way...) Thus, any properties set are placed on the wrapper object and would as a result of that shadow the getter and setter on the string.

In a clean-slate world, I think that should be the end of the discussion. However, we have backwards compatibility issues to consider. By the book ES3 allows numerically named properties to be added to String objects that are unrelated to the string value, and 2 out of the 3 widely used browser-based implementations that support property style access to the string value also allow such properties to be added. Only Webkit deviates from this. Right or wrong, from a pure compatibility perspective preserving that capability would be important if we think that there is any significant usage of it.

I expect almost all usage of "new String" to be from people who either come from Java or who have not learnt JavaScript beyond the rookie stage yet. And maybe one or two library writers who are too clever for their own good...

The fact that Safari seems to be getting away with its implementation without being badgered into conformance suggests that there probably isn't any such significant usage.

Neither the direct access of characters in the string by indexed property lookup nor the use of String objects is common, so the overlap is in all probability diminutive.
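The two-layer way of thinking described in the bullets above can be sketched as a toy model. This is not engine code; `makeStringModel` and its `get`/`set` methods are names invented for the sketch.

```javascript
// A toy model of the two layers liorean describes; not engine code.
function makeStringModel(value) {
  // Layer 1: the string itself. Custom getter for indexes and length;
  // anything else is delegated to String.prototype. It has no setter.
  function primitiveGet(name) {
    if (name === "length") {
      return value.length;
    }
    var i = Number(name);
    var isIndex = String(i) === String(name) && Math.floor(i) === i &&
      i >= 0 && i < value.length;
    return isIndex ? value.charAt(i) : String.prototype[name];
  }

  // Layer 2: the wrapper object. Its own properties shadow the string.
  var own = Object.create(null);
  return {
    get: function (name) {
      return name in own ? own[name] : primitiveGet(name);
    },
    set: function (name, v) {
      own[name] = v;
    }
  };
}

var model = makeStringModel("abc");
console.log(model.get(1));        // "b"
model.set(1, "x");
console.log(model.get(1));        // "x"
model.set("length", 55);
console.log(model.get("length")); // 55
```

Note that in this model an assignment to `length` sticks, which is one place where it parts ways with the Safari and Firefox behavior reported earlier in the thread.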

# Brendan Eich (17 years ago)

On Jun 25, 2008, at 9:37 AM, Allen Wirfs-Brock wrote:

I think in this case I have to agree with Maciej...Webkit appears
to be doing the "right thing" by making a string appear to
consistently have a set of numerically named readonly properties
that exactly correspond to the elements of the string value.

IIRC, we intentionally allowed overriding indexed properties to be
set on String objects, for backward compatibility.

So, we chickened out. Minority share browsers facing a rare use-case
conflicting with an extension sometimes do that. It's hard to know
when getting away with something in such a setting means you can
continue to get away clean, never mind whether a new version of the
standard should codify the minority behavior.

In a clean-slate world, I think that should be the end of the
discussion. However, we have backwards compatibility issues to
consider. By the book ES3 allows numerically named properties to
be added to String objects that are unrelated to the string value,
and 2 out of the 3 widely used browser-based implementations that
support property style access to the string value also allow such
properties to be added. Only Webkit deviates from this. Right or
wrong, from a pure compatibility perspective preserving that
capability would be important if we think that there is any
significant usage of it. The fact that Safari seems to be
getting away with its implementation without being badgered into
conformance suggests that there probably isn't any such significant
usage.

Probably. But we don't know enough.

So, unless someone has some evidence that it is going to "break the
web" I'm going to leave my ES3.1 specification the way it currently
is written, which implements the observed behavior of Webkit.

This isn't the only way to proceed. Changing a pre-release, but
widely used, Firefox build to match Safari would help. Changing IE
beta-next as well would help even more. None of this would be
decisive, but it would trump any assertions we can make today.

Again I am concerned with making an ES3.1 with zero implementations
before the standard is finalized.

# Mike Shaver (17 years ago)

On Wed, Jun 25, 2008 at 3:15 PM, liorean <liorean at gmail.com> wrote:

On 25/06/2008, Allen Wirfs-Brock <Allen.Wirfs-Brock at microsoft.com> wrote:

I think in this case I have to agree with Maciej...Webkit appears to be doing the "right thing" by making a string appear to consistently have a set of numerically named readonly properties that exactly correspond to the elements of the string value.

Well, the way I think about String objects and strings:

  • A string can't have properties of its own at all, i.e. can be thought of as having a noop setter and a custom getter for indices and length. Any property not found by the getter should get delegated to String.prototype.
  • A String object is just a normal object delegating to a string. (Could probably reuse the [[Prototype]] internal property if implemented that way...) Thus, any properties set are placed on the wrapper object and would as a result of that shadow the getter and setter on the string.

That's not how they work today, though, at least as far as the .length property goes:

js> s = new String("foo")

foo

js> s.length

3

js> s.length = 5

5

js> s.length

3

js>

And of course it has to delegate to an object that has all the right methods on it.

Are "numerically-named" properties >= length settable?

Neither the direct access of characters in the string by indexed property lookup nor the use of String objects is common, so the overlap is in all probability diminutive.

Direct use of characters is pretty common, I think, but you sound like you are stating fact rather than opinion, so I'm prepared to be swayed by your data. String objects are very commonly created[*], by people calling methods or accessing properties like .length on string primitives.

[*] we optimize the object creation away there for built-ins, and I believe that WebKit is about to do the same thing, but that doesn't -- can't! -- affect the semantics of object creation, and the relationship to String.prototype.

Mike

# liorean (17 years ago)

On Wed, Jun 25, 2008 at 3:15 PM, liorean <liorean at gmail.com> wrote:

Well, the way I think about String objects and strings:

  • A string can't have properties of its own at all, i.e. can be thought of as having a noop setter and a custom getter for indices and length. Any property not found by the getter should get delegated to String.prototype.
  • A String object is just a normal object delegating to a string. (Could probably reuse the [[Prototype]] internal property if implemented that way...) Thus, any properties set are placed on the wrapper object and would as a result of that shadow the getter and setter on the string.

That's not how they work today, though, at least as far as the .length property goes:

js> s = new String("foo")

foo

js> s.length

3

js> s.length = 5

5

js> s.length

3

js>

And of course it has to delegate to an object that has all the right methods on it.

Yes, I had thought of that too, but then I had already pressed the send button...

Are "numerically-named" properties >= length settable?

In saf3.1.1, ff3.0 and op9.50 they are, yes.
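A quick way to repeat this check in a current engine, offered as a sketch: it assumes `Reflect.set` is available, which was not the case in the browsers listed here, and the variable names are invented for the illustration.

```javascript
// Sketch: assign to an index past the end of the string value.
var padded = new String("foo");
var pastEnd = padded.length + 7; // index 10, well beyond the last character

// Reflect.set is used so the outcome is visible as a boolean rather than
// depending on strict-mode behavior.
var stored = Reflect.set(padded, pastEnd, "extra");

console.log(stored, padded[pastEnd], padded.length);
```

If `stored` is true and `padded.length` is unchanged, the engine lets the wrapper carry extra numeric properties without touching the string value.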

Neither the direct access of characters in the string by indexed property lookup nor the use of String objects is common, so the overlap is in all probability diminutive.

Direct use of characters is pretty common, I think, but you sound like you are stating fact rather than opinion, so I'm prepared to be swayed by your data. String objects are very commonly created[*], by people calling methods or accessing properties like .length on string primitives.

I'm not stating that from any collected data, no. It was purely based on three things:

  • The only scripts I have seen using it both happen to be made by advanced coders and happen to not have IE in their target audience for some reason (extension, widget, greasemonkey script, userjs, bookmarklet for some specific browser etc.).
  • IE does not support it so logically it wouldn't be present in very much code on the web.
  • I can't recall ever seeing anybody on a forum or a mailing list actually having a problem related to use of index lookup directly on strings. And I'm sure that if it was common outside the developers that do not have IE in their target audience at all, then there would be people having trouble with it.

Also, while String objects are created as temporaries when looking up a property on a string, how many of those properties or methods actually return a String object? The only situation I can think of off the top of my head where you're going to actually have a String object to deal with in ES3 is if you're extending the String.prototype object with new methods.
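The case mentioned above, a method added to String.prototype receiving a String object, can be seen with a short sketch. `kindOfThis` is a hypothetical method name. The sketch builds the method with the Function constructor because a function created that way is non-strict unless its own body opts in, and it is non-strict code that wraps a primitive `this` in an object.

```javascript
// `kindOfThis` is a made-up method added only for this illustration.
// The body has no "use strict" directive, so `this` arrives as an object.
String.prototype.kindOfThis = new Function("return typeof this;");

var primitive = "foo";
var seenKind = primitive.kindOfThis();

console.log(typeof primitive); // "string"
console.log(seenKind);         // "object"

// Remove the illustration so later code is unaffected.
delete String.prototype.kindOfThis;
```

The script never wrote `new String`, yet the method body was handed a String object, which is the situation described in the paragraph above.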

[*] we optimize the object creation away there for built-ins, and I believe that WebKit is about to do the same thing, but that doesn't -- can't! -- affect the semantics of object creation, and the relationship to String.prototype.

Yes, I know my way of looking at it is not quite what is happening. But for a script writer, it's a good enough model of thought about the relation between the primitive and compound objects.

# liorean (17 years ago)

On 25/06/2008, liorean <liorean at gmail.com> wrote:

Also, while String objects are created as temporaries when looking up a property on a string, how many of those properties or methods actually return a String object? The only situation I can think of off the top of my head where you're going to actually have a String object to deal with in ES3 is if you're extending the String.prototype object with new methods.

A distinction I was going to make there fell away when I rewrote an awkward formulation...

s/String object to deal with/String object that you haven't explicitly created using "new String" to deal with/

# Maciej Stachowiak (17 years ago)

On Jun 25, 2008, at 9:37 AM, Allen Wirfs-Brock wrote:

Garrent: Thanks for the pointer to your analysis. Do you have any
others that identify issues that could potentially be fixed in ES3.1?

I think in this case I have to agree with Maciej...Webkit appears to
be doing the "right thing" by making a string appear to consistently
have a set of numerically named readonly properties that exactly
correspond to the elements of the string value.

In a clean-slate world, I think that should be the end of the
discussion. However, we have backwards compatibility issues to
consider. By the book ES3 allows numerically named properties to be
added to String objects that are unrelated to the string value, and
2 out of the 3 widely used browser-based implementations that
support property style access to the string value also allow such
properties to be added. Only Webkit deviates from this. Right or
wrong, from a pure compatibility perspective preserving that
capability would be important if we think that there is any
significant usage of it. The fact that Safari seems to be getting
away with its implementation without being badgered into conformance
suggests that there probably isn't any such significant usage.

So, unless someone has some evidence that it is going to "break the
web" I'm going to leave my ES3.1 specification the way it currently
is written, which implements the observed behavior of Webkit.

Maciej: I assume you haven't heard of any significant web content
being broken by this behavior.

I have not seen any reports of such problems. If it were common to
put random numeric properties on String objects, I expect we would
have had a bug report by now.

, Maciej

# Garrett Smith (17 years ago)

On Wed, Jun 25, 2008 at 1:52 PM, Maciej Stachowiak <mjs at apple.com> wrote:

On Jun 25, 2008, at 9:37 AM, Allen Wirfs-Brock wrote:

Garrent:

It's Garrett, BTW.

I have not seen any reports of such problems. If it were common to put random numeric properties on String objects, I expect we would have had a bug report by now.

Why? Are there many Webkit-only applications? Do these applications take advantage of string indexing via property access?

# Garrett Smith (17 years ago)

On Wed, Jun 25, 2008 at 1:49 PM, Maciej Stachowiak <mjs at apple.com> wrote:

On Jun 25, 2008, at 1:33 PM, Garrett Smith wrote:

On Wed, Jun 25, 2008 at 1:25 AM, Maciej Stachowiak <mjs at apple.com> wrote:

On Jun 24, 2008, at 11:45 PM, Garrett Smith wrote:

(putting this back on the list; it contains nothing personal).

None of this explains why you think the WebKit behavior is a bug.

It seems like a bug because it prevents property assignment to String objects. I also said something along the lines of: "I'm not overly concerned with that."

You seem to be hung up on "specialized [[Put]]" but I don't understand what that has to do with anything.

I was trying to answer the OP (Allen's) question: "Should the assignment, s[1] = "x", still be valid in the present if we support string indexing?" Which is a natural way of asking about [[Put]]/ [[CanPut]].

ISTM the heart of the problem is to allow special [[Get]] access to characters in the String; the problem itself has nothing to do with [[Put]]. However, Allen asked a question about real-world implementations. Webkit appears to me to have implemented a modified [[Put]]. I'm trying to understand what Webkit does, and it seems related to Allen's question:

"Should the assignment, s[1] = "x", still be valid in the present if we support string indexing?"

What Webkit does produces different results from what Mozilla and Opera do.

In spec terms, WebKit's behavior can be explained in terms of strings having additional DontDelete ReadOnly properties.

Let me get this straight:

Webkit's behavior can be explained in terms of String objects having additional properties with numeric names and the attributes {DontDelete ReadOnly}

Is that what you meant?

The Mozilla behavior can be explained as strings having those same additional properties, but they are not ReadOnly. In both cases, index properties past the last character do not exist ahead of time.

My observations indicate otherwise. Webkit does not appear to create additional properties to String objects.

javascript:alert(Object("foo").hasOwnProperty(0));

FF2 - true Sf3 - false Op9 - false

Where does the "0" property exist, Maciej? Is this bug related to hasOwnProperty?

It appears to me that Mozilla and Opera and Webkit all implement a specialized [[Get]], where Opera and Mozilla do:

  1. Look for property P on String object.
  2. Look for String instance charAt( P )
  3. Look in String prototype.

Webkit does:-

  1. Look for String instance charAt( P )
  2. Call the [[Get]] method on S with argument P.
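The two lookup orders listed above can be sketched as plain functions. These are models rather than engine code, every name in the block is invented for the sketch, and the sample data reuses the "Alan" test from earlier in the thread.

```javascript
// Returns the character at p when p is a valid index, otherwise undefined.
function charAtIndex(value, p) {
  var i = Number(p);
  var isIndex = String(i) === String(p) && Math.floor(i) === i &&
    i >= 0 && i < value.length;
  return isIndex ? value.charAt(i) : undefined;
}

// Ordinary lookup: own properties, then the prototype's properties.
function ordinaryGet(own, proto, p) {
  return Object.prototype.hasOwnProperty.call(own, p) ? own[p] : proto[p];
}

// Order attributed to Opera and Mozilla: own property, character, prototype.
function getOwnFirst(own, value, proto, p) {
  if (Object.prototype.hasOwnProperty.call(own, p)) {
    return own[p];
  }
  var ch = charAtIndex(value, p);
  return ch !== undefined ? ch : proto[p];
}

// Order attributed to Webkit: character first, then an ordinary lookup.
function getCharFirst(own, value, proto, p) {
  var ch = charAtIndex(value, p);
  return ch !== undefined ? ch : ordinaryGet(own, proto, p);
}

// The "Alan" test, with a[0] = "E" stored as an own property and nothing
// relevant on the prototype.
var ownProps = { 0: "E" };
console.log(getOwnFirst(ownProps, "Alan", {}, 0));  // "E"
console.log(getCharFirst(ownProps, "Alan", {}, 0)); // "A"
```

The two orders reproduce the "E" and "A" results reported for that test. The model takes no position on whether Webkit stores the assigned value at all, since a character-first lookup hides it either way.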

javascript:var f = Object("1234567");void(String.prototype[0] = "0"); void(f[0] = "8"); alert(f[0]);

"8" in Opera9 and FF2. "0" in Saf3.

In Opera, the object doesn't have numeric properties, and only appears to have special [[Get]]:-

javascript:alert("0" in Object("foo"));

javascript:alert(Object("foo")[0]);

Op9 - false and "f" FF2 - true and "f"

Mozilla has the properties on the object and Opera doesn't.

(This explains why Object("foo").hasOwnProperty(0) is false in Opera.)

The reason for the way WebKit does things, for what it's worth, is because index properties of the string are checked first before normal properties (because they can't be overwritten), so "abc"[1] can be as fast as an array access instead of going at the speed of normal property access.

So the [[Put]] method on a String instance is different in Webkit.

Garrett

# Maciej Stachowiak (17 years ago)

On Jun 25, 2008, at 4:00 PM, Garrett Smith wrote:

In spec terms, WebKit's behavior can be explained in terms of strings having additional DontDelete ReadOnly properties.

Let me get this straight:

Webkit's behavior can be explained in terms of String objects having additional properties with numeric names and the attributes {DontDelete ReadOnly}

Is that what you meant?

Yes.

The Mozilla behavior can be explained as strings having those same
additional properties, but they are not ReadOnly. In both cases, index
properties past the last character do not exist ahead of time.

My observations indicate otherwise. Webkit does not appear to create additional properties to String objects.

javascript:alert(Object("foo").hasOwnProperty(0));

FF2 - true Sf3 - false Op9 - false

Where does the "0" property exist, Maciej? Is this bug related to hasOwnProperty?

I just tried this in Safari 3.1 and Safari alerted true. The same
happens in WebKit trunk. If it alerted false I would say that is a bug.

It appears to me that Mozilla and Opera and Webkit all implement a specialized [[Get]], where Opera and Mozilla do:

  1. Look for property P on String object.
  2. Look for String instance charAt( P )
  3. Look in String prototype.

Webkit does:-

  1. Look for String instance charAt( P )
  2. Call the [[Get]] method on S with argument P.

You could model it in many ways. I have not looked at Mozilla's or
Opera's actual implementations. What I am saying is that Safari/WebKit
tries to publicly present the logical model that these are ReadOnly
DontDelete properties. How it's actually implemented isn't really
relevant. In WebKit's implementation we implement all sorts of JS
properties internally in ways other than putting them in the generic
object property map.

It is true that in spec-speak you could define it as special [[Get]]
and [[Put]] behavior (and other operations like [[Delete]] and
[[HasOwnProperty]]) instead of special properties.
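The "special properties" reading can be modeled on a plain object, as a sketch. ES3's ReadOnly and DontDelete attributes correspond to `writable: false` and `configurable: false` in the later ES5 vocabulary. `modelIndexProperties` is a hypothetical name, and marking the properties non-enumerable is an assumption that follows Allen's proposal rather than anything stated about WebKit.

```javascript
// A model of index properties as ReadOnly DontDelete data properties.
function modelIndexProperties(value) {
  var obj = {};
  for (var i = 0; i < value.length; i++) {
    Object.defineProperty(obj, String(i), {
      value: value.charAt(i),
      writable: false,     // ReadOnly
      configurable: false, // DontDelete
      enumerable: false    // assumption: follows the non-enumerable proposal
    });
  }
  return obj;
}

var modeled = modelIndexProperties("abc");
console.log(Reflect.set(modeled, "1", "x"));                     // false
console.log(Reflect.deleteProperty(modeled, "1"));               // false
console.log(modeled[1]);                                         // "b"
console.log(Object.prototype.hasOwnProperty.call(modeled, "1")); // true
console.log(Reflect.set(modeled, "3", "past the end"));          // true
```

The last line shows that, in this model, index properties past the last character do not exist ahead of time and can still be added.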

javascript:var f = Object("1234567");void(String.prototype[0] = "0"); void(f[0] = "8"); alert(f[0]);

"8" in Opera9 and FF2. "0" in Saf3.

In Opera, the object doesn't have numeric properties, and only appears to have special [[Get]]:-

javascript:alert("0" in Object("foo"));

javascript:alert(Object("foo")[0]);

Op9 - false and "f" FF2 - true and "f"

Mozilla has the properties on the object and Opera doesn't.

(This explains why Object("foo").hasOwnProperty(0) is false in
Opera.)

The reason for the way WebKit does things, for what it's worth, is
because index properties of the string are checked first before normal
properties (because they can't be overwritten), so "abc"[1] can be as fast as
an array access instead of going at the speed of normal property access.

So the [[Put]] method on a String instance is different in Webkit.

What I am talking about above is the equivalent of the spec's [[Get]],
not [[Put]]. The specialization I describe is for performance, and
behaviorally transparent. However, our JavaScript implementation
doesn't have things that correspond exactly to the spec's [[Get]] and
[[Put]] formalisms.

, Maciej

# Maciej Stachowiak (17 years ago)

On Jun 25, 2008, at 2:33 PM, Garrett Smith wrote:

On Wed, Jun 25, 2008 at 1:52 PM, Maciej Stachowiak <mjs at apple.com>
wrote:

I have not seen any reports of such problems. If it were common to
put random numeric properties on String objects, I expect we would have
had a bug report by now.

Why?

What I meant is this:

  1. When Safari/WebKit/JavaScriptCore diverges from other browsers in
    JavaScript behavior, in ways that Web content depends on, we have
    historically gotten bug reports even when the issue is very obscure.
    See my earlier comments about things like function declarations in
    statement position for examples.

  2. We have not gotten bug reports that involve a site breaking because
    it set a low numeric property on a String object, and did not get the
    expected value back. At least, none have been found to have this as
    the cause. In other words, we have not seen cases like this:

var s = new String("abc"); s[0] = "expected"; if (s[0] != "expected") alert("EPIC FAIL");

  3. Therefore, I think it is unlikely that a lot of public Web content
    depends on being able to do this. If this were at all common, odds are
    that we would have heard about it by now. As Brendan suggests,
    deploying the behavior in beta versions of other browsers would give
    us more data points.

Are there many Webkit-only applications? Do these applications take advantage of string indexing via property access?

I do not think the existence of WebKit-only applications is relevant.
There are in fact a fair number (for example Dashboard widgets and
iPhone-specific Web apps), but they do not tell us anything about
whether public Web content at large depends on the behavior of
allowing any numeric property of a String object to be successfully
assigned. (I do not think any of this content depends on the WebKit
behavior either).

, Maciej