Unique Public Symbols as Strings

# Erik Arvidsson (12 years ago)

gist.github.com/arv/0bbb184710016e00d56c

The main goal of this proposal is to let us postpone the discussion about private state until ES7, making sure that we solve the main use cases.

# Axel Rauschmayer (12 years ago)

More of an aside: I think it would help if we had a list of what people actually want from privacy.

I want:

  1. Avoiding name clashes

  2. Indicate that a property is not part of the public API of an object (along with support from an IDE and a reflective API)

Other people seem to want:

  3. Completely protecting data from external access.

Currently, symbols take care of #1. They don’t really take care of #2, because we probably want some symbols to be part of the public API of an object.

Another possibility to support #2 (but not #1): a naming convention for properties.

#3 seems to be well covered by either closures or weak maps.
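To make #1 and #3 concrete, here is a minimal sketch. It assumes the `Symbol` and `WeakMap` APIs as they later shipped in ES2015, which may differ from the drafts under discussion; `Account`, `balances`, `libAId` and `libBId` are hypothetical names used only for illustration.

```javascript
// Sketch only: assumes Symbol and WeakMap as they later shipped in ES2015.

// #1 Avoiding name clashes: two libraries can both pick the description "id"
// without colliding, because every Symbol() call yields a distinct key.
const libAId = Symbol("id");
const libBId = Symbol("id");
const shared = {};
shared[libAId] = "set by library A";
shared[libBId] = "set by library B";

// #3 Keeping data away from ordinary external access with a WeakMap.
const balances = new WeakMap();
class Account {
  constructor(initial) {
    balances.set(this, initial); // stored off-object, keyed by instance
  }
  deposit(amount) {
    balances.set(this, balances.get(this) + amount);
    return balances.get(this);
  }
}

const account = new Account(10);
console.log(shared[libAId], "/", shared[libBId]);
console.log(account.deposit(5));          // 15
console.log(Object.keys(account).length); // 0: nothing is stored on the instance
```

The WeakMap version only protects the data from code that cannot reach `balances`, so in practice it would sit inside a module or closure.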

# Brendan Eich (12 years ago)

Axel Rauschmayer wrote:

More of an aside: I think it would help if we had a list of what people actually want from privacy.

I want:

  1. Avoiding name clashes

  2. Indicate that a property is not part of the public API of an object (along with support from an IDE and a reflective API)

Other people seem to want:

  3. Completely protecting data from external access.

3's "external access" must include reflection (proxies).

You left out

  4. Convenient syntax, e.g. obj@priv from relationships, and not obj[privSymValue].

Currently, symbols take care of #1. They don’t really take care of #2, because we probably want some symbols to be part of the public API of an object.

We definitely do: @@iterator in the ES6 draft, for example.

Another possibility to support #2 (but not #1): a naming convention for properties.

We aren't doing more dunder names.

#3 seems to be well covered by either closures or weak maps.

No, see 4. Also consider that we must not require a weak map: per relationships, weak maps are mutable and therefore side channels. For optimization purposes we don't even want to create a weak map where one could suffice, when the private fields are class-affine and live longer than all instances. Symbols (unique names that can't be forged and that do not leak via reflection) work much better under the hood there.

Arv's proposal is about symbols, though, not private state in classes. Note all its class examples use symbols that are unique and quite public. The idea is to clear the decks for private state as a separate proposal (see again strawman:relationships) from symbols, and do away with private vs. unique symbols.

# Brandon Benvie (12 years ago)

On 7/25/2013 1:31 PM, Erik Arvidsson wrote:

gist.github.com/arv/0bbb184710016e00d56c

The main goal of this proposal is to let us postpone the discussion about private state until ES7, making sure that we solve the main use cases.

Having those properties be enumerable is very undesirable, kind of a deal breaker in my opinion. If there was a way to make them non-enumerable, then I could see using these instead of Symbols (though Symbols would still be preferable).
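For reference, ES5 already allows a string-keyed property to be made non-enumerable, at the cost of extra ceremony. A minimal sketch follows; the GUID-like key and the `widget` object are invented for illustration and are not taken from the gist.

```javascript
// Hypothetical unique string key; the GUID-like value is made up for illustration.
const KEY = "cache-3f2a9c1e-7b4d-4e0a-9a57-0c1d2e3f4a5b";

const widget = { label: "ok" };
Object.defineProperty(widget, KEY, {
  value: new Map(),
  enumerable: false, // hide from for-in and Object.keys
  writable: true,
  configurable: true,
});

const seen = [];
for (const name in widget) seen.push(name);

console.log(seen);                               // [ 'label' ]
console.log(Object.keys(widget));                // [ 'label' ]
console.log(Object.getOwnPropertyNames(widget)); // still lists KEY after 'label'
```

A plain assignment such as `widget[KEY] = value` would create an enumerable property instead, which is the default behaviour objected to above.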

# David Bruant (12 years ago)

Le 25/07/2013 22:31, Erik Arvidsson a écrit :

gist.github.com/arv/0bbb184710016e00d56c

The main goal of this proposal is to let us postpone the discussion about private state until ES7, making sure that we solve the main use cases.

I'm not sure I understand how this proposal lets TC39 postpone the private state discussion. I may be lacking some context?

This is something that authors can do today. If that's what they wanted, they'd be doing it already, wouldn't they?

# Brandon Benvie (12 years ago)

On 7/25/2013 1:31 PM, Erik Arvidsson wrote:

gist.github.com/arv/0bbb184710016e00d56c

The main goal of this proposal is to let us postpone the discussion about private state until ES7, making sure that we solve the main use cases.

Differences from Symbols:

  • enumerable
  • visible to Object.getOwnPropertyNames
  • no way to differentiate a unique string from any other string
  • no easy debug representation (since you can't differentiate from a string)

I'm curious why the desire to not have unique symbols? I was under the impression that private Symbols or relationships or whatever could be punted without affecting unique Symbols, which serve a different use case (name spacing and encapsulation, not security). This proposal covers name spacing while ignoring encapsulation.
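The listed differences can be observed directly. This sketch assumes the symbol semantics that eventually shipped in ES2015 (the draft at the time of this thread differed in details), and the key value is invented for illustration.

```javascript
// Sketch: compares a unique-string key with a symbol key.
const stringKey = "urn:example:9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d";
const symbolKey = Symbol("example");

const target = { plain: 1 };
target[stringKey] = 2;
target[symbolKey] = 3;

const forIn = [];
for (const k in target) forIn.push(k);

console.log(forIn.includes(stringKey));                              // true: enumerable
console.log(Object.getOwnPropertyNames(target).includes(stringKey)); // true
console.log(Object.getOwnPropertyNames(target).length);              // 2: the symbol key is absent
console.log(typeof stringKey, typeof symbolKey);                     // string symbol
console.log(String(symbolKey));                                      // Symbol(example)
```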

# Erik Arvidsson (12 years ago)

It is not clear what private state will look like. The relationship work Mark has done looks promising but it is not yet clear that we need both private state and unique symbols.

The intent of this proposal was to open the door for alternatives that can be used in ES6, and then make sure we get the whole thing right in ES7 (which should follow shortly after ES6).

# Brandon Benvie (12 years ago)

On 7/25/2013 5:31 PM, Erik Arvidsson wrote:

It is not clear what private state will look like. The relationship work Mark has done looks promising but it is not yet clear that we need both private state and unique symbols.

Right, what I'm saying is that unique Symbols and private state are orthogonal and punting on private state need not prevent unique Symbols.

# Erik Arvidsson (12 years ago)

On Jul 25, 2013 3:30 PM, "Brandon Benvie" <bbenvie at mozilla.com> wrote:

On 7/25/2013 1:31 PM, Erik Arvidsson wrote:

gist.github.com/arv/0bbb184710016e00d56c

The main goal of this proposal is to let us postpone the discussion about private state until ES7, making sure that we solve the main use cases.

Differences from Symbols:

  • enumerable

This is the only one I feel is relevant.

  • visible to Object.getOwnPropertyNames

Symbols are also visible (using getOwnPropertyKeys)

  • no way to differentiate a unique string from any other string

There is no difference.

You could use a RegExp if you care (you should not care).

  • no easy debug representation (since you can't differentiate from a string)

Just print the string. Using symbols you would need to use symbol.name.

For debuggers and dev tools, they can always special case. But that applies to both.

# Kevin Smith (12 years ago)

The defining feature of symbols is that they are unguessable, but this feature is useless in the context of "unique symbols", since one can always just get to the symbol by property inspection. As such, symbols appear to provide no benefit over strings for this use case.
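A short sketch of what "property inspection" means here, assuming the reflection API that shipped in ES2015 as `Object.getOwnPropertySymbols` (the draft at the time used a different name); `holder` and `hidden` are illustrative names.

```javascript
const hidden = Symbol("hidden");
const holder = { visible: 1, [hidden]: 42 };

// Ordinary enumeration skips the symbol...
console.log(Object.keys(holder)); // [ 'visible' ]

// ...but reflection hands it to anyone holding the object.
const [recovered] = Object.getOwnPropertySymbols(holder);
console.log(recovered === hidden); // true
console.log(holder[recovered]);    // 42
```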

In fact, unique strings provide a useful feature that symbols do not: they need not be associated with any runtime execution context. We can define a well-known unique string and share it across realm boundaries. We can also share these keys across version boundaries, such that many versions of a library can recognize the same key.

# Mark S. Miller (12 years ago)

And between machines! Unique-ish strings as method names work just fine in remote messaging systems without any new special case. Sending unique symbols as method names would require special delicate logic at each end, in order to get the correspondence right. For user-defined (i.e., non-platform-defined) method names, it is not clear that there even is a way to handle this correspondence reliably.
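As a rough illustration of the serialization point, the sketch below uses JSON as a stand-in for a remote messaging format. That choice is an assumption, not something stated above, and the method name is invented.

```javascript
const METHOD = "example.org/rpc/ping-5d0c1f"; // hypothetical unique-ish string
const localSym = Symbol("ping");

const message = { [METHOD]: ["arg"], [localSym]: ["arg"] };

// JSON.stringify skips symbol-keyed properties entirely.
const wire = JSON.stringify(message);
const received = JSON.parse(wire);

console.log(Object.keys(received));                         // only the string key survives
console.log(Object.getOwnPropertySymbols(received).length); // 0
```

Carrying the symbol-keyed entry across would need an extra, agreed-upon encoding on both ends, which is the "special delicate logic" referred to above.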

# Brandon Benvie (12 years ago)

On Jul 29, 2013, at 5:26 AM, Kevin Smith <zenparsing at gmail.com> wrote:

The defining feature of symbols is that they are unguessable, but this feature is useless in the context of "unique symbols", since one can always just get to the symbol by property inspection. As such, symbols appear to provide no benefit over strings for this use case.

The defining feature of symbols is that they are default-encapsulated. As in:

{ [sym]: value }
// or
obj[sym] = value

does not make an object with properties that leak out to any other non-meta-level code. That is, all the commonly used MOP interactions are oblivious to its existence. If you want to use a very specialized function that does expose it, then you're almost certainly doing it for serious introspective reasons, like debugging or something.

{ [stringSym]: value }
// or
obj[stringSym] = value

leaks to Object.keys and for-in, which are very commonly used operations and will get used in conjunction with getting and setting. That destroys the encapsulation component of Symbols, which I argue is one of the benefits (if not the primary benefit) of Symbols.

In fact, unique strings provide a useful feature that symbols do not: they need not be associated with any runtime execution context. We can define a well-known unique string and share it across realm boundaries. We can also share these keys across version boundaries, such that many versions of a library can recognize the same key.

If this benefit were desired then it's possible to make it so there's a way to construct a symbol using a given GUID such that the same Symbol would be returned no matter where you asked for it from. With caveats, it could be implemented in library code (though it would be preferential for it to be specified and commonly available).
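A library-level version might look like the sketch below. `fromGUID` and `registry` are hypothetical names, and the caveat is that a plain `Map` only deduplicates within one copy of the library in one realm. ES2015 later specified a global symbol registry along these lines as `Symbol.for`.

```javascript
// Hypothetical library-level registry; `fromGUID` is not a real built-in.
const registry = new Map();

function fromGUID(guid) {
  if (typeof guid !== "string") {
    throw new TypeError("guid must be a string");
  }
  if (!registry.has(guid)) {
    registry.set(guid, Symbol(guid));
  }
  return registry.get(guid);
}

const a = fromGUID("6f9619ff-8b86-d011-b42d-00c04fc964ff");
const b = fromGUID("6f9619ff-8b86-d011-b42d-00c04fc964ff");
console.log(a === b); // true

// The registry that ES2015 eventually specified behaves the same way.
console.log(Symbol.for("app.guid") === Symbol.for("app.guid")); // true
```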

Symbol.fromGUID(...)

# Kevin Smith (12 years ago)

{ [stringSym]: value }
// or
obj[stringSym] = value

leaks to Object.keys and for-in, which are very commonly used operations and will get used in conjunction with getting and setting. That destroys the encapsulation component of Symbols, which I argue is one of the benefits (if not the primary benefit) of Symbols.

Are you arguing that non-enumerability is a useful encapsulation mechanism? I'm skeptical. The main use case that I've seen for for-in and Object.keys is using a plain object as a map. This use case is addressed more cleanly now with Map and for-of, is it not?

Furthermore, developers today generally use underscores to "hide" internal properties and methods. I'm not aware that it is a common practice to bother with making these properties non-enumerable. Is it?

# Brandon Benvie (12 years ago)

On 7/29/2013 10:03 AM, Kevin Smith wrote:

Are you arguing that non-enumerability is a useful encapsulation mechanism?

I believe enumerability has a propensity to get in the way of encapsulation.

The main use case that I've seen for for-in and Object.keys is using a plain object as a map. This use case is addressed more cleanly now with Map and for-of, is it not?

It is safer to use a Map but it is not more convenient. I expect to continue seeing people using object literals and mixing properties for things like option objects for a long time to come.

Furthermore, developers today generally use underscores to "hide" internal properties and methods. I'm not aware that it is a common practice to bother with making these properties non-enumerable. Is it?

Right, and using GUIDs is just slightly better than this. The only thing it does is provide uniqueness, whose value is mainly in inheritance to prevent collisions in private fields. My experience is that inheritance is usually done shallowly and these collisions are rare and obvious when they happen.

# Brendan Eich (12 years ago)

Brandon Benvie wrote:

On 7/29/2013 10:03 AM, Kevin Smith wrote:

Are you arguing that non-enumerability is a useful encapsulation mechanism?

I believe enumerability has a propensity to get in the way of encapsulation.

Yes, because (among others) for-in loops that don't expect some GUID-named thing will be unpleasantly surprised.

The main use case that I've seen for for-in and Object.keys is using a plain object as a map. This use case is addressed more cleanly now with Map and for-of, is it not?

It is safer to use a Map but it is not more convenient. I expect to continue seeing people using object literals and mixing properties for things like option objects for a long time to come.

Absolutely, and there's nothing wrong with that :-P. We shouldn't insist on Map literals without lots of evidence of object literals gone wrong. No such evidence in front of us, as far as I can tell.

Furthermore, developers today generally use underscores to "hide" internal properties and methods. I'm not aware that it is a common practice to bother with making these properties non-enumerable. Is it?

Right, and using GUIDs is just slightly better than this. The only thing it does is provide uniqueness, whose value is mainly in inheritance to prevent collisions in private fields. My experience is that inheritance is usually done shallowly and these collisions are rare and obvious when they happen.

Agreed.

GUIDs are strangely attractive but ugly. If we had a way to bind pretty names to them, in the proposed extension, then it wouldn't just be "up to debuggers". People console.log all the time.

# Juan Ignacio Dopazo (12 years ago)

2013/7/29 Brandon Benvie <bbenvie at mozilla.com>

My experience is that inheritance is usually done shallowly and these collisions are rare and obvious when they happen.

It is not obvious when using mixins in UI code. Method names like "_onWindowResize" are common and can easily lead to conflicts.
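A minimal sketch of that kind of conflict, using two made-up mixins (`Resizable` and `Sticky` are hypothetical, and `Object.assign` stands in for whatever mixin helper a UI library provides):

```javascript
// Two hypothetical UI mixins that both use the conventional "private" name.
const Resizable = {
  _onWindowResize() { return "resizable"; },
};
const Sticky = {
  _onWindowResize() { return "sticky"; },
};

const view = Object.assign({}, Resizable, Sticky);
console.log(view._onWindowResize()); // "sticky": Resizable's handler was silently replaced

// With one unique key per mixin, both handlers survive.
const resizableHandler = Symbol("onWindowResize");
const stickyHandler = Symbol("onWindowResize");
const ResizableSym = { [resizableHandler]() { return "resizable"; } };
const StickySym = { [stickyHandler]() { return "sticky"; } };

const view2 = Object.assign({}, ResizableSym, StickySym);
console.log(view2[resizableHandler](), view2[stickyHandler]()); // resizable sticky
```

Symbols are used for the second half, but unique strings would avoid the collision equally well, since only uniqueness matters here.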

# Kevin Smith (12 years ago)

It is safer to use a Map but it is not more convenient. I expect to continue seeing people using object literals and mixing properties for things like option objects for a long time to come.

Sure, but with objects-as-maps, you're not interested in "encapsulating" the keys. It's moot.

Right, and using GUIDs is just slightly better than this. The only thing it does is provide uniqueness, whose value is mainly in inheritance to prevent collisions in private fields. My experience is that inheritance is usually done shallowly and these collisions are rare and obvious when they happen.

I think you're mischaracterizing the point of uniqueness. The value of uniqueness lies in the fact that you can design protocols without having to globally coordinate on property or method names. (E.g. iterator)

# Brandon Benvie (12 years ago)

On 7/29/2013 11:33 AM, Kevin Smith wrote:

The value of uniqueness lies in the fact that you can design protocols without having to globally coordinate on property or method names. (E.g. iterator)

I think {keys, maps, values} (and the pivot to using Symbols and functions for them) demonstrates that a protocol is simply an agreed upon way of communicating; there's no requirement for uniqueness. Uniqueness has value in that it prevents accidental collisions. Symbols provide uniqueness because they are compared by identity, GUIDs provide uniqueness by having a value that is unlikely to be collided with.
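To show what collision avoidance buys a protocol, here is a sketch using `Symbol.iterator` as it eventually shipped in ES2015; `playlist` is an invented object whose own data happens to use the name "iterator".

```javascript
// A collection whose own data model already has a property called "iterator".
const playlist = {
  iterator: "this string is ordinary user data",
  tracks: ["a", "b", "c"],
  // The protocol hook lives under a key no user data can accidentally occupy.
  [Symbol.iterator]() {
    let i = 0;
    const tracks = this.tracks;
    return {
      next: () =>
        i < tracks.length
          ? { value: tracks[i++], done: false }
          : { value: undefined, done: true },
    };
  },
};

const collected = [];
for (const track of playlist) collected.push(track);
console.log(collected);         // [ 'a', 'b', 'c' ]
console.log(playlist.iterator); // the user data is untouched
```

Had the hook been the plain string "iterator", the user data above would have been mistaken for it.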

# Brendan Eich (12 years ago)

Brandon Benvie wrote:

On 7/29/2013 11:33 AM, Kevin Smith wrote:

The value of uniqueness lies in the fact that you can design protocols without having to globally coordinate on property or method names. (E.g. iterator)

I think {keys, maps, values} (and the pivot to using Symbols and functions for them)

BTW that was just one thought (from me), and TC39 went another way to avoid the 'values'/with incompatibility found via Ext.js. (See the meeting notes for the first day, about the @unscopeable list.)

demonstrates that a protocol is simply an agreed upon way of communicating; there's no requirement for uniqueness. Uniqueness has value in that it prevents accidental collisions. Symbols provide uniqueness because they are compared by identity, GUIDs provide uniqueness by having a value that is unlikely to be collided with.

Agreed. There have been GUID collisions in the field with things like COM IIDs. Rare but possible (spoofed MAC address?). This is not a fatal flaw, especially if TC39 itself specifies well-known GUIDs, e.g., for 'iterator'. But the lack of a human-readable name is a real drawback IMHO.

# Brandon Benvie (12 years ago)

On 7/29/2013 1:09 PM, Brendan Eich wrote:

Brandon Benvie wrote:

On 7/29/2013 11:33 AM, Kevin Smith wrote:

The value of uniqueness lies in the fact that you can design protocols without having to globally coordinate on property or method names. (E.g. iterator)

I think {keys, maps, values} (and the pivot to using Symbols and functions for them)

BTW that was just one thought (from me), and TC39 went another way to avoid the 'values'/with incompatibility found via Ext.js. (See the meeting notes for the first day, about the @unscopeable list.)

Ah right, forgot about that. Another example (in the wild) is SpiderMonkey's use of "iterator" for the iteration protocol.

# Brendan Eich (12 years ago)

Brandon Benvie wrote:

Ah right, forgot about that. Another example (in the wild) is SpiderMonkey's use of "iterator" for the iteration protocol.

In ES4 days it was __iterator__, after dunder-iter in Python.

More recently, Jason implemented 'iterator' for two reasons, I think: 1, lack of symbol spec or implementation as prerequisite; 2, belief that a public name was better. Jason argued that case here, but I don't think he prevailed (Dean Landolt disagreed).

We aren't putting this out "in the wild" for use by anyone, it's just a temporary state, which depends on symbol spec resolution for ES6 -- this thread's topic!

# Brandon Benvie (12 years ago)

Right, I should have been explicit. It's definitely a temporary measure. Similarly, V8 does some non-spec stuff to make for-of work (I think it just assumes the object is already an iterator). I just meant it as an example of a protocol in action (even if as a temporary stopgap).

# Dean Landolt (12 years ago)

On Mon, Jul 29, 2013 at 4:29 PM, Brendan Eich <brendan at mozilla.com> wrote:

More recently, Jason implemented 'iterator' for two reasons, I think: 1, lack of symbol spec or implementation as prerequisite; 2, belief that a public name was better. Jason argued that case here, but I don't think he prevailed (Dean Landolt disagreed).

FWIW Jason convinced me in the end -- I was subtly misinterpreting the spec. I still believe symbols (or something like them) are really important, just not necessarily for iterators.

I'd also like to echo the sentiment in favor of private symbols. Unique symbols really don't offer much over GUIDs, and don't make a whole lot of sense in a world without private symbols. And in a world with private symbols unique symbols aren't strictly necessary.

I don't fully grok the relationships strawman yet but it looks really promising. I wonder what a maximally minimal version of it might look like -- if it could be stripped down enough to just accommodate the needs of the ES6 spec while remaining palatable and leaving the door to private symbols open? Anything to avoid GUIDs. I'd bet most everyone would concede they're a smell, an es-regret waiting to happen :)

# Brendan Eich (12 years ago)

Dean Landolt wrote:

FWIW Jason convinced me in the end -- I was subtly misinterpreting the spec.

Oh, my apologies!

Allen, what say you? We should resolve this ASAP since engines are implementing ES6 for-of and iterators. Cc'ing Andy too.

I still believe symbols (or something like them) are really important, just not necessarily for iterators.

Agreed.

I'd also like to echo the sentiment in favor of private symbols. Unique symbols really don't offer much over GUIDs, and don't make a whole lot of sense in a world without private symbols. And in a world with private symbols unique symbols aren't strictly necessary.

Agreed again. Great to see convergence in people's thinking.

I don't fully grok the relationships strawman yet but it looks really promising. I wonder what a /maximally minimal/ version of it might look like -- if it could be stripped down enough to just accommodate the needs of the ES6 spec while remaining palatable and leaving the door to private symbols open? Anything to avoid GUIDs. I'd bet most everyone would concede they're a smell, an es-regret waiting to happen :)

Cc'ing Mark. If I understand correctly, the issue with relationships is the difference between weakmap as R and symbol as R. A weakmap is a mutable object so a side channel, a symbol is immutable (all the way down, which is not far because symbols are shallow).

# Mark S. Miller (12 years ago)

Yes. There are two opposite use cases. Both are necessary.

Use case for something like unique symbols / public symbols / guids / funny-looking strings:

Given that base@r = v succeeds at mutating something, we account for the mutable state as belonging to base. This allows r to be transitively immutable, and so sharable between subsystems that should not be able to communicate. All the unique symbols mentioned in the ES6 spec itself are of this form. Clearly, everyone (even across realms) must mean the same thing by @iterator, and so @iterator should not be mutable.

When doing the operation across a membrane, where let's say the originals of all three objects are on the other side of the membrane, it should be the proxy for the base object which traps the operation. Ideally, r should not be proxied, but should pass through the membrane in both directions untranslated.

By trapping at the base, a base proxy which did not know r is able to obtain r in its trap handler.

Use case for something like private symbols / weak maps / fields:

Given that base@r = v succeeds at mutating something, we account for the mutable state as belonging to r. This allows base to be transitively immutable, and so sharable between subsystems that should not be able to communicate. (Note that, although we account for the mutable state as belonging to r in the semantics, the implementation should always store the actual mutable state in the storage record implementing the base object, just as it would do for an internal property.)

When doing the operation across a membrane, where let's say the originals of all three objects are on the other side of the membrane, it should be the proxy for the r object which traps the operation.

By trapping at r, an r proxy which did not know the base is able to obtain the base in its trap handler.
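The two accounting models can be approximated with the features that eventually shipped in ES2015. This is only a sketch: the `base@r` syntax from the relationships strawman never shipped, so bracket access and `WeakMap` methods stand in for it, and all variable names are illustrative.

```javascript
// "Symbol as r": the state lives in base, so r can be immutable and widely shared.
const rSymbol = Symbol("r");
const base1 = {};
base1[rSymbol] = "v";         // mutates base1
console.log(base1[rSymbol]);  // v

// "WeakMap as r": the state is accounted to r, so base can be frozen.
const rMap = new WeakMap();
const base2 = Object.freeze({});
rMap.set(base2, "v");         // mutates rMap, not base2
console.log(rMap.get(base2)); // v

// A frozen base rejects the symbol-style write.
console.log(Reflect.set(base2, rSymbol, "v")); // false
console.log(rSymbol in base2);                 // false
```

A `WeakMap` is a mutable object, which is why sharing it opens a side channel, whereas the symbol carries no state of its own.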