Agreeing on user-defined unique symbols?
We have exactly the same problem with inter-realm instanceof.
The module system is the obvious solution for system-defined unique symbols. I've been assuming it could be hacked (one way or another) into giving us a registry where we could map string keys to symbols across realms. I get that this hack does nothing but bifurcate the namespace into plain strings and symbol strings. But for some cases this is a fine solution -- seems to work fine for Ruby and Smalltalk.
Maybe this would even merit syntax or some other language level treatment. If user-defined symbols come to pass (GUID-backed or otherwise) I'm sure at least one version of this kind of string-backed registry hack will pop up. Is it better to get ahead of it and limit it to only one by prescribing a solution (in spec. or elsewhere)?
One way could be to have a shared module with the symbol that both w1.1 and w1.2 use.
// w-symbols.js
export let foo = Symbol();
// w-1.1.js
import {foo} from 'w-symbols';
...
// w-1.2.js
import {foo} from 'w-symbols';
...
Other options include storing the symbol in some kind of global registry (which the module registry is doing above).
Where does the w-symbols module come from? Remember that initially w1.1 is loaded into realm A without knowledge of realm B and vice versa. Let's also say, to emphasize the point, that they're also under disjoint module loaders, so at the time they're initialized they can't use a common module loader as a rendezvous point for new definitions.
Other options include storing the symbol in some kind of global registry (which the module registry is doing above).
I know of only two such notions of a global registry:
- One that would also serve as a global communications channel, which is therefore disqualified
- The equivalent of an interning table, as in the symbol tables of Smalltalk that Dean mentions.
An interning table from JS strings to unique symbols would have the property that
intern(str1) === intern(str2) iff str1 === str2
The cool thing about such an interning table is that it is global mutable state that does not provide a global communications channel.
To avoid accidental collision on the interned symbols, you must avoid accidental collision on the strings used as keys in this registration table. This demands exactly as much collision resistance in string choices as using the strings directly. And therefore also demands strings which are just as ugly.
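A minimal sketch of the interning property above, assuming a plain `Map` as the table. The names `intern` and `table` are illustrative only, not a proposed API. The `Symbol.for` function that later shipped in ES2015 behaves the same way for string keys.

```javascript
// Hypothetical interning table: string key -> unique symbol.
const table = new Map();

function intern(str) {
  // Only the first lookup of a given string creates a symbol;
  // every later lookup of the same string returns that same symbol.
  if (!table.has(str)) {
    table.set(str, Symbol(str));
  }
  return table.get(str);
}

// intern(str1) === intern(str2) iff str1 === str2
console.log(intern("w/foo") === intern("w/foo")); // true
console.log(intern("w/foo") === intern("w/bar")); // false
```

A caller cannot tell whether someone else already interned a given key, which is why the table is shared mutable state without being a communications channel. The collision question is unchanged: two libraries that both pick "w/foo" still get the same symbol.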
On Wed, Jul 31, 2013 at 9:27 AM, Mark Miller <erights at gmail.com> wrote:
To avoid accidental collision on the interned symbols, you must avoid accidental collision on the strings used as keys in this registration table. This demands exactly as much collision resistance in string choices as using the strings directly. And therefore also demands strings which are just as ugly.
Theoretically, yes. I think that in practice, having the symbol namespace be separate from the string namespace is sufficient to avoid the vast majority of conflict issues. Colliding between symbols is much more difficult to do, and it's pretty easy to inject just a little bit of entropy into the string via a prefix or something.
This is especially nice if we make language-defined symbols "fake it", more or less by living in a third undetectable namespace, so we avoid even the possibility of a language-defined symbol colliding.
We'd then have three property namespaces:
- Strings, which are convenient and easy to use, but also used by everything, so collision potential is moderate.
- Symbol strings, which are less convenient, but also less necessary, so there'll be fewer things to collide with (and you can give one slightly long prefixed name for the actual symbol string, and a much shorter name to the (global?) variable holding the symbol object).
- UA-defined symbol strings, which are identical to normal symbol strings but live in a third namespace to avoid any possibility of confusion.
To avoid accidental collision on the interned symbols, you must avoid accidental collision on the strings used as keys in this registration table. This demands exactly as much collision resistance in string choices as using the strings directly. And therefore also demands strings which are just as ugly.
I agree. A unique string to symbol registry would be a useless indirection, as there is no information or ability stored in the symbol that is not already inherent in the string used to fetch the symbol.
The solution to your version+realm problem of this post is trivial with string names. Just use a well-known unique string (uuid or otherwise).
A unique string (UUID or otherwise) suffers the same problem if generated in two different realms unaware of each other.
In any case, as mentioned before, this is very similar to instanceof or isPrototypeOf/getPrototypeOf, and I don't see many concrete real-world problems except being able to serialize and deserialize @symbols in a unique way across realms (a symbol name within a specific module scope mapped as unique, or anything else that could work that I cannot think of right now ^_^)
Seems like we are yet again talking ourselves out of unique symbols
A unique string (UUID or otherwise) suffers the same problem if generated in two different realms unaware of each other.
I meant hard-coding such a well-known unique string into both versions of the library. That solves the problem.
On Wed, Jul 31, 2013 at 12:27 PM, Mark Miller <erights at gmail.com> wrote:
To avoid accidental collision on the interned symbols, you must avoid accidental collision on the strings used as keys in this registration table. This demands exactly as much collision resistance in string choices as using the strings directly. And therefore also demands strings which are just as ugly.
Not exactly -- it's a fresh namespace, so no legacy collisions are possible. Of course it immediately becomes a land grab, similar to npm package names -- or even better, identical to the global object today. This is a social problem, and the incentives to play fair have proven effective.
This registry would have the secondary effect of functioning as a universal concept registry, e.g. the IANA link relation types.
Andrea Giammarchi wrote:
A unique string (UUID or otherwise) suffers the same problem if generated in two different realms unaware of each other.
Mark's point about UUIDs generated by robust RBGs (Random Bit Generators) addresses the collision risk: it is astronomically small.
In any case, as mentioned before, this is very similar to instanceof or isPrototypeOf/getPrototypeOf, and I don't see many concrete real-world problems except being able to serialize and deserialize @symbols in a unique way across realms (a symbol name within a specific module scope mapped as unique, or anything else that could work that I cannot think of right now ^_^)
This serialization point is good, and gets to something I raised with Allen yesterday, which he originally stated: realms should be more isolated. We may want distributed (cross-machine) realms. IE and other browsers may already do cross-process window.open, returning a DCOM proxy or some such. SpiderMonkey in Firefox uses membranes across all realm/global boundaries, even same-origin.
Given this, I think instanceof (or typeof with an extension that must be realm-specific because implemented by user-code, or else we have to reify the "world of realms" as discussed in the other thread) breaking cross-realm could be addressed by saying "proxy harder". IOW let's not complicate JS with realms, worlds, truly global string to symbol registries, etc. etc.
Let's get back to the simplicity of same-realm as the normal case, with few-and-legacy exceptions. Can we do it?
For what it's worth, this worked quite well in a single realm, with no enumerability though: gist.github.com/WebReflection/5238782#file-gistfile1-js-L1
// example
var sym = new Symbol;
var o = {};
o[sym] = 123;
console.log(o[sym]);
On Wed, Jul 31, 2013 at 1:50 PM, Kevin Smith <zenparsing at gmail.com> wrote:
I meant hard-coding such a well-known unique string into both versions of the library. That solves the problem.
And the next step? Someone will hash a well-known string to get their well-known "unique symbol" and write a blog post calling it a best practice :)
org.ecmascript.es6.builtins.iterator?
You forgot the smiley, or: nooooooooooo!!!!!!
On Wed, Jul 31, 2013 at 10:56 AM, Brendan Eich <brendan at mozilla.com> wrote:
Domenic Denicola wrote:
org.ecmascript.es6.builtins.iterator?
You forgot the smiley, or: nooooooooooo!!!!!!
Which is why I (not in jest) suggested the third property namespace, for language-defined symbols. ^_^
Why a third namespace when there are the built-in modules, which function nicely as our registry?
On Wed, Jul 31, 2013 at 11:34 AM, Dean Landolt <dean at deanlandolt.com> wrote:
Why a third namespace when there are the built-in modules, which function nicely as our registry?
This suggestion was in the context of "symbol strings", which are just strings in a separate namespace, rather than the existing concept of symbols as a unique, empty, frozen object. With a symbol string, where you retrieve it from doesn't matter - equality is still based on the contents. If you want guaranteed non-conflict with user-space symbols, you need a third namespace.
You say either and I say either,
You say neither and I say neither
Either, either neither, neither
Let's call the whole thing off.
You like potato and I like potahto
You like tomato and I like tomahto
Potato, potahto, tomato, tomahto.
Let's call the whole thing off
-- Ira Gershwin
I feel like calling off symbols when we have string-keyed interning tables and three namespaces and bears, oh my.
You didn't quote enough of the song! We have to get married in the end eventually.
You sweet talker! But actually, you need Fred and Ginger dancing to get me to that church on time.
I'm going to concede you have a path-dependent point, but spend some time on my "Realm, schmealm!" thread in hope of a simpler future.
I'm going to concede you have a path-dependent point, but spend some time on my "Realm, schmealm!" thread in hope of a simpler future.
If something interesting comes out of there, great!
In the meantime, though the magic words "three namespaces" sound really complicated, it's not. ^_^ At least, it's less complicated than symbols - you could implement it by adding two bits to Strings, one to track whether they're a symbol string, and one to track whether they're UA-created (and thus unforgeable, because user code can't set that bit). Messing with properties then depends on string contents + the two bits, rather than just contents as today.
Or you could do a new object that's a wrapper around strings, if those two bits would be annoying to add to all strings. But still, it's a really simple thing, because there's no special lookup or registry or anything like "namespace" usually implies. They're just two more sets of strings prevented from colliding with each other or "normal" strings.
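A rough sketch of the "string contents + two bits" idea, using the wrapper-object variant. Everything here (`makeKey`, `slot`, `Store`, `UA_TOKEN`) is a hypothetical name for illustration. In a real engine the UA bit would be internal state; in this sketch a module-level token object stands in for it.

```javascript
// Stand-in for engine-internal authority: only code holding this
// exact object can set the UA bit. (Assumption for illustration.)
const UA_TOKEN = {};

// A property key is the string contents plus two bits.
function makeKey(name, { symbol = false, token = null } = {}) {
  const ua = token === UA_TOKEN; // any other token leaves the bit unset
  return Object.freeze({ name, symbol: symbol || ua, ua });
}

// Two keys name the same property only if contents AND both bits match.
function slot(key) {
  return `${key.ua ? 1 : 0}${key.symbol ? 1 : 0}:${key.name}`;
}

class Store {
  #props = new Map();
  set(key, value) { this.#props.set(slot(key), value); }
  get(key) { return this.#props.get(slot(key)); }
}

const o = new Store();
o.set(makeKey("iterator"), "plain string");
o.set(makeKey("iterator", { symbol: true }), "symbol string");
o.set(makeKey("iterator", { token: UA_TOKEN }), "UA-defined");
console.log(o.get(makeKey("iterator", { symbol: true }))); // "symbol string"
```

No lookup table or registry is involved: the three keys coexist on the same store because the bits take part in key equality.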
Also, note that the interning table that Mark mentions isn't required at all, unless you want the possibility of real symbols (and just allow collision in the "registry"). Even then, you can avoid this by producing either a "real symbol" (guaranteed unique) or a "symbol string", based on whether it's constructed directly or obtained from the interning process. The "registry" would in this scenario just be a sham, a convenient fiction we use to make people think of coordination.
I still see big usability problems with UUIDs, even if used to name non-enumerable properties. Tools help but the core language provides no sugar, salt, or paprika. Just a very sour/bitter hex-string...
I agree.
There are only three different approaches to collision avoidance that I'm aware of:
- "Noise": Use enough randomness such that the chance of collision is effectively zero for practical purposes. (ex. uuids)
- "The Registry": A central authority determines who can use each name. (ex. DNS)
- "Anarchy": Rely on participants to organically manage the namespace. (ex. the global object)

- Noise is not appropriate when the name is going to be handled by humans. At the very least, our property names will be handled by humans during the debugging process.
- DNS-based names are universally despised.
- Anarchy is not appropriate when the number of names which must be simultaneously distinguishable passes some threshold. The longer the names, the higher that threshold will be. In this case, the set of names which must be simultaneously distinguishable is the set of property names used within some loosely-defined runtime context.
Because we only need to worry about the set of property names used within a runtime context, and not the complete universe of property names, we might be able to get away with anarchy, so long as we choose sufficiently long property names. One possibility:
class C {
"sys/iterator"() { }
"foo/userDefined"() { }
}
In the future, syntactic sugar could be added for declaring lexically bound property name aliases. One possibility:
alias { "sys/iterator" as iterator, "foo/userDefined" as userDefined };
class C {
iterator() { }
userDefined() { }
}
Note that this approach allows multiple modules to recognize the same "strong" property name without requiring a dependency on a common module. In other words, modules can coordinate on strong property names without needing to be connected.
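Until any such sugar exists, the same effect can be approximated with ordinary constants and computed property names. This is only a sketch: the two "modules" below are hypothetical and are shown in one file for brevity, and `callIterator` is an illustrative helper.

```javascript
// --- module A (hypothetical) ---
// A local, short alias for the long "strong" property name.
const iterator = "sys/iterator";

class C {
  [iterator]() { return "C iterates"; }
}

// --- module B (hypothetical) ---
// B never imports A; it only agrees on the string contents.
const iter = "sys/iterator";

function callIterator(obj) {
  return obj[iter]();
}

console.log(callIterator(new C())); // "C iterates"
```

Because the key is a plain string, equality is by contents, so neither module depends on the other or on a shared third module.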
I like this. Cc'ing Dave and Sam to get to top of their queues.
On Fri, Aug 2, 2013 at 12:45 PM, Kevin Smith <zenparsing at gmail.com> wrote:
class C { "sys/iterator"() { } "foo/userDefined"() { } }
How do you distinguish these from plain strings? Or are you not distinguishing them, and just assuming that we add a way to use string literals as method names?
On 8/2/2013 1:21 PM, Tab Atkins Jr. wrote:
How do you distinguish these from plain strings? Or are you not distinguishing them, and just assuming that we add a way to use string literals as method names?
I think he's saying they would just be plain strings. With current plans, you'd have to do
class C {
["sys/iterator"]() {}
}
But I could see allowing string literals in property names.
You can already use string literals as property names:
class C {
"sys/iterator"() {}
}
Oh right, method syntax threw me off, but this is completely valid.
\o/
One thing I suggested a while back may have an interesting property for private symbol sharing. What if private symbols were of "function" type instead of "object", where given a symbol 'foo'
foo(object, arg1, arg2, …)
desugars to
object[foo](arg1, arg2, …)
That would make symbol sharing as legit as sharing regular functions. In addition it would allow users to define / consume them in functional or OOP style depending on their preferences.
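The desugaring can be emulated today with a small wrapper, as a sketch only. `callableSymbol` is a hypothetical helper, and because a real symbol is not callable, the wrapper exposes the underlying symbol as a `key` property (an assumption of this sketch, not part of the suggestion above).

```javascript
// Hypothetical helper: returns a function foo such that
// foo(object, ...args) behaves like object[foo.key](...args).
function callableSymbol(description) {
  const key = Symbol(description);
  const fn = (object, ...args) => object[key](...args);
  fn.key = key; // exposed so objects can define the method
  return fn;
}

const foo = callableSymbol("foo");

const point = {
  x: 2,
  [foo.key](dx) { return this.x + dx; },
};

// Functional style and method style reach the same method.
console.log(foo(point, 3));      // 5
console.log(point[foo.key](3));  // 5
```

Sharing `foo` is then no different from sharing any other function value.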
This point is important enough that I'm resending as the start of a new thread.
On Wed, Jul 31, 2013 at 7:50 AM, Mark S. Miller <erights at google.com> wrote: