Ducks, Rabbits, and Privacy
Why can we express in a property descriptor the notion of writable, configurable or enumerable but not private?
Also, this may be off topic, but for a getter/setter property foo, you have to implement a non-enumerable _foo property yourself to actually have some storage, which is not particularly convenient. A solution to that would be welcome! One way would be to add a local variable named after the property to the scope of the getter/setter while it is called on an object. That would certainly encourage following encapsulation rather than accessing a private property directly, which would still be possible.
My .02
Thanks,
Benoit
On 22/01/2013 07:31, Benoit Marchant wrote:
Why can we express in a property descriptor the notion of writable, configurable or enumerable but not private?
Because strings are forgeable, meaning that someone you may not trust can read in your code or guess (maybe with low probability) the name of the property making it not-so-private-after-all.
Also, could be off topic, but the fact that for a getter/setter foo property, you have to implement yourself a non-enumerable _foo property to actually have some storage, is not particularly convenient.
That's a terrible idea missing the point of accessors which are expected to encapsulate the state they're dealing with, not force you to put it at everyone's sight. You don't have to do that ...
A solution to that would be welcome! If a local variable following the name of the property was added to the scope of the getter/setter while it's called on an object could be one way, it would certainly encourage following encapsulation rather than accessing a private property directly, which would still be possible.
... and you're providing the solution yourself: getters and setters can share a variable in a common scope. The language can't decide to add its own variable, because it could collide with or shadow an existing variable, making the code much harder to understand and reason about. So you have to create the variable yourself.
Interestingly, if instead of a non-enumerable _foo property a private symbol was used, getters and setters would be the property-wise equivalent of proxies; the private symbol playing the role of the target and the publicly exposed string property being the proxy.
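The shared-scope approach described in this reply can be sketched as follows. This is only an illustration: the helper name defineEncapsulated is hypothetical and does not appear anywhere in the thread.

```javascript
// Hypothetical helper (not from the thread): defines an accessor property
// whose storage lives in a closure variable instead of a "_foo" property.
function defineEncapsulated(obj, name, initialValue) {
  let stored = initialValue; // reachable only from the getter and setter below
  Object.defineProperty(obj, name, {
    get() { return stored; },
    set(value) { stored = value; },
    enumerable: true,
    configurable: true
  });
  return obj;
}

const point = defineEncapsulated({}, "foo", 1);
point.foo = 42; // goes through the setter; no "_foo" property is created
```

The object ends up with a single own property, foo, and the backing value cannot be reached by any property lookup.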
Kevin Smith wrote:
In ES5, there is no distinction between "public" and "private" data within an object. If you want to create private data, you must do so using a closure. All private data is therefore "external" to the object in question. The data does not "follow" the object around. This is a simple model. It is easy to reason about. It's not clear that this model is insufficient for our needs.
I have enjoyed the many conversations revolving around WeakMaps & symbols very much, and in the scheme of things I'll be just fine without private symbols (I've made it without them so far). However, I do disagree that it's easy to reason about data in ES5. I do think that it's insufficient, and I would point to the spec itself as evidence that the concept of "private" data significantly improves reasoning when dealing with objects.
The ES5 spec describes built-ins as having "private" data associated with each object (you know those [[InternalProperties]]). It would be perfectly possible for the spec to describe the behavior of built-ins without referring to "private" properties, but the spec uses this model because it is the most convenient model. Yet, ES5 does not provide any means within the language for developers to tie "private" data to objects. (It's always been a strange facet of the ES language that built-ins are able to be more magical than user-defined objects.)
If it was such a simple model to do without "private" data, my questions would be (1) Why does the spec itself not do without the concept when describing built-ins? (2) Why do so many developers use underscore properties to simulate private data? (3) Why do other developers abandon prototypal inheritance and build their objects inside constructors so that they can have private data?
It's my opinion that saying that closures should be used for an object to hold onto private data, as you are advocating, is in conflict with ES's prototypal model of inheritance. Methods cannot both (A) be on a constructor's prototype and (B) live inside the scope used to house private data. The developer is forced to make a decision: Do I want my methods to be defined on the constructor's prototype or do I want them to have access to private data?
The fact that ES built-ins' methods are defined on the prototype and have access to private data seems to indicate that the ideal model would allow both. WeakMaps do, to a degree, permit both, but it's just a hack. If it wasn't, the ES spec itself would describe built-ins' private data as living outside the object itself in a WeakMap. The spec doesn't do that because it's unnatural. Private symbols provide a mechanism for tying private data to objects in the most natural and reasonable way.
Nathan
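The trade-off between (A) and (B), and the way a WeakMap sidesteps it, might look like the sketch below. The names ClosureCounter and MapCounter are made up for illustration.

```javascript
// Option 1: closure privacy. The method must be created inside the
// constructor, once per instance, so it cannot live on the prototype.
function ClosureCounter() {
  let count = 0; // private, reachable only from functions defined here
  this.increment = function () { return ++count; };
}

// Option 2: WeakMap privacy. The method lives on the prototype and looks
// up the private value through a map confined to this scope.
const counts = new WeakMap();
function MapCounter() {
  counts.set(this, 0);
}
MapCounter.prototype.increment = function () {
  const next = counts.get(this) + 1;
  counts.set(this, next);
  return next;
};

const a = new ClosureCounter();
const b = new ClosureCounter();
const c = new MapCounter();
const d = new MapCounter();
```

With the closure version every instance carries its own copy of increment; with the WeakMap version all instances share one method, at the cost of the private data living outside the object.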
The fact that ES built-ins' methods are defined on the prototype and have access to private data seems to indicate that the ideal model would allow both. WeakMaps do, to a degree, permit both, but it's just a hack. If it wasn't, the ES spec itself would describe built-ins' private data as living outside the object itself in a WeakMap. The spec doesn't do that because it's unnatural. Private symbols provide a mechanism for tying private data to objects in the most natural and reasonable way.
+1 to this and everything else Nathan has said. Watching all this intense back and forth, there are a lot of good points, some of which almost convince me that weak maps are sufficient and private symbols are unnecessary. But when I step back for even a minute, as a developer private symbols are exactly what I want, and weak maps are an un-ergonomic hack.
To put it another way, private symbols have been very easy for me to explain to other developers. I simply give them the prototype example with unsecured underscored properties, then the closure example with bad memory and performance characteristics, then show them the private symbol version which solves both problems in a pretty elegant way with syntax that makes perfect intuitive sense. They provide a method of storing private state on the object in a way that fits with the object model we know and love, as exemplified by the many [[InternalProperty]]s of the spec or the underscored-properties of ES5-era prototypal JavaScript.
So I don't have any overwhelming technical or security arguments, but just from a developer ergonomics and pedagogy perspective, I <3 private symbols and wish I could banish the grim reaper that seems to be hanging over them.
I also agree with everything that Nathan said.
To clarify, there's Symbols and then there's private Symbols. I don't think anyone in TC39 is suggesting the removal of Symbols in general. Private Symbols have a much more specific set of use cases than do just Symbols in general, and regular Symbols will accomplish the goal of encapsulation. Regular symbols are only enumerated by a new ES6 function and are unique. Currently, the only difference between a normal symbol and a private Symbol is that private symbols are not enumerated by getOwnKeys and they are slated to eventually be awkward to use with proxies. Otherwise a normal (aka Unique) Symbol works exactly the same.
On Tue, Jan 22, 2013 at 8:13 AM, Brandon Benvie <brandon at brandonbenvie.com> wrote:
I also agree with everything that Nathan said.
To clarify, there's Symbols and then there's private Symbols. I don't think anyone in TC39 is suggesting the removal of Symbols in general. Private Symbols have a much more specific set of use cases than do just Symbols in general, and regular Symbols will accomplish the goal of encapsulation.
That's like saying that ES3 "var" accomplishes the goal of lexical scoping. Never mind that it's broken; it seems to work for most code, and we don't care otherwise.
We can avoid needless terminological debate by explaining unique symbols as what they are -- a way to avoid needless name collisions when naming properties.
On Tue, Jan 22, 2013 at 11:13 AM, Brandon Benvie <brandon at brandonbenvie.com> wrote:
I also agree with everything that Nathan said.
To clarify, there's Symbols and then there's private Symbols. I don't think anyone in TC39 is suggesting the removal of Symbols in general. Private Symbols have a much more specific set of use cases than do just Symbols in general, and regular Symbols will accomplish the goal of encapsulation.
No, symbols accomplish the goal of stratification [1] (I've used this terminology in the past and was corrected but from a mathematical logic standpoint this property is precisely what symbols give us). Unique symbols fail as a means of encapsulation -- and this is the whole point of private symbols.
Regular symbols are only enumerated by a new ES6 function and are unique. Currently, the only difference between a normal symbol and a private Symbol is that private symbols are not enumerated by getOwnKeys and they are slated to eventually be awkward to use with proxies. Otherwise a normal (aka Unique) Symbol works exactly the same.
You're awkwardly (and derisively) describing what could more concisely be called "encapsulation". This is the difference between unique and private symbols, as Nathan expressed beautifully.
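To make the distinction concrete, here is a small sketch of why a unique symbol avoids name collisions but does not encapsulate. It uses the API as it eventually shipped in ES2015, which differs from the drafts discussed in this thread: symbols are created with Symbol() (no new), and the reflection function is Object.getOwnPropertySymbols rather than the getOwnKeys mentioned above.

```javascript
const secret = Symbol("balance");
const account = { owner: "alice" };
account[secret] = 100;

// String-key reflection does not see the symbol-keyed property...
const names = Object.getOwnPropertyNames(account);

// ...but symbol reflection hands the key to anyone who holds the object.
const [leaked] = Object.getOwnPropertySymbols(account);
account[leaked] = 0; // outside code has now overwritten the "hidden" value
```

A private symbol, as proposed, would be omitted from that reflection step, which is the encapsulation property being argued for.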
It's my opinion that saying that closures should be used for an object to hold onto private data, as you are advocating, is in conflict with ES's prototypal model of inheritance. Methods cannot both (A) be on a constructor's prototype and (B) live inside the scope used to house private data. The developer is forced to make a decision: Do I want my methods to be defined on the constructor's prototype or do I want them to have access to private data?
That used to worry me, too, when I came up with my pattern for implementing (TypeScript-style) private slots via bind[1], but currently I think it is inherent (no pun intended;) in private data.
You could have your methods on the prototype and extend/bind them from the constructor to give them access to private data.
However, "private" here means instance-private, so if you have a method that needs access to instance-private data, what is that going to do on the prototype? You could store it there, but have to remember to provide it with that instance-private data when borrowing it.
This is different from class-private static data, and also from "protected" slots, where each object or each method in the prototype chain is supposed to have access. I suppose those could also be modeled using private symbols - private symbols do more than just (instance-)private slots.
Claus
This is very well said. But your assertion that the external model is "unnatural" does not necessarily hold. Again, let's look at this private syntax and expansion using WeakMaps:
This syntax presents an "external" model of private data (and methods!). Do you think this expression of external private relationships is unnatural? In what way is this expression insufficient?
+1 to this and everything else Nathan has said. Watching all this intense back and forth, there are a lot of good points, some of which almost convince me that weak maps are sufficient and private symbols are unnecessary. But when I step back for even a minute, as a developer private symbols are exactly what I want, and weak maps are an un-ergonomic hack.
Are you sure they are what you want? When you attempt to create private methods using private symbols, you will break the inheritance mixin pattern. Is that what we want?
A weakmap-based solution can be made as ergonomic as you like. Without syntax, it can be a simple function call (like Juan Dopazo's), and with syntax...
On Tue, Jan 22, 2013 at 4:03 PM, Kevin Smith <khs4473 at gmail.com> wrote:
+1 to this and everything else Nathan has said. Watching all this intense back and forth, there are a lot of good points, some of which almost convince me that weak maps are sufficient and private symbols are unnecessary. But when I step back for even a minute, as a developer private symbols are exactly what I want, and weak maps are an un-ergonomic hack.
Are you sure they are what you want? When you attempt to create private methods using private symbols, you will break the inheritance mixin pattern. Is that what we want?
The idea of a WeakMap desugaring appeals to me, though I don't understand how this desugaring solves the mixin inheritance problem? I'm probably missing something -- I can't tell how prototypal inheritance is intended to work, if at all, with private properties. Would it disappear? If so, this would break a core tenet of the object model, which seems more detrimental than breaking a very specific pattern (a diamond-shaped antipattern?) in one specific case. Naive mixins are naive -- this seems like an acceptable trade-off, especially when there are patterns with more integrity available.
A weakmap-based solution can be made as ergonomic as you like. Without syntax, it can be a simple function call (like Juan Dopazo's), and with syntax...
If a desugaring could be made to support prototypal inheritance this would be ideal. But assuming private symbols cannot be intercepted by proxies would there be an observable difference between private symbols and this kind of weakmap desugaring (including some kind of gc hint, per the gist)? If none, and Allen finds private symbols helpful for specification purposes, is it worth bothering with a desugaring at all (including whatever it entails, like WeakMap gc hints and any other baggage)?
A Duck/Rabbit Proposal:
- There is only one kind of Symbol
- Private data can be accessed via Object.getPrivate and Object.setPrivate
The goal is to provide a base-level API from which higher-level abstractions can be built.
// === Basic example ===
var symbol = new Symbol(), obj = {};
// Setting private data:
Object.setPrivate(obj, symbol, {});
// Getting private data:
var data = Object.getPrivate(obj, symbol);
Note that this does not preclude the possibility of full private symbols in ES7.
More at:
My initial reaction:
- Hosting getPrivate/setPrivate on Object seems quite strange to me.
- it seems unrealistic to me to ask developers to write so much code every time they want to use their own private instance variable. The notion of Private should be specified once for an instance variable, not every time it's used internally
- I haven't followed the whole thread and I'm not a language developer. I never felt that I needed a private data structure to hold internal "stuff". One can already do that with a closure and a WeakMap manually if one really wants it
- I do want private instance variables that allow me to truly have encapsulation.
Thanks,
Benoit
- Hosting getPrivate/setPrivate on Object seems quite strange to me.
Reserve judgement for a bit on this...
- it seems unrealistic to me to ask developers to write so much code every time they want to use their own private instance variable. The notion of Private should be specified once for an instance variable, not every time it's used internally
Sure - eventually. But this proposal tries to take a smaller step in that direction, without committing too much either way. Also, see the "simple wrapping API" example in the gist. The amount of code is quite small if such a wrapping is used:
let priv = makePrivate();
class Purse {
  constructor() {
    priv(this, { balance: 0 });
  }
  getBalance() { return priv(this).balance; }
  makePurse() { return new Purse; }
  deposit(amount, srcPurse) {
    priv(srcPurse).balance -= amount;
    priv(this).balance += amount;
  }
}
Yes, you have to use the "priv" function. But on the other hand you don't need to create (say) 10 private symbols for 10 properties and use ugly square brackets.
And of course syntax will make all of this more concise in ES7. And when it comes time to design ES7, the committee will have more real world usage to inform the syntax.
- I haven't followed the whole thread and I'm not a language developer. I never felt that I needed a private data structure to hold internal "stuff". One can already do that with a closure and a WeakMap manually if one really wants it
Again, I think it might be more convenient to store properties in one private object, rather than maintain several private symbols. But that said, this proposal will allow you to use as many separate symbols as you like.
- I do want private instance variables that allow me to truly have encapsulation.
Well here ya go! : )
Think of this proposal as a max-min version of privacy. It leaves WeakMap as a pure data structure with specific GC semantics (like Allen wants), while also providing the ability to do efficient rights-amplification (like Mark wants). It simplifies the concept of Symbol (there is only one "kind"). And it leaves a large degree of freedom for future improvements.
Hi Kevin, thanks for pulling this code example out of the gist and posting separately. Looking at it only in context before, for some reason I hadn't realized how beautiful this is. To support this pattern, your makePrivate() could be defined in terms of either private symbols or weakmaps, right?
Given how concise and beautiful this is, even if this is defined in terms of private symbols, I agree this looks much better than the square bracket syntax for accessing private fields. It also looks good enough that the hypothetical ES7 syntactic support doesn't look much better -- perhaps not better enough to be worth adding more sugar. As you say, this will give us enough experience with a usable privacy syntax that we can make a more informed choice for ES7 when it comes to that.
Why would you use a square bracket notation rather than a . property access notation?
[] is typically only used when the property name is in a variable, which is not the case when you write your own object.
Benoit
On Wed, Jan 23, 2013 at 8:59 AM, Mark S. Miller <erights at google.com> wrote:
Hi Kevin, thanks for pulling this code example out of the gist and posting separately. Looking at it only in context before, for some reason I hadn't realized how beautiful this is. To support this pattern, your makePrivate() could be defined in terms of either private symbols or weakmaps, right?
Given how concise and beautiful this is, even if this is defined in terms of private symbols, I agree this looks much better than the square bracket syntax for accessing private fields. It also looks good enough that the hypothetical ES7 syntactic support doesn't look much better -- perhaps not better enough to be worth adding more sugar. As you say, this will give us enough experience with a usable privacy syntax that we can make a more informed choice for ES7 when it comes to that. Thanks!
FYI, this is essentially identical to the 'Confidence' abstraction I developed for domado.js in Caja. Perhaps the choice could be further informed by looking at how it's worked out there.
On Wed, Jan 23, 2013 at 9:45 AM, Kevin Reid <kpreid at google.com> wrote:
FYI, this is essentially identical to the 'Confidence' abstraction I developed for domado.js in Caja. Perhaps the choice could be further informed by looking at how it's worked out there.
Perhaps I should have included a link: code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/plugin/domado.js?spec=svn5223&r=5223#359
The idea is that 'Confidence' introduces a 'class with private fields' as if in Java: each object which has a private state record is considered to be an instance. The private record is used identically to Kevin Smith's examples, but my analogue of getPrivate on a new object fails hard rather than creating one — analogous to a ClassCastException.
Perhaps I should have included a link:
The idea is that 'Confidence' introduces a 'class with private fields' as if in Java: each object which has a private state record is considered to be an instance. The private record is used identically to Kevin Smith's examples, but my analogue of getPrivate on a new object fails hard rather than creating one — analogous to a ClassCastException.
I apologize for being lazy, but could you provide an example of this being used and not just the implementation?
Hi Kevin, thanks for pulling this code example out of the gist and posting separately. Looking at it only in context before, for some reason I hadn't realized how beautiful this is. To support this pattern, your makePrivate() could be defined in terms of either private symbols or weakmaps, right?
Sure - but note that the implementation in the gist doesn't use either, per se. It uses a new base-level API which provides the ability to get or set private data for any (object, symbol) pair. Advantages of this approach are:
- No need for special-case "private" object slots.
- Only one kind of Symbol.
- We can leave WeakMap as a strictly ephemeron data structure, like Allen wants. No GC hint required.
I find this to be an intriguing "middle way". Here is the implementation in full:
function makePrivate() {
  // Create a symbol that names the private data
  var symbol = new Symbol();
  // Create a function for accessing the private data on some object
  var priv = (obj, ...args) => {
    if (args.length === 0) return Object.getPrivate(obj, symbol);
    else return Object.setPrivate(obj, symbol, args[0]);
  };
  // Return the symbol so that we can share it...
  priv.symbol = symbol;
  return priv;
}
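Since Object.getPrivate and Object.setPrivate are only proposed here, the same priv(...) calling convention can be tried today with a WeakMap behind it, as Mark suggests is possible. The sketch below is one such stand-in, not the gist's implementation: it drops the shared symbol, and returning the stored value from the setter form is an assumption, since the thread does not say what setPrivate returns.

```javascript
// WeakMap-backed stand-in for makePrivate (an assumption, not the gist's code).
function makePrivate() {
  const records = new WeakMap(); // confined to this closure
  return function priv(obj, ...args) {
    if (args.length === 0) return records.get(obj); // getter form
    records.set(obj, args[0]);                      // setter form
    return args[0]; // assumed return value
  };
}

const priv = makePrivate();

class Purse {
  constructor() { priv(this, { balance: 0 }); }
  getBalance() { return priv(this).balance; }
  deposit(amount, srcPurse) {
    priv(srcPurse).balance -= amount;
    priv(this).balance += amount;
  }
}

const src = new Purse();
const dst = new Purse();
priv(src).balance = 10; // only code holding priv can do this
dst.deposit(4, src);
```

Code that does not hold a reference to priv sees two purses with no own properties at all.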
On Wed, Jan 23, 2013 at 11:15 AM, Russell Leggett <russell.leggett at gmail.com> wrote:
Perhaps I should have included a link:
The idea is that 'Confidence' introduces a 'class with private fields' as if in Java: each object which has a private state record is considered to be an instance. The private record is used identically to Kevin Smith's examples, but my analogue of getPrivate on a new object fails hard rather than creating one — analogous to a ClassCastException.
I apologize for being lazy, but could you provide an example of this being used and not just the implementation?
All of the examples within Caja are rather hairy, so I'll translate the above Purse-using-makePrivate example into the style which I would write it using Confidence. Note that this is code that runs now, under ES5; if I had left in the class syntax that was used in the original then it would have been identical except for the different choices of names.
If you would like me to discuss an actual example from Caja instead, feel free to ask.
var PurseConf = new Confidence('Purse');
var m = PurseConf.protectMethod;
var p = PurseConf.p;
function Purse() {
  PurseConf.confide(this);
  p(this).balance = 0;
}
Purse.prototype.getBalance = function() {
  return p(this).balance;
};
Purse.prototype.makePurse = function() { return new Purse; };
Purse.prototype.deposit = m(function(amount, srcPurse) {
  p(srcPurse).balance -= amount;
  p(this).balance += amount;
});
Note that the deposit method's protectMethod wrapper ensures that it will not be invoked with a bogus 'this', which would otherwise allow srcPurse to be drained to nowhere by crashing on the second line; this is not actually useful here since purses may be unreferenced, but in other cases such a precondition may be important.
Note that when both protectMethod and p are used, there is a redundant WeakMap lookup. An alternate design would be for the protectMethod wrapper to pass an additional argument to the wrapped function which is the private-state record. This could be considered to have the advantage of encouraging defining explicit operators on the private state (wrapped functions) rather than just 'pulling it out of the object'.
I am not arguing that something like this is the right abstraction for private state in ES6; only, given that the idea has arisen independently, I note that we have some prior experience with it, and it has turned out mostly all right. However, ignoring efficiency of implementation on current ES5, I would myself rather see a mechanism which did not have any private-state-record object, but rather had a separate 'symbol' object identifying each 'private property'; this has the advantage of ensuring the smallest 'scope' of the access to private state. For example, when a 'class' has private state and it has a 'subclass' which has additional private state, the private-state-record pattern encourages the subclass to use the same record (if it is defined in the same place and so has access) which, besides increasing the chance of name collisions, also means that the subtype's methods do not automatically fail when applied to instances of the supertype and so may have unintended consequences.
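For readers who want to run the Purse example without Caja, the behavior described above (confide, p that fails hard, and protectMethod) can be approximated with a WeakMap. The class below is a hypothetical stand-in named MiniConfidence. It is not Caja's Confidence code and only mirrors what this message describes.

```javascript
// Hypothetical stand-in for the abstraction described above (not Caja's code).
class MiniConfidence {
  constructor(typename) {
    const records = new WeakMap();
    // Register obj as an instance and give it an empty private record.
    this.confide = (obj) => { records.set(obj, {}); };
    // Return the private record, or fail hard if obj was never confided.
    this.p = (obj) => {
      if (!records.has(obj)) throw new TypeError("not a " + typename);
      return records.get(obj);
    };
    // Wrap a method so that it refuses to run with a foreign `this`.
    this.protectMethod = (method) => {
      const p = this.p;
      return function (...args) {
        p(this); // throws before the body runs if `this` is bogus
        return method.apply(this, args);
      };
    };
  }
}

const PurseConf = new MiniConfidence("Purse");
const m = PurseConf.protectMethod;
const p = PurseConf.p;

function Purse() {
  PurseConf.confide(this);
  p(this).balance = 0;
}
Purse.prototype.deposit = m(function (amount, srcPurse) {
  p(srcPurse).balance -= amount;
  p(this).balance += amount;
});

const from = new Purse();
const to = new Purse();
p(from).balance = 5;
to.deposit(2, from);
```

Calling deposit with a `this` that was never confided throws before the first line of the body runs, so the source purse is not drained.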
Why would you use a square bracket notation rather than a . property access notation?
[] is typically only used when the property name is in a variable, which is not the case when you write your own object.
True - but for symbols, your only option is square brackets:
var sym = new Symbol(), obj = {};
obj[sym] = 42;
So if you go the private symbols route, then you're going to have lots of square brackets.
On Jan 24, 2013, at 5:47, Kevin Smith <khs4473 at gmail.com> wrote:
Why would you use a square bracket notation rather than a . property access notation?
[] is typically only used when the property name is in a variable, which is not the case when you write your own object.
True - but for symbols, your only option is square brackets:
var sym = new Symbol(), obj = {}; obj[sym] = 42;
So if you go the private symbols route, then you're going to have lots of square brackets.
Not in the case where you would use a private property like a regular one if the language were to offer to tag properties as private, right?
I guess I don't quite understand why it seems contentious to add a "private" property to property descriptors which already "reserve" properties like "value", "enumerable" or "writable".
"private" is a meta description of a property like "value", "enumerable" or "writable".
That feels a more natural extension than adding class to the language.
Since we are on the topic of adding more property descriptors, one thing we played with in Montage is the notion of a "distinct" property descriptor. One amazing aspect of ECMAScript's prototypal inheritance is its ability to let you set default values on the objects that are used as prototypes for others. This is a one-time operation, compared to setting initial default state per instance creation, be it in a constructor or an init method. It's really efficient in terms of memory footprint as well, and the prototype lookup is so optimized now that it works well. There is, however, a big problem with mutable objects like arrays, because if the array is on the prototype, it's shared by all instances. When that's what you want, and sometimes you do, great. But if what you had in mind was that each object should have its own array for that property, debugging the first time is fun! Yes, you can do that in a constructor or init method. But another way is to add to the property descriptor a property "distinct": true, which means that each object inheriting from that object would get its own copy of that object at creation time.
One problem we found working on that is that, unfortunately, the language lets you add these properties to the object used as a property descriptor, but they are lost after that, meaning that you need to keep a parallel storage to keep track of them.
All this is more declarative than imperative, since you can do it yourself in a constructor/init method, but it adds semantics and cuts down the code to write, while being executed in native code by the language itself.
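Both observations, the shared-array hazard and the loss of extra descriptor fields, can be reproduced in a few lines. The helper createWithDistinct is a hypothetical emulation written for this sketch, not Montage's implementation, and it assumes the distinct values are arrays.

```javascript
const proto = {};
Object.defineProperty(proto, "items", {
  value: [],
  writable: true,
  enumerable: true,
  configurable: true,
  distinct: true // non-standard field; the engine silently ignores it
});

// The extra field does not survive: only standard attributes come back.
const desc = Object.getOwnPropertyDescriptor(proto, "items");

// The hazard: both objects inherit the very same array.
const first = Object.create(proto);
const second = Object.create(proto);
first.items.push("x"); // mutates the array stored on proto

// Hypothetical emulation of "distinct": copy the named values per instance.
// The list of names plays the role of the parallel storage mentioned above.
function createWithDistinct(base, distinctNames) {
  const obj = Object.create(base);
  for (const name of distinctNames) {
    obj[name] = base[name].slice(); // shallow copy; assumes array values
  }
  return obj;
}

const third = createWithDistinct(proto, ["items"]);
third.items.push("y"); // only affects third's own copy
```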
Just wanted to share that.
Thanks!
Benoît
Benoit Marchant wrote:
I guess I don't quite understand why it seems contentious to add a "private" property to property descriptors which already "reserve" properties like "value", "enumerable" or "writable".
"private" is a meta description of a property like "value", "enumerable" or "writable".
That feels a more natural extension than adding class to the language.
As David wrote, this does not work in a dynamic language.
function generic_get(obj, prop) { return obj[prop]; }
obj = {private foo: 42, get bar() { return this.foo; }}; // or equivalent class syntax
// elsewhere
var steal = generic_get(obj, 'foo');
How do you enforce that only bar can access foo from obj? A private attribute on a property with a public string-equated name 'foo' does not help.
The mistake is treating name privacy as a property attribute. Privacy is not an attribute of property descriptors, it's a restriction on property names. It is not a static restriction in any sense, rather a capability. If you keep the private symbol or weak map confined, privacy is assured.
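The capability argument can be checked with code that runs today. In this sketch the WeakMap plays the role of the confined capability, and the variable and function names other than generic_get are invented for the example.

```javascript
function generic_get(obj, prop) { return obj[prop]; }

// A string-named "private" property is reachable by anyone who knows the name.
const exposed = { _foo: 42, get bar() { return this._foo; } };
const stolen = generic_get(exposed, "_foo");

// Capability version: the WeakMap never leaves makeObj, so only the
// accessor defined inside can reach the value.
function makeObj() {
  const hidden = new WeakMap();
  const obj = { get bar() { return hidden.get(this); } };
  hidden.set(obj, 42);
  return obj;
}

const obj = makeObj();
const viaAccessor = generic_get(obj, "bar"); // allowed: goes through bar
const attempt = generic_get(obj, "foo");     // nothing to find under any name
```

No property name, guessed or enumerated, leads to the value in the second version, because the value is not stored under a name on the object at all.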
Thanks Brendan, that makes sense, but that's more than what I think I'm looking for. What I'm looking for is a way to store a value in an object's property that can only be accessed, from a property access standpoint, by the object itself. My only goal is to make sure that outside code can't break encapsulation, for code robustness/quality reasons.
Isn't it possible internally to allow a property access only by "this" ?
Benoit
Isn't it possible internally to allow a property access only by "this" ?
Ah-hem:
priv(this).anyPropertyYouWantPlaya;
: )
Using the definition of "priv" from previous messages in this thread.
Benoit Marchant wrote:
Isn't it possible internally to allow a property access only by "this" ?
No. For one thing, the design has to include class-private instance variables, not instance-private, so you need other.foo as well as this.foo (consider private x and y for Point2D add method).
Also, again, we're not enforcing privacy at the descriptor level with some kind of access control monitor, but rather through "names" (whether symbols or weakmaps) as capabilities. This avoids confused deputy attacks.
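The Point2D case mentioned above could be sketched like this, using a WeakMap as the capability. The field layout and the toString method are assumptions added for the example.

```javascript
const coords = new WeakMap(); // shared by all Point2D methods, hidden from others

class Point2D {
  constructor(x, y) { coords.set(this, { x, y }); }
  add(other) {
    const a = coords.get(this);
    const b = coords.get(other); // class-private: reads another instance's state
    return new Point2D(a.x + b.x, a.y + b.y);
  }
  toString() {
    const { x, y } = coords.get(this);
    return "(" + x + ", " + y + ")";
  }
}

const sum = new Point2D(1, 2).add(new Point2D(3, 4));
```

An "only this may access" rule would make add impossible to write, since it has to read the x and y of other.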
== Duck, or Rabbit? ==
The debate on whether to express encapsulated state using WeakMaps or private symbols reminds me of this famous image:
upload.wikimedia.org/wikipedia/commons/4/45/Duck-Rabbit_illusion.jpg
Is private state a duck (WeakMap) or is it a rabbit (private symbol keyed object property)? Well, both and neither.
A private symbol is a relationship which conceptualizes the private data "inside" the object, and which favors time over space. After all, we can leak garbage by simply dropping private symbol variables.
A WeakMap is a relationship which conceptualizes the private data "outside" of the object, and which favors space over time. If the WeakMap is unreachable, then we can collect the private data.
In either case, we are dealing with two separate, orthogonal issues:
- Where does the data live?
- Time or space?
These questions should be resolved independently. Doing so will allow us to avoid the duck vs. rabbit dilemma.
== Where Does the Data Live? ==
In ES5, there is no distinction between "public" and "private" data within an object. If you want to create private data, you must do so using a closure. All private data is therefore "external" to the object in question. The data does not "follow" the object around. This is a simple model. It is easy to reason about. It's not clear that this model is insufficient for our needs.
Introducing an additional abstraction which places private data inside of the object represents a fundamental change in the ES conceptual model. We should strive to avoid such additions to object semantics, if other means are possible.
== Time or Space? ==
The decision of whether to implement private data relationships inside of an ES engine as links from one object to another, or as an entry in a supplemental data structure, is a question of optimization, not conceptualization. A user (or the engine itself) should be able to decide on the characteristics of this optimization without affecting the conceptualization of the relationship.
== Conclusion ==
Private data should be represented as "outside" of the object in question. Syntax can be added in future versions of the language to make the expression of such private relationships more concise. If engines are not capable of optimizing the relationship appropriately for time or space, there should be a way for the user to select the optimization preference, either through a WeakMap option or through a separate class with an identical API.