Catchall proposal
On May 6, 2009, at 8:22 AM, David-Sarah Hopwood wrote:
The ability to define 'has', 'get', and 'invoke' handlers for the case where the property exists, is definitely needed IMHO. Suppressing the visibility of a property is potentially as useful as creating a "virtual" property, especially for the secure JS subsets.
Good feedback. What do other secure subsetters say?
(Defining both the "always" and "missing" versions of a handler in the descriptor is not useful, and could be an error.)
Does the has : get :: add : set approach have any advantages in your
view? I could see allowing both has and get, and add and set.
This means that "throw e", where e might have come from an unknown source, has to be avoided in a handler in favour of something like "throw e === DefaultAction ? new Error() : e". Yuck.
The same problem exists for a return value encoding "DefaultAction".
There's always a pigeon-hole problem somewhere, but exceptions are
less used than return values, and a singleton exception object
constant is less likely to be misused, as well as being efficient to
handle in the VM.
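
To make the trade-off concrete, here is a minimal sketch of the exception-based encoding. It assumes a frozen singleton named DefaultAction and a helper, getWithCatchall, that stands in for what the VM would do; both names and the helper's shape are illustrative, not part of any proposal text.

```javascript
// Hypothetical singleton; in the ES4-era idea this would be a constant binding.
const DefaultAction = Object.freeze({ toString() { return "DefaultAction"; } });

// Stand-in for the VM: call the 'get' catchall, and fall back to the
// ordinary lookup only when the handler throws the singleton.
function getWithCatchall(obj, id, handler) {
  try {
    return handler.call(obj, id);
  } catch (e) {
    if (e === DefaultAction) return obj[id]; // default action
    throw e;                                 // any other exception propagates
  }
}

// The laundering step a handler needs before re-throwing an exception of
// unknown origin, so that it cannot request the default action by accident.
function safeRethrow(e) {
  throw e === DefaultAction ? new Error("unexpected DefaultAction") : e;
}

const target = { x: 1 };
const handler = function (id) {
  if (id === "virtual") return 42; // a "virtual" property
  throw DefaultAction;             // everything else: defer to the language
};

const virtualValue = getWithCatchall(target, "virtual", handler);
const plainValue = getWithCatchall(target, "x", handler);
```

With a return-value encoding the same identity check would move from the catch block to the return path, and a handler could then never return the marker as an ordinary value.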
Runaway prevention: should a catchall, while its call is active,
be automatically suppressed from re-entering itself for the given
id on the target object?
I think that all catchalls on a given object O, not just those for the same id, should be suppressed when handling a catchall for O. If you want the behaviour that would occur as a result of triggering a catchall for another property, then it is easy to inline that behaviour in the handler. But if you want to suppress the catchall behaviour for another property while in a handler, then it would be difficult to do so under the semantics suggested above.
Why would you want to suppress the catchall behavior for another
property while handling the first property? I'm looking for real-world
use-cases.
Both inlining and suppressing an otherwise-nesting catchall seem
plausible. Our experience with internal catchall APIs in SpiderMonkey
has favored suppressing (obj, id), not just obj.
One hard case is bootstrapping the Object and Function built-in
constructors lazily. You have to do both at the same time, since
Function.prototype's proto is Object.prototype but Object is an
instance of Function. Yet a script could mention either Object or
Function (or define or express a function) in either order. We inline
some logic to handle the coupled nature of the two, but we rely on the
suppression using (obj, id) to detect what action by the script
triggered the lazy initialization.
Either way could be "coded around", but the (obj, id) way let the
runtime share the suppression machinery in a detectable way for use by
others. I believe we have lazy DOM init code that always benefits from
not requiring inlining.
I know of no case where we want an otherwise-nesting catchall to be
suppressed.
(Also the per-object suppression is easier to specify; it just requires an [[InCatchall]] flag on each object.)
True (although optimizing VMs have few bits for such things and may
use a hash table, so storing the id too is less of a burden).
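
The difference between the two suppression rules can be sketched as follows. This is an emulation under stated assumptions, not SpiderMonkey's machinery: makeGetter, lazyInit, and the ids "A" and "B" (standing in for two coupled lazily initialized globals such as Object and Function) are all hypothetical.

```javascript
// Builds a 'get' entry point with one of the two suppression rules.
// perObject = true  : one flag per object, like [[InCatchall]]
// perObject = false : suppression keyed by (obj, id)
function makeGetter(perObject) {
  const active = new WeakMap(); // obj -> Set of ids whose catchall is running
  return function get(obj, id, handler) {
    let ids = active.get(obj);
    if (!ids) { ids = new Set(); active.set(obj, ids); }
    const suppressed = perObject ? ids.size > 0 : ids.has(id);
    if (suppressed) return obj[id]; // default lookup, no handler call
    ids.add(id);
    try {
      return handler(obj, id, get);
    } finally {
      ids.delete(id);
    }
  };
}

// Two coupled lazy values: initializing one asks for the other.
function lazyInit(log) {
  return function init(obj, id, get) {
    log.push(id); // records which ids actually reached the handler
    const other = id === "A" ? "B" : "A";
    const peer = get(obj, other, init);
    obj[id] = { name: id, peer };
    return obj[id];
  };
}

const logById = [];
const byId = makeGetter(false)({}, "A", lazyInit(logById));

const logByObject = [];
const byObject = makeGetter(true)({}, "A", lazyInit(logByObject));
```

Under (obj, id) suppression the nested request for "B" still reaches the handler, and the first log entry records which id triggered initialization. Under per-object suppression the handler would have to inline the work for "B" itself.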
David-Sarah Hopwood wrote:
Brendan Eich wrote: [...]
I finally found time to write up a proposal, sketchy and incomplete, but ready for some ever-lovin' es-discuss peer review ;-).
Catchalls are sometimes thought of as being called for every
access to a property, whether the property exists or not.
The ability to define 'has', 'get', and 'invoke' handlers for the case where the property exists, is definitely needed IMHO. Suppressing the visibility of a property is potentially as useful as creating a "virtual" property, especially for the secure JS subsets.
I agree however that this is only needed for some use cases, and for those cases in which it is not needed, it would be inconvenient (and less efficient) to require has/get/invoke handlers to perform the default action. Defining 'has' and 'hasMissing', 'get' and 'getMissing', etc. appears to solve this problem, and I think that the extra complexity is justified.
To flesh this out a bit more, I propose the following handlers:
  has(id)             hasMissing(id)
  get(id)             getMissing(id)
  set(id, val)        setMissing(id, val)
  invoke(id, args)    invokeMissing(id, args)
  delete(id)
  call(args)
  new(args)
'add' in Brendan's proposal is effectively renamed to 'setMissing', except that the initial value is passed to 'setMissing' just as it would be for 'set'.
'call' handles calls to the object using the function call syntax.
'new' handles calls to the object using 'new ...(...)'. (Note that keywords can be used as property names as of ES5.)
There is no need for a 'deleteMissing' handler.
It is an error (causing defineCatchAll to throw) if both 'foo' and 'fooMissing' are present for foo = {has, get, set, invoke}.
While a catchall is entered for object O, an O.[[InCatchall]] flag is set that suppresses all catchalls for O, i.e. reverting to the default ES5 behaviour.
Catchall handlers are called with 'this' bound to the object on which the catchall was triggered.
The 'DefaultAction' idea is not used.
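
As a rough illustration of the rules above, the following sketch emulates a few of the handlers with an ES2015 Proxy. It is an approximation under loud assumptions: the signature defineCatchAll(obj, handlers) is guessed, the function returns a wrapper rather than changing obj in place, and only has/get/set are wired up, since Proxy has no 'invoke' trap.

```javascript
// Hypothetical emulation; not the proposed built-in.
function defineCatchAll(obj, handlers) {
  // Defining both 'foo' and 'fooMissing' is an error.
  for (const name of ["has", "get", "set", "invoke"]) {
    if (name in handlers && (name + "Missing") in handlers) {
      throw new TypeError("both '" + name + "' and '" + name + "Missing' are defined");
    }
  }

  let inCatchall = false; // stands in for O.[[InCatchall]]

  // Runs a handler with the flag set, 'this' bound to the wrapper.
  function enter(handler, args) {
    inCatchall = true;
    try {
      return handler.apply(proxy, args);
    } finally {
      inCatchall = false;
    }
  }

  // Picks 'foo' (always) or 'fooMissing' (only when id is absent);
  // returns undefined while a catchall for this object is running.
  function pick(name, id) {
    if (inCatchall) return undefined;
    if (handlers[name]) return handlers[name];
    const missing = handlers[name + "Missing"];
    return missing && !(id in obj) ? missing : undefined;
  }

  const proxy = new Proxy(obj, {
    has(target, id) {
      const h = pick("has", id);
      return h ? Boolean(enter(h, [id])) : id in target;
    },
    get(target, id) {
      const h = pick("get", id);
      return h ? enter(h, [id]) : target[id];
    },
    set(target, id, val) {
      const h = pick("set", id);
      if (h) enter(h, [id, val]); else target[id] = val;
      return true;
    },
  });
  return proxy;
}

const o = defineCatchAll({ x: 1, secret: 2 }, {
  // Runs for every 'in' test, so it can hide an existing property.
  has(id) { return id !== "secret" && id in this; },
  // Runs only for absent ids, creating "virtual" properties.
  getMissing(id) { return "virtual:" + String(id); },
});
```

Inside the 'has' handler, the expression `id in this` re-enters the wrapper, but the flag is set, so it falls through to the default behaviour instead of recursing.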
On Wed, May 6, 2009 at 9:10 AM, Brendan Eich <brendan at mozilla.com> wrote:
On May 6, 2009, at 8:22 AM, David-Sarah Hopwood wrote:
The ability to define 'has', 'get', and 'invoke' handlers for the case where the property exists, is definitely needed IMHO. Suppressing the visibility of a property is potentially as useful as creating a "virtual" property, especially for the secure JS subsets.
Good feedback. What do other secure subsetters say?
Good thread! I only have time to skim it at the moment, but it is of central importance to improving the securability of ES. Thanks for putting a concrete catchall proposal out there and getting us started on this. This is important.
As for the particular point, I don't understand it. How would a secure subset use these hooks to suppress, for example, domNode.innerHTML or window.eval from being visible to "untrusted" code (i.e., code constrained to be in SES -- a std Secure EcmaScript to emerge from ADsafe, Jacaranda, dojox.secure, and Cajita) but not suppress its visibility to normal ES code? Especially if we wish the future of secure subsetting to be verification-based (as in Jacaranda, dojox.secure, and ADsafe) rather than translation-based (as in current Cajita)? If we can make verification work with adequate safety, I'm all for it. But I don't see how the proposal above helps, since a catchall cannot (and should not be able to) sense whether its caller is in SES or full ES.
Note that the translation-basis of Valija, WebSandbox, and FBJS2 is a distinct issue, since the goal there is to instantiate multiple emulated global environments within a single actual global environment. Catchalls may or may not help here as well, but the question is so different that I suggest we postpone it until we first understand the issues at the SES level.
On May 6, 2009, at 9:37 AM, Mark S. Miller wrote:
Note that the translation-basis of Valija, WebSandbox, and FBJS2 is a distinct issue, since the goal there is to instantiate multiple emulated global environments within a single actual global environment. Catchalls may or may not help here as well, but the question is so different that I suggest we postpone it until we first understand the issues at the SES level.
Let's cut to the chase if we can. I don't buy the SES first, since
it's not clear SES is usable but Web Sandbox and others like it are
already in use.
On Wed, May 6, 2009 at 10:11 AM, Brendan Eich <brendan at mozilla.com> wrote:
Let's cut to the chase if we can. I don't buy the SES first, since it's not clear SES is usable but Web Sandbox and others like it are already in use.
For the record, others like it are indeed already in use (FBJS[2?] on Facebook, Valija on YAP). AFAICT, WebSandbox itself isn't yet. This isn't quite fair as we intend to borrow many of their good ideas.
We've been using Cajita enough internally at the Caja project, and it is similar enough in flavor to E, that I am confident it is usable. Yes, the scale of these is infinitesimal compared to historic ES use. Based on the record to date, it is clear that full ES has not been usable as a secure language.
And in any case, you started this subthread by asking "What do other secure subsetters say?" SES is a subset. WebSandbox, Valija, and FBJS2 ideally aren't -- they are a virtualization/emulation of the "whole" language (given a suitable definition of "whole" -- Valija will do ES5-strict). These whole language emulations are not "secure", they are only sandboxed. Since the language being emulated is not secure, a faithful emulation of that language must preserve its insecurities.
Both the Cajita-like and Valija-like levels are important. For the Cajita-like level, the remaining pressing issue beyond ES5 is the whitelisting issue David-Sarah raises. For the Valija-like level, I think the most important enabler would be some kind of hermetic eval or spawn primitive for making a new global context (global object and set of primordials) whose connection to the world outside itself is under control of its spawner. With such a primitive, we would no longer need to emulate inheritance and mutable globals per sandbox.
Catchalls are likely to be a relevant enabler at both the Cajita-like and Valija-like levels. We can discuss both simultaneously if you like, but I suggest we would have a clearer discussion if we take these mostly in bottom up order.
On May 6, 2009, at 10:53 AM, Mark S. Miller wrote:
On Wed, May 6, 2009 at 10:11 AM, Brendan Eich <brendan at mozilla.com> wrote:
Let's cut to the chase if we can. I don't buy the SES first, since it's not clear SES is usable but Web Sandbox and others like it are already in use.
For the record, others like it are indeed already in use (FBJS[2?] on Facebook, Valija on YAP). AFAICT, WebSandbox itself isn't yet. This isn't quite fair as we intend to borrow many of their good ideas.
We've been using Cajita enough internally at the Caja project, and it is similar enough in flavor to E, that I am confident it is usable.
I didn't mean to pick on Caja. It does seem the Web Sandbox approach,
FBJS2, etc. is more likely to be usable, all else equal.
Yes, the scale of these is infinitesimal compared to historic ES use. Based on the record to date, it is clear that full ES has not been usable as a secure language.
The question of a programming language being secure is interesting,
but it may miss the target given security properties being end-to-end
system properties. A language with better properties than ES may be
necessary, but not sufficient, so stripping ES down to something
secure may not improve practical security, and could impair usability
by comparison to alternative approaches.
We need better security by default in JS+HTML+DOM+CSS+SVG+XSLT+... --
we can't throw the system out, but we can evolve it with some
compatibility breaks.
This is a big topic, I'll save it for another day.
And in any case, you started this subthread by asking "What do other secure subsetters say?" SES is a subset. WebSandbox, Valija, and FBJS2 ideally aren't -- they are a virtualization/emulation of the "whole" language (given a suitable definition of "whole" -- Valija will do ES5-strict).
Yes, sorry for using subsetters when I meant to summon sandboxers.
Catchalls are for various use cases, mainly the virtualization one used
for sandboxing and emulating.
These whole language emulations are not "secure", they are only sandboxed. Since the language being emulated is not secure, a faithful emulation of that language must preserve its insecurities.
Or solve them otherwise.
You assume people have to write in what you call a secure subset or
live with insecurities in a sandbox that confines threats. I don't
think that's the only way to solve the whole-system problem, i.e., to
enforce security properties like integrity and secrecy against mixed
trust threats.
Especially since, as noted above, security properties need to be
enforced in the system, not just the language. Combine this concern
with usability issues in the subsets (starting with the need to
relearn and rewrite), and the suspicion grows that an end-to-end and
system-wide approach might yield both better security property
enforcement and greater usability.
But again this is a bigger topic than catchalls and I will shut up for
now.
Both the Cajita-like and Valija-like levels are important. For the Cajita-like level, the remaining pressing issue beyond ES5 is the whitelisting issue David-Sarah raises.
That is going to be a struggle. Whitelisting in the VM might be useful
but there are many ways to do it, with different performance vs.
expressiveness tradeoffs. Separate thread? Restart the one DSH posted
in if need be.
For the Valija-like level, I think the most important enabler would be some kind of hermetic eval or spawn primitive for making a new global context (global object and set of primordials) whose connection to the world outside itself is under control of its spawner.
Mozilla has had such a thing (Components.utils.evalInSandbox) for a
while and it's indispensable. So yeah, let's standardize a solution
everyone can use.
With such a primitive, we would no longer need to emulate inheritance and mutable globals per sandbox.
Who "we"? People writing or adapting web JS might. There's no free
lunch but there might be a lower rewrite tax. Or no tax in the JS
language level, more in policy -- site-specific content restrictions
and policy declarations.
Catchalls are likely to be a relevant enabler at both the Cajita-like and Valija-like levels. We can discuss both simultaneously if you like, but I suggest we would have a clearer discussion if we take these mostly in bottom up order.
Your taxonomy is not shared by everyone, and it contains assumptions
I've questioned. I'd rather start from real-world use-cases wherever
they may be.
On Wed, May 6, 2009 at 12:55 PM, Brendan Eich <brendan at mozilla.com> wrote:
Your taxonomy is not shared by everyone, and it contains assumptions I've questioned.
Of course. I'm trying to open discussion of these, not end discussion. Please question away.
I'd rather start from real-world use-cases wherever they may be.
Caja on YAP is a deployed real world use case of Valija layered on Cajita.
On May 6, 2009, at 1:14 PM, Mark S. Miller wrote:
I'd rather start from real-world use-cases wherever they may be.
Caja on YAP is a deployed real world use case of Valija layered on
Cajita.
Cool, you know more about it than I do -- what does it want from
catchalls that's not there? What is proposed that it does not want?
On Wed, May 6, 2009 at 12:55 PM, Brendan Eich <brendan at mozilla.com> wrote:
[...] This is a big topic, I'll save it for another day. [...] But again this is a bigger topic than catchalls and I will shut up for now.
Language design is, among many other things, a form of search in a large combinatorial space where we can only afford to evaluate a small fraction of states in any detail. One of the best heuristics when doing backtracking search is "make the most constrained choices first".
So please don't shut up for now. IMO, these security issues are the most important we face. But even for those who don't share my sense of relative importance, security issues are also the most constraining, and so should be addressed early.
Subject: Re: Catchall proposal
David-Sarah Hopwood wrote:
<snip>
To flesh this out a bit more, I propose the following handlers:
  has(id)             hasMissing(id)
  get(id)             getMissing(id)
  set(id, val)        setMissing(id, val)
  invoke(id, args)    invokeMissing(id, args)
  delete(id)
  call(args)
  new(args)
Would it be worth splitting the above into two separate proposals?
Proposal #1) Hooks for only missing actions.
Proposal #2) Pre/Post Hooks for existing actions.
I would expect that the goals for each of the two proposals would be slightly different.
The focus of #1 would be to enhance the run-time metaprogramming facilities
The focus of #2 would be to enhance the security/correctness (visibility, pre/post conditions, design by contract) of programs and support aspect-oriented programming.
That way, one doesn't get mired by the other ;)
Any thoughts? (i.e. are these two proposals far too intertwined to separate them)
thanks, Faisal Vali Radiation Oncology Loyola
Er, in my previous post, please replace 'actions' with 'properties'
Proposal #1) Hooks for only missing actions.
Proposal #2) Pre/Post Hooks for existing actions.
sorry about that, Faisal Vali Radiation Oncology Loyola
Waldemar gave feedback
Waldemar's main point is a good one, which caused me to be hostile to
catchalls for years:
Catchalls climb the meta ladder. Primitive actions such as checking
for the existence of a property when climbing the prototype chain can
now run arbitrary code. This is unacceptable in some situations and
makes the ES5 meta-object methods non-primitive, so we will need yet
another level of such methods. It is likely to open another area for
security attacks to hide in.
He also makes a good point about for-in, which I'll address in the
wiki with a link to a strawman:iterators page. It could be combined
with catchalls but I'd like to separate concerns for now.
I was planning to update the catchalls proposal to do what David-Sarah
Hopwood suggested: brute-force split of set/setMissing, etc. for all
the hooks. But it would be helpful to discuss a bit more here, then go
back to the wiki. So thanks for posting.
To clarify: you are proposing two separate proposals, presumably
meaning two API methods of Object -- defineCatchAll for missing
properties and defineAdvice or some such for pre- and post- hooks. Is
that right?
AOP, sigh: "before" and "after" advice for existing properties could
be defined as a normative (i.e., mandatory to implement) part of the
core language, but the idea with this proposal was to let the
catchalls-using library ecosystem invent such things. Dojo, jQuery,
etc. already have AOP support.
Catchalls are meant to be the simplest step up the "meta ladder" that
the standard language needs to take, for the rest of the world to
build fancier MOPs and AOP systems as they see fit.
In this light could we avoid Object.defineAdvice and provide
Object.defineCatchAll hooks for existing properties, with a way to
trigger default action that allows the "after" advice to be run?
Instead of throwing a DefaultAction singleton exception, this would
seem to want a built-in method to invoke to request the default
action, passing a callback for the "after" hook. Comments?
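
One way to picture that alternative is the sketch below. Everything in it is hypothetical: getWithAdvice stands in for the runtime, and the defaultAction function passed to the handler is a guess at what "a built-in method to invoke" could look like.

```javascript
// Stand-in for the runtime: call the 'get' catchall for an existing
// property and hand it a function that performs the default action.
function getWithAdvice(obj, id, handler) {
  const defaultAction = (after) => {
    const value = obj[id];               // the language's ordinary lookup
    return after ? after(value) : value; // "after" advice sees the result
  };
  return handler.call(obj, id, defaultAction);
}

const calls = [];
const point = { x: 3 };

// A tracing handler: "before" advice, default action, then "after" advice.
const traced = function (id, defaultAction) {
  calls.push("before " + id);
  return defaultAction((value) => {
    calls.push("after " + id + " = " + value);
    return value;
  });
};

const result = getWithAdvice(point, "x", traced);
```

Because the default action is requested by a call rather than a throw, the handler keeps control afterwards, which is what lets the "after" advice run without a separate Object.defineAdvice.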
Brendan Eich wrote: [...]
Catchalls are sometimes thought of as being called for every
access to a property, whether the property exists or not.
The ability to define 'has', 'get', and 'invoke' handlers for the case where the property exists, is definitely needed IMHO. Suppressing the visibility of a property is potentially as useful as creating a "virtual" property, especially for the secure JS subsets.
I agree however that this is only needed for some use cases, and for those cases in which it is not needed, it would be inconvenient (and less efficient) to require has/get/invoke handlers to perform the default action. Defining 'has' and 'hasMissing', 'get' and 'getMissing', etc. appears to solve this problem, and I think that the extra complexity is justified.
(Defining both the "always" and "missing" versions of a handler in the descriptor is not useful, and could be an error.)
Defaulting: sometimes a catchall wants to defer to the default
action specified by the language’s semantics, e.g. delegate to a
prototype object for a get. The ES4 proposal, inspired by Python
and ES4/JS1.7+ iteration protocol design, provided a singleton
exception object, denoted by a constant binding, DefaultAction,
for the catchall to throw. This can be efficiently implemented
and it does not preempt the return value.
This means that "throw e", where e might have come from an unknown source, has to be avoided in a handler in favour of something like "throw e === DefaultAction ? new Error() : e". Yuck.
Runaway prevention: should a catchall, while its call is active,
be automatically suppressed from re-entering itself for the given
id on the target object?
I think that all catchalls on a given object O, not just those for the same id, should be suppressed when handling a catchall for O. If you want the behaviour that would occur as a result of triggering a catchall for another property, then it is easy to inline that behaviour in the handler. But if you want to suppress the catchall behaviour for another property while in a handler, then it would be difficult to do so under the semantics suggested above.
(Also the per-object suppression is easier to specify; it just requires an [[InCatchall]] flag on each object.)