"use subset" introductory material
[+es4-discuss]
On Mon, Jun 30, 2008 at 9:18 PM, Allen Wirfs-Brock <Allen.Wirfs-Brock at microsoft.com> wrote:
Mark: To the degree you are agreeing with Maciej about interoperability concerns, I've already addressed that. I'm trying to trade-off your safety objective against compatibility issues and what I see as an overall lack of consensus about what is required or alternatively tolerated in a "strict mode". Making enforcement optional seems to increase the possible space of compromise.
Personally, I would be perfectly happy to require support for "use subset cautious" but I'm concerned that it could make it harder to get the consensus we need.
Making the enforcement of "cautious" optional makes it almost useless for security. And I don't think we're that far from consensus. The ES4 folks have not wavered in their intention that ES4 strict be enforced. ES3.1 cautious is almost identical to the intersection of ES3.1 and ES4 strict. The only remaining disagreements I know of that we have not been able to resolve are
- how one indicates that a strict function is of variable arity.
- whether "with" is allowed.
Given that ES4 strict is an enforced subset, I don't see a huge burden in also enforcing cautious.
Btw, my preference would still be to resolve the above two bullet points, so that cautious can be a proper subset of ES4 strict. But even if we do, this won't be the last disagreement about the nature of strict subsets. At the last F2F, Brendan and Lars raised the possibility of future "use stricter" directives of some sort, for future EcmaScripts. Allen's "use subset X" notion accommodates such possibilities naturally.
On Jun 30, 2008, at 9:08 PM, Mark S. Miller wrote:
[+es4-discuss]
On Mon, Jun 30, 2008 at 7:37 PM, Maciej Stachowiak <mjs at apple.com> wrote:
JSON be handled with a generic subset mechanism? I expect not,
since a pragma inside the JSON source in the form of an initial quoted
string would be (a) invalid JSON and (b) ineffective as a way to validate
incoming JSON, since malicious alleged JSON would not use such a pragma.
Whether or not it's a good idea, given "use subset JSON" as a recognized/enforced subset directive, one could trivially implement JSON.stringify(str) in terms of
eval('"use subset JSON"; (' + str + ')')
That's nothing like how JSON parsing is implemented in Mozilla. If
the idea is to add a mode to the ES parser, then I'm worried about
missed exclusion tests and false economies in a hacked up JS parser
trying to serve two (or more) masters.
JSON is defined by www.ietf.org/rfc/rfc4627.txt -- not by
ES1-3 or any future spec. It's a "subset" of Python and other
languages -- it's more accurately its own language. It's better off
with its own parser implementation, unit tests, etc. -- browsers want
this for application/json handling anyway (no pragma or restrictive
API mode required).
Given that JSON.stringify is a proposed extension in ES3.1 (and was
slated to be in ES4, after we rejected the old json.org API), why
does the above trivial (except for possibly non-trivial risks in
subsetting a real JS parser) re-implementation via eval matter?
I do think JSON should be supported natively, but it does not seem
at all analogous to strict mode / cautious subset.
I think I agree. In any case, I agree that JSON is not by itself a compelling case for "use subset X". My point is only that JSON is a huge counter-example to Brendan's statement that "profiled (subsetted) standards are meaningless to harmful on the web".
JSON is not huge, and that's one point in favor of keeping it
separate from ES futures. It is not defined as a subset in any ES spec.
It's also not an intentional, new-in-the-last-month, paper-spec-only
subset of JavaScript -- it is a subset after the fact. As Doug has
written, he "discovered" it. Inventing new, multiple, as-yet-unused
subsets for ES3.1 -- and not implementing any of them in any
experimental-to-beta released browser, especially not in IE8 -- is a
bad idea. It will cause general and widespread opposition to any
attempt to standardize such an ES3.1 this year.
At least OOXML and E4X (to name two Ecma standards of mixed repute)
each had one implementation -- however buggy or deficient some have
argued those specs and their single implementations were. ES3.1 has
none, not even a buggy work-in-progress reference implementation.
My point is to recall the original "ES3 + reality" anti-mission-creep
goal for ES3, which you among others espoused. Right now it's on a
road to completion at the same time frame as a cut-down ES4, which
will make for a busy 2009 -- assuming its supporters actually
demonstrate it in several testable, interoperating implementations.
JSON was defined as an enforced subset of JavaScript, and it has been extraordinarily helpful to the web.
Except where people used JS parsers naively. Which is one variation
on a theme that you are still playing in advocating "use subset JSON".
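
To make the hazard of naive eval concrete, here is a minimal sketch that is not from the thread. The helper name naiveParse is hypothetical, and the "use subset JSON" directive is left out because it was only a proposal. JSON.parse is the native parser as it was later standardized; it stands in here for a dedicated JSON parser of the kind argued for above.

```javascript
// Hypothetical helper: the eval-based approach quoted above, without the
// proposed "use subset JSON" directive.
function naiveParse(str) {
  return eval('(' + str + ')');
}

// Not valid JSON: unquoted key and an arithmetic expression.
const input = '{a: 1 + 1}';

// The JS parser evaluates it as an ordinary object literal.
const evaluated = naiveParse(input);

// A dedicated JSON parser rejects the same text.
let rejected = false;
try {
  JSON.parse(input);
} catch (e) {
  rejected = e instanceof SyntaxError;
}

console.log(evaluated.a, rejected);
```

The same input is accepted by one path and refused by the other, which is the gap a naive eval-based consumer leaves open.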
On Jun 30, 2008, at 9:18 PM, Allen Wirfs-Brock wrote:
This is an interesting exercise because I'm trying to find a path
that is tolerable for a number of different individuals who seem to
have different and sometimes seemingly contradictory goals:
That is a warning sign. Adding more switches won't make everyone
happy, and it will make a nightmare of a testing matrix, within a
single implementation and among implementations.
Brendan: I don't see these subsets as "profiles", if a profile is a
subset of the language that an implementation is allowed to limit
itself to supporting.
That's my point -- profiled standards may pretend that
implementations can pick and choose among profiles to support, but on
the web, winner-take-all network effects force every implementation
to agree on the full standard. Users choosing subsets from full
implementations do not simplify the implementation space, and too
many subsets make a mess of the spec and its implementation.
What users -- as opposed to es3.x-discuss participants who may have
different and seemingly contradictory goals -- have asked for these
subsets?
Maciej: The conditions for violating the "cautious" subset will be part of
the 3.1 specification. I actually don't understand your
interoperability concerns.
The complexity of multiple subset modes, or whatever they are, will
increase the odds of interoperation bugs in the wild.
There is only one language, the full language since every valid
"cautious" subset program (or any other subset, for that matter)
must execute identically in the full language (except for possibly
fewer error exceptions). The "use subset" directive is a statement
of voluntary self constraint on the part of the programmer. Of
course, there is no particular reason for an implementation to
accept the programmer's assertion regarding their self constraint.
So the directive also acts as authorization from the programmer to
the implementation that it may reject or restrict a program unit
according to a subset definition. A wise programmer will test any
such constrained program on an implementation that actually
enforces the subset limitations. They would also want to test it
without the limitations.
There goes the QA budget :-/.
This is a recipe for testing costs driving everyone away from all but
one mode, probably the one that's web-compatible.
Regarding whether or not ES4 is in agreement...Before anybody can
agree or not to anything, there first has to be a proposal on the
table to consider. That's what I'm trying to put on the table.
How many proposed subsets or modes are there?
Mark: To the degree you are agreeing with Maciej about
interoperability concerns, I've already addressed that. I'm trying
to trade-off your safety objective against compatibility issues and
what I see as an overall lack of consensus about what is required or
alternatively tolerated in a "strict mode". Making enforcement
optional seems to increase the possible space of compromise.
Personally, I would be perfectly happy to require support for "use
subset cautious" but I'm concerned that it could make it harder to
get the consensus we need.
These statements are grounds for leaving out any subset modes from a
standard that you hope to finish this calendar year.
On Mon, Jun 30, 2008 at 11:49 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Jun 30, 2008, at 9:18 PM, Allen Wirfs-Brock wrote:
This is an interesting exercise because I'm trying to find a path that is tolerable for a number of different individuals who seem to have different and sometimes seemingly contradictory goals:
That is a warning sign. Adding more switches won't make everyone happy, and it will make a nightmare of a testing matrix, within a single implementation and among implementations.
I agree with this. While there are many ideas for possible subsets floating around (Caja, Cajita, ADsafe, Jacaranda, Capsol), I think it is really important to reduce the number of standardized subsets down to the absolute minimum we can agree on. Within the ES3.1/ES4 timeframe, the subsets we're talking about standardizing are exactly the same as those we have been discussing. We are only proposing a trivial renaming to clarify the distinctions we've all already been discussing, and to leave room open for the future definition of other subsets.
If the ES4 and ES3.1 folks can agree on their respective meanings of the subset named "strict", such that
- ES3.1 < ES4
- ES3.1 strict < ES3.1
- ES4 strict < ES4
- ES3.1 strict < ES4 strict
then we'd only need the two subset names needed for these distinctions. Even then, making the syntax change from
"use strict";
to
"use subset strict";
still seems like a good idea to me. At the last F2F, Lars and Brendan raised the possibility that they might like to define stricter modes in successors of ES4. The notion of named subsets would seem to accommodate that. Brendan & Lars, in the absence of named subsets, how would you introduce such stricter modes in that future? What would you propose?
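
As an aside on why a quoted-string directive is attractive at all: a string literal used as a statement is already legal and has no effect, so code carrying a directive still runs where the directive is not recognized. A minimal sketch, assuming an engine that implements "use strict" as it was eventually standardized in ES5 (which postdates this thread). "use subset strict" is the spelling proposed above and was never standardized. The variable names are illustrative.

```javascript
// The Function constructor is used so each body's mode depends only on its
// own directive, not on how the enclosing file is loaded.
const withStrict = new Function('"use strict"; return this === undefined;');
const withSubset = new Function('"use subset strict"; return this === undefined;');

// Recognized directive: `this` is undefined in a plain call.
const strictSeesUndefined = withStrict();

// Unrecognized directive: parsed as an ordinary string expression statement,
// so the body runs as non-strict code and `this` is the global object.
const subsetSeesUndefined = withSubset();

console.log(strictSeesUndefined, subsetSeesUndefined);
```

Either spelling degrades the same way on an engine that does not know it, so the choice between them is about readability and future naming rather than compatibility.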
Of the above desired subset relationships, the problematic one is the last bullet above. The outstanding disagreements are variable arity strict functions and "with". Hopefully we can resolve these in Oslo. In the meantime, to avoid stepping on each other's toes, we've renamed "ES3.1 strict" to "cautious". That's all that's going on. I'm surprised there's such a fuss.
Brendan: I don't see these subsets as "profiles", if a profile is a subset of the language that an implementation is allowed to limit itself to supporting.
That's my point -- profiled standards may pretend that implementations can pick and choose among profiles to support, but on the web, winner-take-all network effects force every implementation to agree on the full standard.
Aren't you, Crock, and the rest of us vigorously agreeing? The point of these subsets isn't "profiles". It is the benefits that follow from knowing (or enforcing) that the execution of particular program units is constrained to be within useful subsets of the language. The motivation should be clear: The language as a whole is a mess. However, various subsets of the language, like ES4 strict for example, have much better properties. For code constrained to execute according to one of these subsets, the semantics of this code can often be explained using a simpler theory than needed to explain the semantics of the full unconstrained language. For example, scoping in ES4 strict without "with" could be explained in terms of static scope analysis without the concept of a dynamic scope chain. Simpler formalism enables better metaprogramming tools.
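
The scoping point can be sketched as follows. This is an illustration, not code from the thread: lookup is a hypothetical name, and the rejection shown at the end relies on strict mode as later standardized in ES5 rather than on the ES4 strict or cautious proposals discussed here.

```javascript
// Built with the Function constructor so the body is non-strict code,
// where `with` is permitted regardless of how this file is loaded.
const lookup = new Function('obj', 'x', 'with (obj) { return x; }');

// Which `x` is returned cannot be decided by reading the source alone:
// it depends on the properties of `obj` at run time.
const fromParam = lookup({}, 'param');                // obj has no x
const fromObject = lookup({ x: 'object' }, 'param');  // obj.x shadows the parameter

// Under a strict directive the same construct is rejected when parsed.
let withRejected = false;
try {
  new Function('obj', '"use strict"; with (obj) {}');
} catch (e) {
  withRejected = e instanceof SyntaxError;
}

console.log(fromParam, fromObject, withRejected);
```

Without "with", every identifier in the body can be bound to a declaration by inspecting the source text, which is what a static scope analysis needs.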
Users choosing subsets from full implementations do not simplify the implementation space,
Agreed. That's not the point.
and too many subsets make a mess of the spec and its implementation.
Agreed. Let's continue to try to reconcile "strict" and "cautious". And let's avoid adding any more choices to the menu in the ES3.1/ES4 timeframe.
What users -- as opposed to es3.x-discuss participants who may have different and seemingly contradictory goals -- have asked for these subsets?
To my ears, I've been quite amazed at how convergent our thinking has been on these matters. What disagreements and contradictions do you see among the es3.x-discuss participants?
The complexity of multiple subset modes, or whatever they are, will increase the odds of interoperation bugs in the wild.
Agreed.
There goes the QA budget :-/.
This is a recipe for testing costs driving everyone away from all but one mode, probably the one that's web-compatible.
Anyone interested in software quality is horrified by the current state of JavaScript, and should welcome the constraints enforced by a strict or cautious subset.
Regarding whether or not ES4 is in agreement...Before anybody can agree or not to anything, there first has to be a proposal on the table to consider. That's what I'm trying to put on the table.
How many proposed subsets or modes are there?
Until/unless we resolve the strict vs cautious incompatibilities, there's:
- ES3.1 < ES4
- ES3.1 cautious < ES3.1
- ES4 strict < ES4
- ES4 cautious < ES4
- ES3.1 cautious < ES4 cautious
If we do resolve these incompatibilities, then it reduces to the earlier four-bullet list above.
But for the outer parens and the subset declaration itself, JSON is in fact a subset of all the above languages. Whether the "use subset X" directive is able to express this or not, I don't really care. But it remains a vivid example of the utility of knowing that code is constrained to a subset of the language with a simpler and less dangerous semantics.
On Jul 1, 2008, at 10:55 AM, Mark S. Miller wrote:
If the ES4 and ES3.1 folks can agree on their respective meanings of the subset named "strict", such that
- ES3.1 < ES4
- ES3.1 strict < ES3.1
- ES4 strict < ES4
- ES3.1 strict < ES4 strict
then we'd only need the two subset names needed for these distinctions. Even then, making the syntax change from
"use strict";
to
"use subset strict";
still seems like a good idea to me.
I'm not in favor of uniformity over usability, especially if for
whatever reason we do not add subsets to future standards. 'use
strict' is short and sweet, familiar from Perl and other languages,
and IMHO much better than the pedantic (sorry, but it fits :-P) 'use
subset strict'.
If we do standardize more subsets, we can cross the longer bridge then.
This is a small point, but it highlights tension between usability
("read" as well as "write", I claim) and "completeness" (for want of
a better word). Anyway the ES4 pragma syntax prefers one-word terms,
although there have been ad-hoc two-word forms.
At the last F2F, Lars and Brendan raised the possibility that they might like to define stricter modes in successors of ES4. The notion of named subsets would seem to accommodate that. Brendan & Lars, in the absence of named subsets, how would you introduce such stricter modes in that future? What would you propose?
Mainly what I recall we talked about at that meeting was simply the
idea that (ESn+1 strict could be < ESn strict) -- that one could make
strict mode stricter over time. Adding 'use stricter' was not a
serious proposal, IIRC.
I recall also we did not want strict mode to be part of the
MIME-typed version parameter, or the version to be selectable by a pragma.
So the only issue was upgrading from ESn to ESn+1 and facing a
stricter strict mode. Standard mode would be highly backward
compatible, with unpredictable and slow obsolescence leading to some
amount of deadwood shedding over time -- maybe.
Of the above desired subset relationships, the problematic one is the last bullet above.
To wit:
- ES3.1 strict < ES4 strict
Agreed.
The outstanding disagreements are variable arity strict functions and "with". Hopefully we can resolve these in Oslo. In the meantime, to avoid stepping on each other's toes, we've renamed "ES3.1 strict" to "cautious". That's all that's going on. I'm surprised there's such a fuss.
Probably you are right, and there's too much fuss. On the other hand,
we now have a straw "cautious" mode to argue about. It would be
better to get agreement on strict before expending the effort and
seeming to up the "subset violation" ante (not entirely a matter of
perception; every documented and jargonized change tends to take on a
life of its own, and fight for survival).
That's my point -- profiled standards may pretend that implementations can pick and choose among profiles to support, but on the web, winner-take-all network effects force every implementation to agree on the full standard.
Aren't you, Crock, and the rest of us vigorously agreeing?
I am pretty sure we agree on a lot -- more than you might think ;-).
But let's be clear on what I am arguing against: multiple subsets
taking precious standards-making bandwidth, adding to the complexity
(even temporarily) of the work within ES3.1 and ES4, and of course
between them.
Yes, subsets are useful -- so useful there are a great many out
there. No, we should not try to standardize them all, or invent
overlong syntax with which to select them, if we are only talking
about strict mode.
Users choosing subsets from full implementations do not simplify the implementation space,
Agreed. That's not the point.
But that's my point, one of them: the specs are in large part for the
benefit of implementors, to make interoperation a more likely
outcome. Since web browser implementations have to handle the whole
language (and then some: quirks and deprecated features may remain
for a long while), the spec should not overreach for subsets at the
expense of clarity, compatibility, and completeness.
Complexity due to unnecessary subset additions, or mooting, or future-proofing, is therefore not welcome, all else equal -- IMHO.
and too many subsets make a mess of the spec and its implementation.
Agreed. Let's continue to try to reconcile "strict" and "cautious". And let's avoid adding any more choices to the menu in the ES3.1/ES4 timeframe.
I agree completely -- good to read this.
What users -- as opposed to es3.x-discuss participants who may have different and seemingly contradictory goals -- have asked for these subsets?
To my ears, I've been quite amazed at how convergent our thinking has been on these matters. What disagreements and contradictions do you see among the es3.x-discuss participants?
I was quoting Allen there. He wrote (two messages up on this thread):
"This is an interesting exercise because I'm trying to find a path
that is tolerable for a number of different individuals who seem to
have different and sometimes seemingly contradictory goals: ...." I
didn't think he was writing only about ES4 participants. Among ES3
participants, is there strong agreement on disabling automatic
semicolon insertion in 3.1's strict mode?
This is a recipe for testing costs driving everyone away from all but one mode, probably the one that's web-compatible.
Anyone interested in software quality is horrified by the current state of JavaScript, and should welcome the constraints enforced by a strict or cautious subset.
I hope we can standardize a strict mode that increases software
quality. That's why I'm concerned, along with Lars, about strict mode
ruling out syntax (such as automatic semicolon insertion) that much
JS on the web relies on without noticeable quality problems. We want
a strict mode that will be both useful and usable at low incremental
('use strict' per block or script) cost.
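
For readers unfamiliar with the feature in question, a small sketch of automatic semicolon insertion (names are illustrative, not from the thread). The first part is the kind of everyday reliance described above; the second is the commonly cited case where insertion changes the meaning of the code.

```javascript
// Everyday reliance: each newline ends a statement without an explicit ";".
const a = 1
const b = 2
const sum = a + b

// Commonly cited hazard: a semicolon is inserted right after `return`,
// so the function returns undefined and `42` is never evaluated.
function hazard() {
  return
  42
}
const result = hazard()

console.log(sum, result)
```

A strict mode that disabled insertion entirely would reject both halves, while the argument above is that only cases like the second are worth catching.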
BTW, we've had arguments in the past with Ajax library authors, who
write high quality code, about things like Firefox 1's "deprecated
with" warnings. Not everyone agrees with your idea of what is
horrible, and observed software quality problems do not correlate so
clearly to all those 3.1-unstrict bêtes noires.