[whatwg] Cryptographically strong random numbers
Yes, we aspire to standardize a good RBG as an "upgrade" to Math.random -- obviously a new name would be needed, but with docs and evangelization we would hope to steer people away from that old p.o.s. I copied from Java in 1995 (quote unquote).
See strawman:random-er for the mostly-aspirational page. As Boris says, this is a core language strawman proposal, not something we want browser-only (think nodejs.org).
[+benl, +shabsi, +frantz, +daw]
On Sun, Feb 13, 2011 at 6:37 PM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
On 2/13/11 8:22 PM, Adam Barth wrote:
It seems likely that window.crypto will continue to grow more quality cryptographic APIs, not all of which will be appropriate at the ECMAScript level.
Sure; the question is whether this particular API would be more appropriate at the language level. Or more to the point, if the language plans to grow it anyway, do we need two APIs for it?
It's worth at least checking with the ES folks whether they plan to add a API like this (something that fills in an array of bytes with cryptographically strong random values) in any sort of short-term timeframe.
Thanks for checking. The answer is yes. I'm scheduled to start a discussion of strawman:random-er at either the upcoming March or May meeting. Currently random-er is on the agenda for May but I may swap it into March. As you can tell, this page is currently only a placeholder.
I have also talked just a bit with Shabsi Walfish, Ben Laurie, David Wagner, and Bill Frantz, all cc'ed, about the possibility of a real crypto API for EcmaScript. With the sole exception of randomness, I believe that we should handle this the same way we're handling i18n -- as a separate working group within tc39 (the EcmaScript committee) working on a separate standard library in a separate standards document. The reason to make an exception for random-er is that it's the only fundamental omission. Given a decent random-er, everything else can be done initially in JS.
On Mon, Feb 14, 2011 at 2:47 AM, Adam Barth <w3c at adambarth.com> wrote:
That's a pretty long time horizon. You're going to start discussing it in 2-4 months? That seems a bit overwrought for what amounts to four lines of code.
The committee meets once every two months. Between meetings, we discuss things on es-discuss, so please do.
What are the four lines of code?
On 2/14/11 11:31 AM, Mark S. Miller wrote:
On Mon, Feb 14, 2011 at 2:47 AM, Adam Barth <w3c at adambarth.com> wrote:
That's a pretty long time horizon. You're going to start discussing it in 2-4 months? That seems a bit overwrought for what amounts to four lines of code.
For what it's worth, it's a lot more than 4 lines of code in Gecko, because we don't have an existing source of strong randomness in Spidermonkey. It'd be about that much code in the DOM, where we could pass the buck to NSS, but for Spidermonkey it would take a good bit more work.
On Feb 14, 2011, at 8:40 AM, Boris Zbarsky wrote:
On 2/14/11 11:31 AM, Mark S. Miller wrote:
On Mon, Feb 14, 2011 at 2:47 AM, Adam Barth <w3c at adambarth.com> wrote:
That's a pretty long time horizon. You're going to start discussing it in 2-4 months? That seems a bit overwrought for what amounts to four lines of code.
For what it's worth, it's a lot more than 4 lines of code in Gecko, because we don't have an existing source of strong randomness in Spidermonkey. It'd be about that much code in the DOM, where we could pass the buck to NSS, but for Spidermonkey it would take a good bit more work.
Not really. We use callback-style APIs to delegate such chores as l10n to the embedding, and the same can be done for randomness-hunting. It's just interfacing, or buck-passing -- but I'm still wondering what magical four lines of code provide useful lower bounds on bits of entropy in WebKit. Is this just non-interoperable delegation to the host OS?
Adam, I like moving fast (but beware, that's how JS and the DOM were created), but we need a cross-browser spec of some kind, even if we all add crypto.getRandomValues quickly. The original y2k-buggy, cloned-from-java.util.Date JS Date object of 1995 required a lot of work for ES1 to remove OS dependencies that made for interop hell.
For a core language standard, we would want some kind of minimum quality guarantee on the randomness.
Quick thought on the name: to avoid vagueness and the "why not wider int or even float element type" questions that have already come up, call the method getRandomBytes.
On Feb 14, 2011, at 11:31 AM, Adam Barth wrote:
What's non-interoperable about filling an ArrayBuffer with random bytes? I'm not sure I understand your question.
The question is what OSes fail to provide enough random bits these days.
This may just be a sanity-checking step (my sanity, at least; I lived through the great entropy hunt of 1995; www.cs.berkeley.edu/~daw/papers/ddj-netscape.html [link courtesy dwagner]).
However, I'm disinclined to wait on the basic best-effort PRNG for that to happen.
What would you be waiting for? Ignoring Ecma, just adding code to WebKit doesn't make a cross-browser standard. Never mind Firefox (we'll do something soon enough to match). What about IE?
It seems to me we (whatwg members, w3c members; browser vendors in general) need something more than IDL in the way of a spec.
I added support for all the integer ArrayBuffer types, so getRandomBytes isn't a particularly accurate name.
Ok, that seems fine (now that I have read your patch -- thanks for the link!).
Everyone,
Before posting on this thread, please subscribe to es-discuss at <mail.mozilla.org/listinfo/es-discuss>. The es-discuss list drops posts by non-subscribers, and thus seems to have dropped posts by Adam Barth and Shabsi Walfish that were sent after es-discuss was added to the thread. Adam and Shabsi, could you please subscribe and re-post? Thanks.
As has already been discussed, since these issues are general EcmaScript issues, applying not just in the browser but in many places JavaScript is run (e.g., nodejs), I suggest that this be the last message on this thread cross-posted to whatwg. Unless someone has a reason for retaining whatwg in the addressee list, I will drop it from all my further postings on this thread.
Re-posting for es-discuss.
Re-posting for es-discuss.
On Mon, Feb 14, 2011 at 12:01 PM, Shabsi Walfish <shabsi at google.com> wrote:
On Mon, Feb 14, 2011 at 11:31 AM, Adam Barth <w3c at adambarth.com> wrote:
On Mon, Feb 14, 2011 at 8:31 AM, Mark S. Miller <erights at google.com>wrote:
On Mon, Feb 14, 2011 at 2:47 AM, Adam Barth <w3c at adambarth.com> wrote:
That's a pretty long time horizon. You're going to start discussing it in 2-4 months? That seems a bit overwrought for what amounts to four lines of code.
The committee meets once every two months. Between meetings, we discuss things on es-discuss, so please do.
What are the four lines of code?
trac.webkit.org/browser/trunk/Source/WebCore/page/Crypto.cpp
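For reference, here is a minimal sketch of how a page would call the API that patch implements (assuming the getRandomValues name and typed-array support as in the WebKit patch; illustrative, not a spec):

var buf = new Uint8Array(16);          // request 16 random bytes
window.crypto.getRandomValues(buf);    // filled in place with crypto-quality values
// buf now holds the random bytes, e.g. for building a session token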
On Mon, Feb 14, 2011 at 8:40 AM, Boris Zbarsky <bzbarsky at mit.edu> wrote:
On 2/14/11 11:31 AM, Mark S. Miller wrote:
On Mon, Feb 14, 2011 at 2:47 AM, Adam Barth <w3c at adambarth.com> wrote:
That's a pretty long time horizon. You're going to start discussing it in 2-4 months? That seems a bit overwrought for what amounts to four lines of code.
For what it's worth, it's a lot more than 4 lines of code in Gecko, because we don't have an existing source of strong randomness in Spidermonkey. It'd be about that much code in the DOM, where we could pass the buck to NSS, but for Spidermonkey it would take a good bit more work.
Similarly, the WebKit implementation just passes the buck to the WTF layer, which already knows how to fill a void* with n cryptographically random bytes. The implementation will be similarly trivial in most applications that link with OpenSSL or NSS or that run on Mac, Linux, or Windows.
On Mon, Feb 14, 2011 at 10:05 AM, Brendan Eich <brendan at mozilla.org> wrote:
On Feb 14, 2011, at 8:40 AM, Boris Zbarsky wrote:
On 2/14/11 11:31 AM, Mark S. Miller wrote:
On Mon, Feb 14, 2011 at 2:47 AM, Adam Barth <w3c at adambarth.com> wrote:
That's a pretty long time horizon. You're going to start discussing
it in 2-4 months? That seems a bit overwrought for what amounts to four lines of code.
For what it's worth, it's a lot more than 4 lines of code in Gecko, because we don't have an existing source of strong randomness in Spidermonkey. It'd be about that much code in the DOM, where we could pass the buck to NSS, but for Spidermonkey it would take a good bit more work.
Not really. We use callback-style APIs to delegate such chores as l10n to the embedding, and the same can be done for randomness-hunting. It's just interfacing, or buck-passing -- but I'm still wondering what magical four lines of code provide useful lower bounds on bits of entropy in WebKit. Is this just non-interoperable delegation to the host OS?
What's non-interoperable about filling an ArrayBuffer with random bytes? I'm not sure I understand your question.
Adam, I like moving fast (but beware, that's how JS and the DOM were
created), but we need a cross-browser spec of some kind, even if we all add crypto.getRandomValues quickly. The original y2k-buggy, cloned-from-java.util.Date JS Date object of 1995 required a lot of work for ES1 to remove OS dependencies that made for interop hell.
In this case, the API is several orders of magnitude simpler than Date. The choices to make roughly boil down to the name of the function, which types to support, and what sorts of exceptions to throw in which error cases.
For a core language standard, we would want some kind of minimum quality guarantee on the randomness.
I'm not sure that's possible without either allowing the API to block for arbitrary periods of time or to fail. Assuming we want a non-blocking, non-failing API, we can't make any guarantees about the quality of the randomness. In that sense, crypto.getRandomValues is "best effort."
I don't think it's necessary to settle for "best effort" if you want something that is effectively non-blocking. What you need to do is preemptively gather a small amount of real entropy on startup (say, 160 bits worth) and then use a cryptographic PRNG to generate whatever you need based on that seed. The PRNG would be pretty darn fast, so you wouldn't have to block ever after. Only the initial seeding needs to be done via a call to the OS that might block, and if that can be done asynchronously at startup time, odds are good it will be a non-issue on almost all platforms. Whatever you do, my advice is to combine entropy in the seed pool via XOR (assuming you do any entropy combining), since various combinations of append/prepend/etc. always...
Sorry, as David pointed out to me, this was not stated correctly. What I meant to say is: hash each entropy source you are combining down to 160 bits, then XOR the 160-bit outputs together in the pool. I think David may still disagree with this advice though, so I'll leave it to him to comment. ;)
Shabsi
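A sketch of the combining step Shabsi describes -- hash each source to a fixed width, then XOR the digests into the pool. Here sha160() is a hypothetical helper returning a 160-bit digest as a 20-element byte array, not a real API:

function combineEntropy(sources) {          // sources: array of byte arrays
  var pool = [], i, s, digest;
  for (i = 0; i < 20; i++) pool[i] = 0;     // 160-bit pool, initially zero
  for (s = 0; s < sources.length; s++) {
    digest = sha160(sources[s]);            // hypothetical hash-to-160-bits helper
    for (i = 0; i < 20; i++) pool[i] ^= digest[i];  // XOR each digest into the pool
  }
  return pool;                              // seed for a cryptographic PRNG
}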
On Feb 14, 2011, at 12:26 PM, Adam Barth wrote:
On Mon, Feb 14, 2011 at 11:56 AM, Brendan Eich <brendan at mozilla.org> wrote:
On Feb 14, 2011, at 11:31 AM, Adam Barth wrote:
What's non-interoperable about filling an ArrayBuffer with random bytes? I'm not sure I understand your question.
The question is what OSes fail to provide enough random bits these days.
This may just be a sanity-checking step (my sanity, at least; I lived through the great entropy hunt of 1995; www.cs.berkeley.edu/~daw/papers/ddj-netscape.html [link courtesy dwagner]).
Somehow OpenSSL and NSS seem to solve that problem given that cryptographic entropy is required to make HTTPS secure. I'm certainly willing to believe I've goofed it up, but I suspect it's not that much of a limitation these days.
I'm happy if the answer is "all OSes, mobile and desktop, provide enough high quality randomness". Looking for data if anyone has it.
However, I'm disinclined to wait on the basic best-effort PRNG for that to happen.
What would you be waiting for? Ignoring Ecma, just adding code to WebKit doesn't make a cross-browser standard. Never mind Firefox (we'll do something soon enough to match). What about IE?
Given that we've missed IE9, we're talking about IE10 at the earliest. If history is a guide (which it might not be), that means we're talking about something 2 years in the future.
Oh, who knows what any vendor will do under modern competitive pressure. Let's not assume :-/. My point is: to engage all the major browser vendors, we need a spec, not four-line (net of #ifdefs) patches.
Here is another spec-worthy issue that IDL can't capture (and if we choose one way, can't even express): the #ifdefs raise the question (which David Wagner and I discussed briefly in private correspondence, and which I see came up on the whatwg list already) of whether it is better to provide a throwing stub when the OS randomness configury is not enabled, or better not to provide an object-detectible method at all?
I'm inclined toward not providing a method that always throws. I've seen too much fool's gold mislead developers. Bugs and throwing stubs underlying detectible methods have led web developers to not just object-detect, but unit-test online! (See www.slideshare.net/jeresig/the-dom-is-a-mess-yahoo .) But testing detectible methods for correctness further bloats downloaded JS and slows first-run execution.
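A sketch of the detection pattern at issue, assuming the method is simply absent (rather than a throwing stub) when the OS randomness source is unavailable:

if (window.crypto && typeof window.crypto.getRandomValues === "function") {
  var buf = new Uint8Array(16);
  window.crypto.getRandomValues(buf);   // strong randomness available
} else {
  // no strong source on this platform; note that Math.random is NOT a
  // substitute -- fetch randomness from a server or disable the feature
}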
Ok. I'll write up a spec later today.
Thanks.
On Mon, Feb 14, 2011 at 12:15 PM, Mark S. Miller <erights at google.com> wrote:
As has already been discussed, since these issues are general EcmaScript issues, applying not just in the browser but in many places JavaScript is run (e.g., nodejs), I suggest that this be the last message on this thread cross-posted to whatwg. Unless someone has a reason for retaining whatwg in the addressee list, I will drop it from all my further postings on this thread.
Actually, that is precisely what we're discussing. My suggestion is that we do both: browsers implement a simple DOM-based API today that handles the basic arc4random-like use case, and TC39 goes through whatever multi-month (year?) process it likes and specs out whatever all-singing, all-dancing crypto library it likes.
There seems to be some spin here. What does "today" mean, and why the loaded "multi-month (year?) process" and "all-sing[ing]" etc. imputations to Ecma? I hope you are not describing how quickly you can hack on WebKit code, because while I can hack quickly on Mozilla code, that does not set the pace of a standard, never mind make a new feature available cross-browser to web developers.
While it indeed takes years to produce new Ecma (ISO) specs, we on TC39 support early prototyping of harmonious proposals, so web developers can start using such candidate features. But for this to work we need a hand-shake on what is harmonious.
If the idea is to promulgate a de-facto standard via Chrome and let other browsers reverse-engineer it, that can work, but it could backfire.
Extending the old window.crypto object we added ages ago at Netscape may be a good idea and a faster route to a high-quality cross-browser RBG. I won't argue about that. My objection here is to your choice of words and the way they might foreclose better cooperation among vendors: such an RBG API is not done "today", and it needs more than just one implementation to be a standard other browsers will implement.
While we're waiting for Adam to subscribe to es-discuss and repost his messages on this thread, this one seems worth pre-posting.
Changes needed for this to become an EcmaScript strawman:
- Replace references to ArrayBufferView with appropriate abstractions from <strawman:binary_data>.
- Replace WebIDL as a specification language with a JavaScript-based API spec.
- Get rid of the dependence on "window". Probably avoid introducing a new global "crypto" as well, though we can argue about that.
Are there any other lurking browser dependencies in Adam's spec that we need to scrub away?
On Mon, Feb 14, 2011 at 5:08 PM, Adam Barth <w3c at adambarth.com> wrote:
On Mon, Feb 14, 2011 at 12:49 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Feb 14, 2011, at 12:26 PM, Adam Barth wrote:
Ok. I'll write up a spec later today.
Thanks.
Done: wiki.whatwg.org/wiki/Crypto
Feedback appreciated.
If insufficient cryptographically random values are available,
getRandomValues does not alter array and throws a NOT_SUPPORTED_ERR
I'm not sure if this means "if you're using /dev/random and it would block, throw", or "if the amount of entropy in the PRNG's entropy pool is low, throw", but they both seem hard to deal with from scripts. There's no way to know when to try again, and most applications wanting secure PRNGs don't need this. Even ssh-keygen seems to simply use /dev/urandom without worrying about it returning low-entropy randomness.
I think it makes more sense to imply /dev/urandom's behavior: always return data, even if the entropy pool is low. If there's a need for randomness with that stronger guarantee of entropy, that seems like it would want an asynchronous API in order to wait for entropy (akin to /dev/random).
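If such an asynchronous variant were wanted later, one hypothetical shape (the name getRandomValuesAsync and the fresh-array callback argument are assumptions, not anything specified) might be:

// hypothetical: request n strong random bytes, waiting for entropy if needed
crypto.getRandomValuesAsync(32, function (bytes) {
  // bytes is a freshly allocated Uint8Array, filled only once enough
  // entropy was available -- no shared array is written to asynchronously
});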
It'd be nice if there were at least a way to explicitly detect whether you were getting "weaker" entropy... In Linux, for example, there is a /proc filesystem entry (/proc/sys/kernel/random/entropy_avail) that indicates how much entropy is available in the pool.
Shabsi
This depends on what you consider to be the basic use case. Generating long-lived cryptographic keys absolutely requires high quality entropy... if you are only generating short-lived authenticators (that are not used for encryption) then you could get away with weaker entropy. You will get the most mileage out of this feature if it can be used to generate encryption keys, or long-lived signing keys.
Shabsi
[adding Cameron McCormack, Web IDL editor, to the discussion]
On 14/02/2011 22:29, Adam Barth wrote:
On Mon, Feb 14, 2011 at 12:49 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Feb 14, 2011, at 12:26 PM, Adam Barth wrote:
On Mon, Feb 14, 2011 at 11:56 AM, Brendan Eich <brendan at mozilla.org> wrote:
Extending the old window.crypto object we added ages ago at Netscape may be a good idea and a faster route to a high-quality cross-browser RBG. I won't argue about that. My objection here is to your choice of words and the way they might foreclose better cooperation among vendors: such an RBG API is not done "today", and it needs more than just one implementation to be a standard other browsers will implement.
I chose my words poorly. What I was trying to communicate is that I'd love to see ECMAScript grow a beautiful crypto library that included lots of useful tools, including HMACs, stream ciphers, PRNGs, etc. TC39 is in a massively better position to do that work than the whatwg. However, at this point in time, adding a simple source of cryptographic random numbers to window.crypto is a much simpler proposition (e.g., on the order of one day of work).
This last point is what worried me and encouraged me to question what the best place to discuss the addition of a strong RNG was. (lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-February/030442.html)
The WHATWG can indeed start from a proposition which could be written in one day. The ECMAScript committee can also decide to grow a "beautiful crypto library". But when all of that has been standardized and implemented, web developers will have two libraries for generating strong random numbers. They need only one. The problem here is that the main use case overlaps both committees/working groups: "getting cryptographically strong random numbers".
This is actually the exact same situation that happened with concurrency. WHATWG came out with WebWorkers and ECMA TC-39 is working on Vats. Once again, overlapping use cases. Fortunately, the ECMA proposal (strawman:concurrency) says "We intend that WebWorkers can be implemented in terms of vats and vice versa." But if that weren't the case, web developers would have two different visions of concurrency (if Vats are standardized and implemented, they are going to have two tools anyway).
I think I agree with Brendan Eich's earlier idea (lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-February/030471.html): "It seems to me we (whatwg members, w3c members; browser vendors in general) need something more than IDL in the way of a spec." (I understand "IDL" as "the WebIDL spec" (www.w3.org/TR/WebIDL)) and I would even add the TC-39 committee.
I'm not against innovation or prototyping. I do not really care if standardization takes one day or ten years. However, as a web developer, I do not need several tools to solve the exact same use case. We've seen how hurtful the IE DOM (innerText vs. the W3C's textContent) and the IE event model (attachEvent, with no control over bubbling/capture, vs. the W3C's addEventListener) were to the web. Let's not make the same mistake!
Each time the same use case is solved twice, people write JavaScript libraries.
If the intent is to get ECMAScript implementations to quickly provide this function then I would suggest that it be specified only in terms of things that are already in ES5. That would preclude use of anything from the Harmony binary data strawman.
I don't particularly see why a "binary array" is needed in this situation. A regular ES5 array is perfectly capable of holding the numeric results. And why overwrite the elements of an existing array? Why not just create a new Array and use the argument to specify the desired length?
I don't understand the use cases that would justify the various integer length options for the random elements. One reasonable size seems fine. I would probably go with 16-bit units in recognition that many JavaScript implementations have optimized small integer sizes below 32 bits. If you want 8-bit or 32-bit values, assemble them yourself out of the 16-bit values.
Alternatively, if you are attached to pure binary data, the function could return a string value, as ECMAScript strings are really immutable vectors of 16-bit unsigned values.
In either case, I would attach the method to Array (or String) if we went down that path:
Array.randomValues = function randomValues(len) { // built-in
  // Return an Array of ToUint32(len) elements where each element is a
  // number in the range 0 to 65535.
  // The values of the elements must be generated using a strong crypto PRNG.
  // If ToUint32(len) random values are not available, throw a RangeError exception.
};

alternatively,

String.randomValues = function randomValues(len) { // built-in
  // Return a string value consisting of ToInteger(len) characters.
  // The character codes of the string must be generated using a strong crypto PRNG.
  // If ToInteger(len) is not a positive integer or if that many random values
  // are not available, throw a RangeError exception.
};
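Usage under either formulation would be a one-liner, with narrower or wider widths assembled by the caller (again using Allen's sketched names):

var words = Array.randomValues(8);               // 8 random 16-bit values
var s = String.randomValues(8);                  // or: 8 random character codes
var lowByte = words[0] & 0xFF;                   // take an 8-bit value
var u32 = ((words[0] << 16) | words[1]) >>> 0;   // assemble a 32-bit value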
On Mon, Feb 14, 2011 at 5:46 PM, Shabsi Walfish <shabsi at google.com> wrote:
This depends on what you consider to be the basic use case. Generating long-lived cryptographic keys absolutely requires high quality entropy... if you are only generating short-lived authenticators (that are not used for encryption) then you could get away with weaker entropy. You will get the most mileage out of this feature if it can be used to generate encryption keys, or long-lived signing keys.
OpenSSL gets randomness for generating keys by reading /dev/urandom. It doesn't seem to do any other tricks, like reading /proc/sys/kernel/random/entropy_avail. That at least suggests it's sufficient for securely generating keys, without more complex APIs like exposing the amount of entropy that was available.
OpenSSL is not exactly a reliable source of cryptographic best practices. :) In any case, see here linux.die.net/man/4/urandom :
When read, the /dev/random device will only return random bytes within the estimated number of bits of noise in the entropy pool. /dev/random should be suitable for uses that need very high quality randomness such as one-time pad or key generation. When the entropy pool is empty, reads from /dev/random will block until additional environmental noise is gathered.
A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current non-classified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.
To be honest, I'm surprised if even OpenSSL reads from /dev/urandom instead of /dev/random.
Shabsi
On 2/14/11 at 2:46 PM, shabsi at google.com (Shabsi Walfish) wrote:
This depends on what you consider to be the basic use case. Generating long-lived cryptographic keys absolutely requires high quality entropy... if you are only generating short-lived authenticators (that are not used for encryption) then you could get away with weaker entropy. You will get the most mileage out of this feature if it can be used to generate encryption keys, or long-lived signing keys.
[Greetings to old friends and new. And I hope I've properly subscribed to all the right lists...]
I have a big problem with concepts such as "using up entropy", and "high quality entropy is needed." I belong to the school that says, "If I have 160 unguessable bits and can keep them secret, I can stretch them and remain secure for ever." Now there are some issues with this statement:
- I don't really trust my stretching algorithms, based on secure hashes, to not leak some tiny fraction of the unguessable bits.
- Keeping any secret is difficult.
- Getting unguessable bits is a hard problem.
- 160 may be too small.
Because of these issues, modern secure random number generators add batches of new unguessable bits from time to time.
This line of thinking leads me to say that /dev/urandom, and its Windows cousin, is good enough for any practical use. Ensuring that the seed for /dev/urandom is indeed unguessable is a problem for the OS, not the language run-time.
For ECMAscript/Javascript or whatever this group calls the language, I suggest:
(1) Build a routine that returns /dev/urandom data to the caller. Implement this routine fairly quickly. (A sketch follows this list.)
(2) At a more leisurely pace, build a crypto API and implement it in the language. After the usability bugs are worked out of the API, standardize it. If more performance is needed, perhaps a platform dependent issue, build higher speed implementations of the standard.
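A minimal sketch of the routine in step (1), assuming the embedding exposes a native readUrandom(n) binding that returns n bytes from /dev/urandom or the platform equivalent (both the binding and its name are assumptions):

function getRandomBytes(n) {
  var bytes = readUrandom(n);     // hypothetical host binding to /dev/urandom
  if (!bytes || bytes.length !== n) {
    throw new RangeError("randomness source unavailable");
  }
  return bytes;
}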
[Historically, a number of crypto-based APIs have failed because developers could not figure out how to use them. Getting the usability right is probably the hardest part of designing the API.]
Cheers - Bill
Bill Frantz | "The nice thing about standards is there are so many to choose from." - Andrew Tannenbaum | Periwinkle, 16345 Englewood Ave, Los Gatos, CA 95032 | (408)356-8506 | www.pwpconsult.com
On Mon, Feb 14, 2011 at 3:49 PM, Bill Frantz <frantz at pwpconsult.com> wrote:
On 2/14/11 at 2:46 PM, shabsi at google.com (Shabsi Walfish) wrote:
This depends on what you consider to be the basic use case. Generating
long-lived cryptographic keys absolutely requires high quality entropy... if you are only generating short-lived authenticators (that are not used for encryption) then you could get away with weaker entropy. You will get the most mileage out of this feature if it can be used to generate encryption keys, or long-lived signing keys.
[Greetings to old friends and new. And I hope I've properly subscribed to all the right lists...]
I have a big problem with concepts such as "using up entropy", and "high quality entropy is needed." I belong to the school that says, "If I have 160 unguessable bits and can keep them secret, I can stretch them and remain secure for ever." Now there are some issues with this statement:
I don't disagree with this -- however, I would say the problem is that urandom may not return 160 "unguessable" bits to start with. Unguessable, of course, actually means "extremely difficult to predict with better than probability 1/2, conditioned on all previously observable information (such as approximate system time, process ids, etc.)". In any case, I was simply suggesting it would be nice to make available information about the quality of the entropy being returned by /dev/urandom, which is either going to be good (because the entropy pool was full) or possibly not good (perhaps because the entropy pool hasn't ever filled up yet).
Shabsi
On Feb 14, 2011, at 1:29 PM, Adam Barth wrote:
On Mon, Feb 14, 2011 at 12:49 PM, Brendan Eich <brendan at mozilla.org> wrote:
On Feb 14, 2011, at 12:26 PM, Adam Barth wrote:
On Mon, Feb 14, 2011 at 11:56 AM, Brendan Eich <brendan at mozilla.org> wrote:
On Feb 14, 2011, at 11:31 AM, Adam Barth wrote:
What's non-interoperable about filling an ArrayBuffer with random bytes? I'm not sure I understand your question.
The question is what OSes fail to provide enough random bits these days.
This may just be a sanity-checking step (my sanity, at least; I lived through the great entropy hunt of 1995; www.cs.berkeley.edu/~daw/papers/ddj-netscape.html [link courtesy dwagner]).
Somehow OpenSSL and NSS seem to solve that problem given that cryptographic entropy is required to make HTTPS secure. I'm certainly willing to believe I've goofed it up, but I suspect it's not that much of a limitation these days.
I'm happy if the answer is "all OSes, mobile and desktop, provide enough high quality randomness". Looking for data if anyone has it.
As far as I can tell, /dev/urandom works great on all modern Unix-derived operating systems (e.g., Mac, Linux, BSD, iOS, Android). Windows since Windows 2000 has CryptGenRandom:
msdn.microsoft.com/en-us/library/aa379942(v=vs.85).aspx
Are there other operating systems you're worried about?
Not for Mozilla (see below about BeOS). So as not to put up "stop energy", let me say this is great news, and I hope we're done.
In parallel, hearing "all clear" from Ecma and W3C members who care about Symbian, Nokia's Linux-based stuff, anything not modern Windows (older mobile stuff), or the various proprietary real-time/embedded OSes would be helpful.
Ok. We can change the API to be like that, if you like. In WebKit, USE(OS_RANDOMNESS) is defined on virtually every port. There might be some ports that aren't covered (e.g., BeOS) though.
Heh, Mozilla has a BeOS port, along with other OS/CPU targets that are not "tier 1" (where tier 1 has /dev/urandom or CryptGenRandom). I think the Amiga port died years ago ;-). But these can do without. The question is what the mechanics of "doing without" should be (runtime exception, or no method detected).
There seems to be some spin here. What does "today" mean, and why the loaded "multi-month (year?) process" and "all-sing[ing]" etc. imputations to Ecma? I hope you are not describing how quickly you can hack on WebKit code, because while I can hack quickly on Mozilla code, that does not set the pace of a standard, never mind make a new feature available cross-browser to web developers.
Maybe I misunderstood TC39's intentions. My understanding is that your aspirations include a full-featured crypto library, e.g., at the level of complexity of OpenSSL rather than at the complexity of arc4random. Certainly designing and implementing such a feature is a longer-term prospect.
Mark suggested such a program, and I like the idea (as you clearly do, cited below), but TC39 as a committee has not bought into it yet.
Putting aside the idea of a larger crypto library TC39 task group, Mark did make a special case for the RBG in the core language, since Math.random is in the core language, cannot be removed, yet is also a footgun.
This has been TC39's position: whatever we do with a separate task group for a crypto library, we have a Harmony agenda item, represented by that
place-holder proposal.
While it indeed takes years to produce new Ecma (ISO) specs, we on TC39 support early prototyping of harmonious proposals, so web developers can start using such candidate features. But for this to work we need a hand-shake on what is harmonious.
If the idea is to promulgate a de-facto standard via Chrome and let other browsers reverse-engineer it, that can work, but it could backfire.
If that were my intention, we wouldn't be having this discussion.
I think you're probably right that whatwg could get some crypto.getRandomValues spec together faster. For one, you've already done a bunch of work in that context!
But I see the situation as fluid, no matter what standards body claims jurisdiction. If we prototype and test something successfully, then (with name changes if necessary) it could be put into either standard process. Neither process is "fast", since for IPR release the whatwg one still must flow into w3c.
So my point is that nothing in the current standards bodies necessitates that an RBG proto-spec appears "today" in the whatwg context, vs. years from now in Ecma. Maybe we should have two APIs, but as David Bruant just argued, wouldn't it be better to have only one?
The Chrome idea is not only a matter of your intentions. It could happen no matter what you intend, and that could be a good thing, too -- in the best case. I've promulgated de-facto standards, some of which did not suck. I did this during the IE monopoly era, to keep JS moving even though its standard was stagnating.
However, not all of those de-facto standards were as good as I'd like, or as they would have been with some high-powered peer review and collaboration early in their lives. So again, I am less concerned with standards body end-games or even jurisdiction, and more concerned with getting the particulars of the spec "more right", sooner rather than later.
With the (more-) right spec, and with good feedback and ongoing relations among the implementors and standards folks, I think you should implement in Chrome and drive a de-facto standard ahead of the de-jure one. So there! ;-)
Extending the old window.crypto object we added ages ago at Netscape may be a good idea and a faster route to a high-quality cross-browser RBG. I won't argue about that. My objection here is to your choice of words and the way they might foreclose better cooperation among vendors: such an RBG API is not done "today", and it needs more than just one implementation to be a standard other browsers will implement.
I chose my words poorly. What I was trying to communicate is that I'd love to see ECMAScript grow a beautiful crypto library that included lots of useful tools, including HMACs, stream ciphers, PRNGs, etc. TC39 is in a massively better position to do that work than the whatwg. However, at this point in time, adding a simple source of cryptographic random numbers to window.crypto is much simpler proposition (e.g., on the order of one day of work).
Again this switches categories from standardization to implementation. It is a day's work to implement in Gecko as well as WebKit, and I claim at either the core-language or DOM window.crypto layer. That's all "easy" given the C++ or C RBG library functionality.
What will take more than a day (see also Allen's post) is agreeing on API form and function in enough detail to gain traction with all the major browser implementors. But not too many days, I hope.
On Mon, Feb 14, 2011 at 6:43 PM, Shabsi Walfish <shabsi at google.com> wrote:
OpenSSL is not exactly a reliable source of cryptographic best practices. :) In any case, see here linux.die.net/man/4/urandom :
No single implementation is; neither are Linux manpages. The question is whether there are security issues when generating long-term keys from a secure PRNG (RC4, Yarrow, Fortuna) from an entropy pool that's been seeded but exhausted. I suspect that question has been examined at great length by others in the past, so I doubt there's new ground for us to cover on this. It would be interesting if anyone knows of any competent analysis of this question (preferably in a form written for non-cryptographers).
In any case, an API which returns random data with a guarantee of entropy inherently must block, like /dev/random does. That implies an asynchronous API, taking a callback which is called when the requested data is available. Even if that's ultimately wanted, it would be a separate API.
(Of course, if that API is created later, then it should be similar to this one--an asynchronous version of this synchronous API. I can think of some minor speed bumps to making an async version of this API--you don't want to write to the array asynchronously, while other code is running--but nothing unreasonable.)
On Mon, Feb 14, 2011 at 4:15 PM, Glenn Maynard <glenn at zewt.org> wrote:
On Mon, Feb 14, 2011 at 6:43 PM, Shabsi Walfish <shabsi at google.com> wrote:
OpenSSL is not exactly a reliable source of cryptographic best practices. :) In any case, see here linux.die.net/man/4/urandom :
No single implementation is; neither are Linux manpages. The question is whether there are security issues when generating long-term keys from a secure PRNG (RC4, Yarrow, Fortuna) from an entropy pool that's been seeded but exhausted. I suspect that question has been examined at great length by others in the past, so I doubt there's new ground for us to cover on this. It would be interesting if anyone knows of any competent analysis of this question (preferably in a form written for non-cryptographers).
Hmm... if there is a guarantee that /dev/urandom was successfully seeded at some point in the past, then I'm happy with it. Is there such a guarantee? I don't see that documented anywhere, and I'm not sure how it would be provided. Since /dev/urandom never blocks, I'm assuming it will return something based mostly on weak entropy sources (like system time and/or a very small number of real bits of entropy) in the event that the system hasn't really had a chance to seed the pool yet. I can live with not reliably refreshing the pool, but it's pretty scary if you think about what happens when a user boots their phone up for the first time, etc., and there is just no entropy there yet.
Shabsi
On Feb 14, 2011, at 3:03 PM, Allen Wirfs-Brock wrote:
And why overwrite the elements of an existing array? Why not just creating a new Array and use the argument to specify the desired length?
Just to respond to this, I believe the reusable buffer is an optimization (premature? perhaps not for those JS VMs that lack super-fast generational GC) to allow the API user to amortize allocation overhead across many calls to getRandomValues. Of course, with a fast enough GC or few enough calls, this optimization doesn't matter.
The IDL's use of an array inout parameter also supports efficient bindings for languages with stack allocation, which is a non-trivial win in C and C++ not only compared to malloc performance-wise, but also for automated cleanup (vs. intrinsic cost of free, plus on some OSes, "which free do I call"?).
On Feb 14, 2011, at 4:47 PM, Brendan Eich wrote:
The IDL's use of an array inout parameter
Er, in parameter, of course -- but with an in ArrayBufferView parameter (a reference to a mutable object), there's an "out" channel too, via effects on the viewed data in the buffer. Which is why I wrote "inout" :-/.
[whatwg at lists.whatwg.org removed from cc list because I'm not on it and MarkM suggested dropping it.]
On 2/14/11 at 4:20 PM, shabsi at google.com (Shabsi Walfish) wrote:
Hmm... if there is a guarantee that /dev/urandom was successfully seeded at some point in the past, then I'm happy with it. Is there such a guarantee? I don't see that documented anywhere, and I'm not sure how it would be provided. Since /dev/urandom never blocks, I'm assuming it will return something based mostly on weak entropy sources (like system time and/or a very small number of real bits of entropy) in the event that the system hasn't really had a chance to seed the pool yet. I can live with not reliably refreshing the pool, but it's pretty scary if you think about what happens when a user boots their phone up for the first time, etc., and there is just no entropy there yet.
While I think this question is a bit outside the language area, as an OS person, it falls directly in my lap, so please indulge me while I pontificate.
Getting unguessable bits into a deterministic system like a computer is very platform dependent. Disk timing, variable because of air turbulence on the platters, has been a popular source. However now many systems have solid state disks. This dependency means we are at the mercy of the system builders.
I don't know that there are any guarantees. Consider the environments:
- servers
- pcs
- phones/tablets
- (others?)
I think the worst case is servers. They live in a temperature-controlled room, and the only I/O they have is a network connection and a disk farm. With some luck, the server operator will have a source of unguessable bits included on a crypto board or on one of the manufacturer's support chips. Otherwise there are timings on the disk and network I/O. A number of CPU chips have a "cycle counter" which can provide high-resolution timing for interrupts etc.
Pcs (lower case because I mean it generically) have a number of good sources involved with UI events, microphones etc. There is time between boot and first application launch to gather mouse movements, ambient sound etc.
Phones/tablets have a wonderful set of resources: radio noise, compass direction, GPS location tracking, microphone input, camera input, etc. They should be able to get enough unguessable bits within a few seconds, certainly by the time they've established network connectivity.
Cheers - Bill
Bill Frantz | "gets() remains as a monument to C's continuing support of buffer overruns." | Periwinkle, 16345 Englewood Ave, Los Gatos, CA 95032 | (408)356-8506 | www.pwpconsult.com
On Mon, Feb 14, 2011 at 7:20 PM, Shabsi Walfish <shabsi at google.com> wrote:
Hmm... if there is a guarantee that /dev/urandom was successfully seeded at some point in the past, then I'm happy with it. Is there such a guarantee? I don't see that documented anywhere, and I'm not sure how it would be provided. Since /dev/urandom never blocks, I'm assuming it will return something based mostly on weak entropy sources (like system time and/or a very small number of real bits of entropy) in the event that the system hasn't really had a chance to seed the pool yet. I can live with not reliably refreshing the pool, but it's pretty scary if you think about what happens when a user boots their phone up for the first time, etc., and there is just no entropy there yet.
There's no strict guarantee, since there's no API for /dev/urandom to refuse to generate data if there's no entropy whatsoever. I suppose software can read entropy_avail to make sure that there's at least something in there, though I don't know if anything actually does that, and if that's something entropy_avail is actually meant to be used for.
I don't think trying to expand this API for this--for example, a parameter saying "throw an exception if the entropy buffer is empty"--would be helpful. The cases where this happens are extremely narrow, and essentially nonexistent on a running system; it would either never be used or result in extremely poorly-tested code paths (both in UAs and in user code), and would complicate things quite a lot (how long do you wait before trying again?).
(As an aside, I also don't know the state of hardware entropy generation on modern hardware, which can likely generate entropy much more robustly than the old hacks of mixing in interrupt timing and mouse movement, eliminating the old problem of filling the entropy buffer from a first-time cold boot.)
We're now talking about a pure ECMAScript function, so DOM conventions shouldn't be applicable. But consistency with common JavaScript practices should be.
If you want to apply it to an already allocated array, then making it a method on Array.prototype would be a more internally consistent way to do it.
Although I'm also a bit partial to the string-based formulation, and I assume that relatively small string values should be fairly efficient to allocate.
In either case, it does sound like premature optimization relative to everything else that is likely to be going on in the JS code that actually uses these random values.
Brendan Eich wrote:
I'm happy if the answer is "all OSes, mobile and desktop, provide enough high quality randomness". Looking for data if anyone has it.
Here's what I've been able to find.
All desktop OSes that I'm aware of provide a way to get crypto-quality randomness, including Windows, Mac, and Unix.
For mobile OSes, it looks like the following OSes provide a way to get crypto-quality randomness: Android, Nokia's Maemo, MeeGo, iOS on iPhone, Blackberry OS. I can't attest to that personally; it is based on the following: security.stackexchange.com/q/2152/971
On Feb 16, 2011, at 12:45 PM, David Wagner wrote:
Brendan Eich wrote:
I'm happy if the answer is "all OSes, mobile and desktop, provide enough high quality randomness". Looking for data if anyone has it.
Here's what I've been able to find.
All desktop OSes that I'm aware of provide a way to get crypto-quality randomness, including Windows, Mac, and Unix.
For mobile OSes, it looks like the following OSes provide a way to get crypto-quality randomness: Android, Nokia's Maemo, MeeGo, iOS on iPhone, Blackberry OS. I can't attest to that personally; it is based on the following: security.stackexchange.com/q/2152/971
I'll add that the Symbian mobile OS also provides a crypto-quality RNG, as described at library.forum.nokia.com/topic/S60_5th_Edition_Cpp_Developers_Library/GUID-35228542-8C95-4849-A73F-2B4F082F0C44/sdk/doc_source/guide/Security-subsystem-guide/Crypto/RNG/index.html#Crypto.rng.index
Shabsi Walfish writes:
I don't disagree with this -- however, I would say the problem is that urandom may not return 160 "unguessable" bits to start with. Unguessable, of course, actually means "extremely difficult to predict with better than probability 1/2, conditioned on all previously observable information (such as approximate system time, process ids, etc.)". In any case, I was simply suggesting it would be nice to make available information about the quality of the entropy being returned by /dev/urandom, which is either going to be good (because the entropy pool was full) or possibly not good (perhaps because the entropy pool hasn't ever filled up yet).
Well, personally I don't see any need, and I think it's an unnecessary distraction. I don't see anything that should hold up window.crypto.getRandom. I don't think this even rises to the level of nice-to-have, but even if it did, I think the window.crypto.getRandom effort should focus on must-haves, and not get hung up on nice-to-haves.
To explain why I don't think there's any reason to add a way to query the quality of the pseudorandomness returned by /dev/urandom, I have to geek out on cryptographer stuff. To all the non-cryptographers: please ignore the crypto geeking out. The bottom line is, I think /dev/urandom is fine, I think it's perfectly adequate for the job, and I see no basis for delaying or complicating the window.crypto.getRandom API.
<cryptographer geek-out>
In my opinion, there's no need to check on the quality of /dev/urandom; returning unguessable bits is its entire job and purpose for being.
There's a great deal of FUD and misunderstandings out there about /dev/urandom, partly based upon a confusion between information-theoretic security and computational-infeasibility security. It's not helped by the fact that even the /dev/urandom man page lends support to the misconceptions.
Here's the issue. /dev/urandom is computationally secure. No, it's not information-theoretically secure, but who cares? Information-theoretic security is irrelevant in practice; it's basically only a concern for theoretical la-la land. (Do we reject all modern cryptography, and insist that only the one-time pad is good enough for us? Of course not. That'd be ridiculous.) /dev/urandom is a PRNG, not an RNG, and that's plenty good enough for all the use cases we're talking about.
If you hear someone talk about a concern that the "entropy pool might get drained", that's irrelevant to /dev/urandom. That'd only be relevant if we were talking about information-theoretic security -- but we're not. With a computationally-secure PRNG like /dev/urandom, once it's got 160 bits of unguessable entropy in its pool, it's good forever.
What about the concern that the entropy pool might never have gotten enough entropy in the first place? I think that's exceedingly unlikely on any system that will be running a browser. In every OS distribution I've ever seen, there is a mechanism to ensure that entropy carries over across boots: usually a shutdown-time script that dumps /dev/urandom's entropy pool to disk, and a boot-time script that reads this saved pool and adds it to the entropy pool. Thus, as long as the /dev/urandom pool ever has enough entropy, it always will from there on, including after every subsequent boot.
What about on first boot? Personally, I strongly expect that by the time the browser gets going and Javascript gets a chance to query this interface, there will be sufficient entropy in the /dev/urandom pool. If we're talking about a browser, then it's probably running on a machine with peripherals (e.g., keyboard, mouse, touch screen, or something along those lines).
In general, I suspect /dev/urandom is very unlikely to be the weakest point in the system.
</cryptographer geek-out>
So, coming back to the discussion: I don't see any barriers to window.crypto.getRandom. As far as I can tell, modern OSes offer enough support for getting randomness that it should not be difficult for browsers to obtain the necessary randomness needed to implement window.crypto.getRandom. And I think it's perfectly adequate to have a simple interface that exposes only one function: get some crypto-quality random bytes. I don't see a need for a more elaborate API.
[re-sending to es-discuss]
Shabsi Walfish wrote:
This depends on what you consider to be the basic use case. Generating long-lived cryptographic keys absolutely requires high quality entropy... if you are only generating short-lived authenticators (that are not used for encryption) then you could get away with weaker entropy. You will get the most mileage out of this feature if it can be used to generate encryption keys, or long-lived signing keys.
Personally, I think discussion about the "quality" of the PRNG is a distraction. The PRNG should produce crypto-quality random numbers. Period. That's all that need be said. That's good enough. It's good enough for short-lived authenticators, good enough for encryption keys, good enough for any signing key that's going to be used in Javascript. It's just plain good enough.
There's no need for an interface to request or query or specify the quality or entropy of the random numbers. Callers should be able to rely upon it to be crypto-quality. Browsers can deliver on that.
My advice is: Keep the API as simple as it possibly can be. Don't get distracted with twirly knobs and added complications. A simple API will be plenty to get the job done. Stay on target.
Shabsi Walfish writes:
I think the question is whether or not this can be done in general without blocking -- if you block, you can definitely get crypto-quality random numbers, but otherwise it's not clear... Personally, I think it makes sense for the browser to have its own PRNG, and to get the seed from a high-quality (potentially blocking) source asynchronously, immediately after startup (e.g., using /dev/random on Linux). That way, you can "block" the JS-level API call until the built-in PRNG has been properly seeded, since that blocking scenario should basically never happen in practice... and if it does, users would probably just attribute it to the normal churn during browser startup anyway.
I guess I don't see this as worth worrying about.
/dev/urandom is a high-quality source. It does not block.
Here's how to get crypto-quality random numbers without blocking:
$ time dd if=/dev/urandom count=1 bs=20 status=noxfer of=/dev/null
1+0 records in
1+0 records out
[...] 0.00s user 0.00s system 0% cpu 0.001 total
160 bits of randomness obtained, without blocking.
I don't see any reason why browsers would need to use /dev/random. That feels to me like adding an unnecessary complication where none is needed.
I don't see anything here that would justify holding up or complicating window.crypto.getRandom.
Maybe I'm just missing something here, but again, imagine a brand new Android phone that was just booted up for the first time (Android basically runs Linux under the hood). Will the output from /dev/urandom (which is not being specially seeded with accelerometer measurements, etc.) necessarily be unpredictable, given the model of phone and approximate time? I'm not convinced. So just relying on /dev/urandom seems dangerous to me for that reason.
The danger doesn't feel very dangerous to me. I think it would be reasonable for browsers to not worry about this and let the Android OS worry about that.
In more detail, I'd respond in five ways:
- This feels to me like an awfully obscure corner case to design an entire API around. Put another way, the risk we're talking about seems like a minor one to me.
- By the time the user launches the browser and starts interacting with web sites, I would hope and expect that /dev/urandom would have sufficient entropy, though I've never measured it and could be wrong.
- If for some reason it doesn't, then I'd say that's an Android bug, not a browser bug. It's the OS's responsibility to ensure that /dev/urandom's output is unpredictable.
- If you really care about this and want to classify it as a browser bug, there are ways that browsers can deal with it (e.g., detect the first boot condition and take special steps), but I think they'd be adding unnecessary gyrations. It'd be up to each individual browser whether they want to worry about this, but if they do, there are steps they can take.
- The SSL protocol already requires the browser to generate crypto-quality random numbers any time the browser accesses a site over https. Whatever the browser is already doing to generate random numbers for SSL, doing the same for window.crypto.getRandom should be perfectly adequate. Browsers already have to solve this problem; adding window.crypto.getRandom is not adding a new burden on them.
Overall, I think /dev/urandom is good enough for engineering purposes, and I don't see that this warrants changing the API. Just my opinion.
Thanks, this matches what I heard from mobile folks we are working with.
Shabsi Walfish wrote (quoting from the urandom(4) man page):
A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current non-classified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.
This is total FUD. I've long complained about the fact that this is in the urandom(4) man page, as it leads to widespread misconceptions, but it's never been fixed. I don't want to waste the time of people on this mailing list deconstructing this statement in detail, so I'll just say:
Please ignore this part of the /dev/urandom man page. It's bogus and not a good source for how to think about crypto-quality randomness.
(To share an analogy, the quote above is analogous to saying "SSL is theoretically vulnerable to a cryptographic attack on the algorithms it uses. Knowledge of how to do this is not available in the non-classified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, turn off your computer instead.")
On Wed, Feb 16, 2011 at 11:13 AM, David Wagner <daw at cs.berkeley.edu> wrote:
Shabsi Walfish wrote (quoting from the urandom(4) man page):
A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current non-classified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.
This is total FUD. I've long complained about the fact that this is in the urandom(4) man page, as it leads to widespread misconceptions, but it's never been fixed. I don't want to waste the time of people on this mailing list deconstructing this statement in detail,
Hi David, please feel free to, or to point at pages where we can read more about this specific issue. This issue seems to be the only significant remaining controversy here, so more words settling it more decisively would be welcome. Thanks.
On Wed, Feb 16, 2011 at 11:31 AM, Mark S. Miller <erights at google.com> wrote:
On Wed, Feb 16, 2011 at 11:13 AM, David Wagner <daw at cs.berkeley.edu>wrote:
Shabsi Walfish wrote (quoting from the urandom(4) man page):
A read from the /dev/urandom device will not block waiting for more entropy. As a result, if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack on the algorithms used by the driver. Knowledge of how to do this is not available in the current non-classified literature, but it is theoretically possible that such an attack may exist. If this is a concern in your application, use /dev/random instead.
This is total FUD. I've long complained about the fact that this is in the urandom(4) man page, as it leads to widespread misconceptions, but it's never been fixed. I don't want to waste the time of people on this mailing list deconstructing this statement in detail,
Hi David, please feel free to, or to point at pages where we can read more about this specific issue. This issue seems to be the only significant remaining controversy here, so more words settling it more decisively would be welcome. Thanks.
Sorry. I was reading email non-chronologically this morning. I see that you have posted much more. If you feel this point is now adequately covered, please ignore. Thanks.
I agree with this sentiment: the specification should simply state "the returned values are guaranteed to be cryptographically secure"; that's all that needs to be said. There is no need to describe how this is implemented; if an implementation provides predictable values then that implementation is broken. If your PRNG has low entropy it is not cryptographically secure, and therefore broken. How you get to cryptographically secure isn't really relevant as long as you can get there*.
This is of course entirely ignoring timing attacks, and other such side channels.
--Oliver
* If a platform can't provide cryptographically secure randomness, I'd be concerned about that implementation being web facing, as I would be dubious about the theoretical security of its SSL implementation, etc.
On Wed, Feb 16, 2011 at 11:36 AM, Oliver Hunt <oliver at apple.com> wrote:
I agree with this sentiment: the specification should simply state "the returned values are guaranteed to be cryptographically secure"; that's all that needs to be said. There is no need to describe how this is implemented; if an implementation provides predictable values then that implementation is broken. If your PRNG has low entropy it is not cryptographically secure, and therefore broken. How you get to cryptographically secure isn't really relevant as long as you can get there*.
This is of course entirely ignoring timing attacks, and other such side channels.
--Oliver
* If a platform can't provide cryptographically secure randomness, I'd be concerned about that implementation being web facing, as I would be dubious about the theoretical security of its SSL implementation, etc.
Hi Oliver, not disagreeing with anything you said. But in light of this footnote, I want to remind everyone that JS is not necessarily web facing. Normative EcmaScript standards impose an obligation on any implementation wishing to claim standards conformance. That said, I support adding this normative obligation on all conformant EcmaScript implementations.
An almost-conformant implementation that does not support one single function in the standard API can generally claim "conformant except for X", so I'm not too worried about such non-web cases.
On Feb 16, 2011, at 11:31 AM, Mark S. Miller wrote:
This issue seems to be the only significant remaining controversy here, so more words settling it more decisively would be welcome. Thanks.
Not to drag this out, but the "randomness quality" issue (or non-issue, now) is not the only remaining one. Allen raised the question of whether typed arrays should be used at all when Array might do. This was in reply to your suggestion that the "sooner not later" simple and usable API depend on the forthcoming binary_data strawman.
I'm not sure what the browser vendors who don't yet support typed arrays think. Purely out of expediency and "I'm all right, Jack" Firefox boosting, typed arrays are fine and have some advantages over plain old Array, as Adam has laid out.
But ideally, we should hash this out with Microsoft people weighing in here on es-discuss (I'm told they're not allowed to participate on whatwg lists).
I miss the Opera presence
On Wed, Feb 16, 2011 at 11:05 AM, David Wagner <daw at cs.berkeley.edu> wrote:
Shabsi Walfish writes:
I think the question is whether or not this can be done in general without blocking -- if you block, you can definitely get crypto-quality random numbers, but otherwise it's not clear... Personally, I think it makes sense for the browser to have its own PRNG, and to get the seed from a high-quality (potentially blocking) source asynchronously, immediately after startup (e.g., using /dev/random for Linux). That way, you can "block" on the JS-level API call until the built-in PRNG has been properly seeded, since that blocking scenario should basically never happen in practice... and if it does, users would probably just attribute it to the normal churn during browser startup anyway.
I guess I don't see this as worth worrying about.
/dev/urandom is a high-quality source. It does not block.
Here's how to get crypto-quality random numbers without blocking:
$ time dd if=/dev/urandom count=1 bs=20 status=noxfer of=/dev/null
1+0 records in
1+0 records out
[...] 0.00s user 0.00s system 0% cpu 0.001 total
160 bits of randomness obtained, without blocking.
I don't see any reason why browsers would need to use /dev/random. That feels to me like adding an unnecessary complication where none is needed.
I don't see anything here that would justify holding up or complicating window.crypto.getRandom.
Maybe I'm just missing something here, but again, imagine a brand new Android phone that was just booted up for the first time (Android basically runs Linux under the hood). Will the output from /dev/urandom (which is not being specially seeded with accelerometer measurements, etc.) necessarily be unpredictable, given the model of phone and approximate time? I'm not convinced. So just relying on /dev/urandom seems dangerous to me for that reason.
The danger doesn't feel very dangerous to me. I think it would be reasonable for browsers not to worry about this and to let the Android OS worry about it.
Thanks for the explanation, David. I'm okay with pushing this off on some combination of the OS + JS interpreter, but I just want it to be clear what the expectation should be for clients of the API. I agree with the way Oliver phrased it below -- the spec should simply demand that the returned values are cryptographically secure (cryptographically secure PRNG output is acceptable for that; there is certainly no need for information-theoretic randomness). If the implementations don't provide that guarantee, it's a bug in the OS or JS interpreter or both. If /dev/urandom is sufficient for that guarantee in practice, I'm perfectly happy with it.
Mark's point about JS not running exclusively in the browser simply suggests that some JS interpreters on some platforms might need to do something like query entropy_avail at startup and wait for a while if need be, in order to provide this guarantee. I agree we don't need to expose entropy_avail in the API, given the behavior of /dev/urandom as you described here.
Shabsi
On Mon, Feb 14, 2011 at 8:03 PM, Allen Wirfs-Brock <allen at wirfs-brock.com>wrote:
but we're now talking about a pure ECMAScript function, so DOM conventions shouldn't apply. Consistency with common JavaScript practice, however, should.
If you want to apply it to an already allocated array, then making it a method on Array.prototype would be a more internally consistent way to do it.
Although I'm also a bit partial to the string-based formulation, and I assume that relatively small string values should be fairly efficient to allocate.
In either case, it does sound like premature optimization relative to everything else that is likely to be going on in the JS code that actually uses these random values.
This isn't an optimization; it's simply a different design. Optimization is one underlying rationale, but not the only one.
I don't think optimization is an important consideration here for crypto, since you usually don't need a lot of random data to generate a key--often on the order of a few dozen or hundred bytes. You're not reading a megabyte of data to generate a key. There can be uses for generating large blocks of randomness for non-crypto, though, of course.
However, it also allows specifying the type of the data to return: whether the random data should be interpreted as 8-bit arrays, 16-bit arrays, and so on. I don't know if that's actually important (I haven't seen specific use cases), but that was another rationale. I think it was also intended to avoid encoding 8-bit data in a UTF-16 string.
Hey, along these lines: does anyone track strawman implementations in various vendor browser nightlies? Something like kangax.github.com/es5-compat-table but for the various Harmony proposals.
kam
So, let's get back to the design of an actual ECMAScript API.
I'll repeat a couple of initial points: We are now talking about a pure JavaScript API, not a DOM API, so it should follow JavaScript conventions. If we want this to appear in browsers in the very near future, we should minimize any dependencies upon other Harmony features that are not both totally finalized and trivial to implement (the empty set??)
The first point would seem to eliminate DOM-like conventions such as pseudo by-ref/in-out parameters, multiple numeric data types, or dependency upon any host object (DOM) array-like objects. The second would seem to preclude the use of the proposed Harmony binary typed arrays.
In general, I don't see any particular reason why an ECMAScript array would be unsuitable for collecting a sequence of random numeric values. It's what people do today when they need to manipulate sequences of numbers.
If we are using arrays, the simplest API is probably a new function on the Array constructor that is a factory for Arrays filled with random numeric values:
Array.randomFill = function (length) {...};
This would create a new array of the specified length where each element is a random value in some range. I propose that this range be 0..65535, as these are easily manipulable (and optimizable) values that are likely to have some degree of optimization in all implementations. I don't really see why other element ranges need to be supported, as they are easily constructed from these values. However, if strong use cases (endianness??) can be made for the utility of natively supporting multiple element ranges, then we could either add more functions:
Array.randomFill8(length)
Array.randomFill24(length)
Array.randomFill32(length)
or accept an optional second parameter that specifies the desired bit width:
Array.randomFill(length, bitWidth /* 1 .. (32?, 48?, 53?) */)
I would recommend keeping it simple and avoiding such non-essential tweaks.
There were some (not very serious, it seemed) concerns about the cost of array allocation for such calls. I would point out that creating a new array is already the technique used in ECMAScript for most somewhat comparable situations, such as the results of RegExp matches, the set of items deleted by splice, etc. Others might invoke words about the desirability of a functional style...
If the array allocation was really a concern, then the randomFill function should be a method on Array.prototype:
Array.prototype.randomFill = function () {...}; /* returns the this value */
It might be used like:
var rand = new Array(8).randomFill(); // create an array of 8 random 16-bit numbers
The length parameter is unnecessary because the length is implicit in the array (and if this is about perf. you probably don't want to be dynamically changing the length of already allocated arrays). The same discussion of element size alternatives also applies.
Of course we could do both. But really, if we want this to show up very soon in browsers, keeping it to a single function or method with minimal options is likely to yield the best results.
Finally, as a dark horse we could consider using Strings rather than Arrays, as JavaScript strings are sequences of 16-bit values and the individual element values are directly accessible.
String.randomFill(length) // returns a string value containing length random 16-bit character values
Individual values would be accessed as:
var rs = String.randomFill(2);
var random32 = (rs.charCodeAt(0) << 16) | rs.charCodeAt(1);
But really, Arrays seem like they should work just fine.
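(To make the two shapes above concrete, a minimal non-normative sketch; cryptoRandomUint16() is a hypothetical crypto-quality primitive used only for illustration, not part of any proposal.)

Array.randomFill = function (length) {
    var a = new Array(length);
    for (var i = 0; i < length; i++)
        a[i] = cryptoRandomUint16(); // each element in 0..65535
    return a;
};

Array.prototype.randomFill = function () {
    for (var i = 0; i < this.length; i++)
        this[i] = cryptoRandomUint16();
    return this; // returns the this value, per the prototype variant
};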
Too much and too complex, say I! ;)
I liked your previous thinking that we should just return a fresh array. And in the interest of keeping it dead simple, let's just fix it at 32 bits -- always.
Then the question is: string, array, typed array, or binary data?
I say: let's make it typed array in the short term, but TC39 will spec it as an array of uint32 according to the binary data spec. We will try to make the binary data spec as backwards compatible as possible with typed arrays anyway. So in the near term, implementors can use typed arrays, but when they have implementations of the full binary data spec, they can change to use those. It'll probably be a slightly backwards-incompatible change, but by keeping them relatively API-compatible, it shouldn't make too much difference in practice. Plus we can warn people that that change is coming.
So: Math.getRandomValues(length), or whatever you wanna call it, would just return a fresh array.
I might also be ok with extending it so that you can pass it an optional pre-existing array for it to mutate, for the sake of efficiency: Math.getRandomValues(length[, uint32array[, start]]).
(The window.crypto object could just be a reference to the same Math.getRandomValues function value.)
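(A minimal sketch of the shape proposed above, assuming typed arrays; getRandomUint32() stands in for a hypothetical crypto-quality primitive, not a real API.)

Math.getRandomValues = function (length, uint32array, start) {
    var out = uint32array || new Uint32Array(length); // fresh array unless one is supplied
    var base = start || 0;
    for (var i = 0; i < length; i++)
        out[base + i] = getRandomUint32();
    return out;
};

// e.g. var a = Math.getRandomValues(4); // fresh Uint32Array of 4 random values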
On Wed, Feb 16, 2011 at 4:29 PM, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
So, let's get back to the design of an actual ECMAScript API.
I'll repeat a couple of initial points: We are now talking about a pure JavaScript API,.... ....
Good stuff.
Recall that part of what started this is that folks asked for improvements to Math.random, but Math.random had become a key function in many benchmarks. Improving Math.random would have the side effect of slowing down benchmark results, so some felt touching it was problematic.
Then there were some discussions regarding security attacks, where guessing or inserting well-guessed or spoofed results could compromise business or privacy. One post had a good yet short list of links that made it clear that Math.random, as it is, is fragile.
Rather than define a full new API, an alternate tactic might be to code up some "quality" benchmark metrics, so that Math.random would be improved because the benchmarks identified the quality of the random numbers and gave better random tools more weight than just wall-clock speed.
The other issue that surfaced was a lack of rand-seed management policy. If corrected, that would effectively make it nearly impossible for one script to guess anything that might go on in another connection.
It is important for the old Math.random and any new API that rand-seed management policy be well addressed.
I only intend to make the point that a new API will not repair Math.random as used now by numerous sites. Adding a new API will give sites new ways to get it wrong.
On Feb 16, 2011, at 4:54 PM, David Herman wrote:
I say: let's make it typed array in the short term, but TC39 will spec it as an array of uint32 according to the binary data spec. We will try to make the binary data spec as backwards compatible as possible with typed arrays anyway. So in the near term, implementors can use typed arrays, but when they have implementations of the full binary data spec, they can change to use those. It'll probably be a slightly backwards-incompatible change, but by keeping them relatively API-compatible, it shouldn't make too much difference in practice. Plus we can warn people that that change is coming.
Dave, most browsers other than FF4 internally box all integers with 32 significant bits. Some may box with 31 or even 30 significant bits. So if we spec the values as uint32 (and they are truly random enough), then at least half and possibly 75% or more of the values in the array will be boxed in many browsers. Such boxing will have a much higher cost than immediate uint16 values. That's why I propose 16-bit values.
On Feb 16, 2011, at 5:33 PM, Allen Wirfs-Brock wrote:
On Feb 16, 2011, at 4:54 PM, David Herman wrote:
I say: let's make it typed array in the short term, but TC39 will spec it as an array of uint32 according to the binary data spec. We will try to make the binary data spec as backwards compatible as possible with typed arrays anyway. So in the near term, implementors can use typed arrays, but when they have implementations of the full binary data spec, they can change to use those. It'll probably be a slightly backwards-incompatible change, but by keeping them relatively API-compatible, it shouldn't make too much difference in practice. Plus we can warn people that that change is coming.
Dave, most browsers other than FF4 internally box all integers with 32 significant bits.
I'm not sure this is still true. Certainly on x64, but also on x86, NaN-boxing has taken over in many VMs.
Some may box with 31 or even 30 significant bits. So if we spec the values as uint32 (and they are truly random enough), then at least half and possibly 75% or more of the values in the array will be boxed in many browsers. Such boxing will have a much higher cost than immediate uint16 values. That's why I propose 16-bit values.
Given the implementors on es-discuss, we should survey first. I'd hate to prematurely (de-)optimize.
I agree with David Wagner that the API has to be dead-simple, and it seems to me having only 16-bit values returned in a JS array may tend to result in more bit-mixing bugs than if we used 32-bit elements, if programmers are carelessly porting code that was written for uint32 arrays.
On Thu, Feb 17, 2011 at 08:29, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
Array.randomFill = function (length) {...};
This would create a new array of the specified length where each element is a random value in some range. I propose that this range be 0..65535, as these are easily manipulable (and optimizable) values that are likely to have some degree of optimization in all implementations. I don't really see why other element ranges need to be supported, as they are easily constructed from these values. However, if strong use cases (endianness??) can be made for the utility of natively supporting multiple element ranges, then we could either add more functions:
Array.randomFill8(length)
Array.randomFill24(length)
Array.randomFill32(length)
or accept an optional second parameter that specifies the desired bit width:
Array.randomFill(length, bitWidth /* 1 .. (32?, 48?, 53?) */)
I would recommend keeping it simple and avoiding such non-essential tweaks.
There were some (not very serious, it seemed) concerns about the cost of array allocation for such calls. I would point out that creating a new array is already the technique used in ECMAScript for most somewhat comparable situations, such as the results of RegExp matches, the set of items deleted by splice, etc. Others might invoke words about the desirability of a functional style...
As Glenn pointed out, efficiency is just one part of the concern; flexibility is the main one. The fact that you struggle to propose an API that can satisfy everyone with four different randomFill functions rings of 'code smell' to me.
You are making policy decisions about what kinds of values (e.g. only 0..65535) you can get from it, which might severely hamper use cases where efficiency is important (e.g., just off the top of my head, streaming random data to WebGL), while it is always possible to create an array before the call if you want to (or write a three-line wrapper to do that).
In fact, most other languages/frameworks work this way, and I'm not sure it makes sense in the first place to have this kind of specific, seldom-used function in the language and/or on such a basic and commonly used type as Array, compared to having it on a type providing crypto features (e.g. <global>.crypto).
I don't think generating 16-bit values is beneficial -- either 8-bit values for byte-at-a-time processing or full [u]int32 makes more sense. I think the only reason for 16-bit to come up is ECMAScript's notionally 16-bit characters.
I would prefer polymorphism of some form, for example
any gimmeRandom(numberOfRandomElements, constructor = [Builtin Array constructor], min = 0, max = 1<<32 - 1) {
    var result = new constructor(numberOfRandomElements);
    for (var i = 0; i < numberOfRandomElements; i++)
        result[i] = reallyRandomValue(min, max);
    return result;
}
This would solve all the use cases presented so far (except a string of cryptographically secure values, which can be handled trivially, so I've left that as a task for the reader ;) ), and would avoid the shortcomings of the existing array methods (only a finite set of types can be produced as output).
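(For illustration only, a couple of hypothetical calls to the sketch above; Uint8Array assumes the typed-array constructors discussed earlier in the thread.)

var bytes = gimmeRandom(16, Uint8Array, 0, 255); // 16 random byte values in a typed array
var words = gimmeRandom(4); // 4 random values in a plain Array, using the defaults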
Having got up to speed on this discussion after a holiday, here are my thoughts from a V8er:
- I would prefer to see this in the DOM rather than in the language. I can't see why it needs to be in the language itself. Other languages generally don't have cryptographically secure random numbers in the language itself; they put them in the libraries. Other users of the DOM (www.w3.org/DOM/Bindings) might want to use the feature too. On the other hand, this is no big deal. If the committee feels that putting it in the language makes it somehow more righteous, then put it there.
- But let's not invoke node.js as a reason to put it in the language. Generally their policy is not to put things in the core that could be libraries. Seems to me that putting 'var crypto = require("crypto");' works just fine for node and can give you a completely compatible interface (see the sketch after this list).
- Let's keep the interface simple. All considerations of GC or efficiency look completely specious to me given the amount of data involved (measured in bits). Returning strings seems strange (crypto data is numbers; what about surrogate pairs -- do we really want something in the standard that returns incorrect UTF-16?). I would be in favour of any interface that returns regular arrays of numbers, or even simpler, just a single number, either 0-1 like Math.random() or 8- or 32-bit values. But only one function, please.
- I realize that Math.random() is not an ideal interface to emulate. The reason I suggest it is that putting a new API into the core language that is completely unlike Math.random() feels very PHP-like: www.tnx.nl/php.html Even though Math.random() and the new interface are not interchangeable, it seems to me that there is value in consistency in terms of the mental effort needed to memorize the APIs.
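(As a concrete sketch of the library approach in the node.js bullet above: present-day Node.js -- well after this thread -- exposes exactly such a module interface.)

var crypto = require('crypto');
var buf = crypto.randomBytes(16); // Buffer of 16 crypto-quality random bytes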
2011/2/18 Oliver Hunt <oliver at apple.com>:
On Feb 22, 2011, at 5:31 AM, Erik Corry wrote:
Having got up to speed on this discussion after a holiday, here are my thoughts from a V8er:
I would prefer to see this in the DOM rather than in the language. I can't see why it needs to be in the language itself. Other languages generally don't have cryptographically secure random numbers in the language itself; they put them in the libraries. Other users of the DOM (www.w3.org/DOM/Bindings) might want to use the feature too. On the other hand, this is no big deal. If the committee feels that putting it in the language makes it somehow more righteous, then put it there.
But let's not invoke node.js as a reason to put it in the language.
+1
Generally their policy is not to put things in the core that could be libraries. Seems to me that putting 'var crypto = require("crypto");' works just fine for node and can give you a completely compatible interface.
Let's keep the interface simple. All considerations of GC or efficiency look completely specious to me given the amount of data involved (measured in bits). Returning strings seems strange (crypto data is numbers; what about surrogate pairs -- do we really want something in the standard that returns incorrect UTF-16?). I would be in favour of any interface that returns regular arrays of numbers, or even simpler, just a single number, either 0-1 like Math.random() or 8- or 32-bit values. But only one function, please.
I realize that Math.random() is not an ideal interface to emulate. The reason I suggest it is that putting a new API into the core language that is completely unlike Math.random() feels very PHP-like: www.tnx.nl/php.html Even though Math.random() and the new interface are not interchangeable, it seems to me that there is value in consistency in terms of the mental effort needed to memorize the APIs.
2011/2/18 Oliver Hunt <oliver at apple.com>:
I don't think generating 16-bit values is beneficial -- either 8-bit values for byte-at-a-time processing or full [u]int32 makes more sense. I think the only reason for 16-bit to come up is ECMAScript's notionally 16-bit characters.
I would prefer polymorphism of some form, for example
any gimmeRandom(numberOfRandomElements, constructor = [Builtin Array constructor], min = 0, max = 1<<32 - 1) {
    var result = new constructor(numberOfRandomElements);
    for (var i = 0; i < numberOfRandomElements; i++)
        result[i] = reallyRandomValue(min, max);
    return result;
}
This would solve all the use cases presented so far (except a string of cryptographically secure values, which can be handled trivially, so I've left that as a task for the reader ;) ), and would avoid the shortcomings of the existing array methods (only a finite set of types can be produced as output).
--Oliver
On Feb 16, 2011, at 5:37 PM, Brendan Eich wrote:
On Feb 16, 2011, at 5:33 PM, Allen Wirfs-Brock wrote:
On Feb 16, 2011, at 4:54 PM, David Herman wrote:
I say: let's make it typed array in the short term, but TC39 will spec it as an array of uint32 according to the binary data spec. We will try to make the binary data spec as backwards compatible as possible with typed arrays anyway. So in the near term, implementors can use typed arrays, but when they have implementations of the full binary data spec, they can change to use those. It'll probably be a slightly backwards-incompatible change, but by keeping them relatively API-compatible, it shouldn't make too much difference in practice. Plus we can warn people that that change is coming.
Dave, most browsers other than FF4 internally box all integers with 32 significant bits.
I'm not sure this is still true. Certainly on x64, but also on x86, NaN-boxing has taken over in many VMs.
Some may box with 31 or even 30 significant bits. So if we spec the values as uint32 (and they are truly random enough), then at least half and possibly 75% or more of the values in the array will be boxed in many browsers. Such boxing will have a much higher cost than immediate uint16 values. That's why I propose 16-bit values.
Given the implementors on es-discuss, we should survey first. I'd hate to prematurely (de-)optimize.
I agree with David Wagner that the API has to be dead-simple, and it seems to me having only 16-bit values returned in a JS array may tend to result in more bit-mixing bugs than if we used 32-bit elements, if programmers are carelessly porting code that was written for uint32 arrays.
/be
-- Alex Russell slightlyoff at google.com slightlyoff at chromium.org alex at dojotoolkit.org BE03 E88D EABB 2116 CC49 8259 CF78 E242 59C3 9723
I'm from the "better to ask forgiveness than permission" school, so I'm rooting for Adam's work to catch on among browsers and ultimately reach w3c standardization.
Even the typed array usage is "ok" if we harmonize the tiny subset of typed arrays used there (no aliasing!) with strawman:binary_data -- which is dherman's plan.
If the crypto.getRandomValues API becomes popular and other JS embeddings than ones with the DOM joined at the hip want it, we can "move" (more likely, copy) it into a core language module spec. TC39 will be standardizing more "library code" over time.
I agree that Math.random is a coveted name, and we shouldn't lightly add something different in form but better in fit.
However, Math.random is a source of bugs as Amit Klein has shown, and these can't all be fixed by using a better non-CS PRNG underneath Math.random and still decimating to an IEEE double in [0, 1]. The use-cases Klein explored need both a CS-PRNG and more bits, IIRC. Security experts should correct amateur-me if I'm mistaken.
On 22/02/2011, at 22:36, Brendan Eich wrote:
(...)
However, Math.random is a source of bugs as Amit Klein has shown, and these can't all be fixed by using a better non-CS PRNG underneath Math.random and still decimating to an IEEE double in [0, 1]. The use-cases Klein explored need both a CS-PRNG and more bits, IIRC. Security experts should correct amateur-me if I'm mistaken.
.replace( /1]/gm, '1)' ) ?
On Feb 22, 2011, at 2:00 PM, Jorge wrote:
On 22/02/2011, at 22:36, Brendan Eich wrote:
(...)
However, Math.random is a source of bugs as Amit Klein has shown, and these can't all be fixed by using a better non-CS PRNG underneath Math.random and still decimating to an IEEE double in [0, 1]. The use-cases Klein explored need both a CS-PRNG and more bits, IIRC. Security experts should correct amateur-me if I'm mistaken.
.replace( /1]/gm, '1)' ) ?
Right.
Reading more of Amit Klein's papers, the rounding to IEEE double also seems problematic. Again, I'm not the crypto-droid you are looking for.
I can find Klein's complaints that the implementation of Math.random is insecure but not his complaints about the API. Do you have a link?
It seems pretty simple to generate a random number from 1 to 2 by fixing the exponent and mixing in 52 bits of random mantissa. Subtract 1 to get an evenly distributed value from 0-1. Multiply and Math.floor or >>> to get your 8, 16, or 32 bits of randomness.
On Feb 22, 2011 11:04 PM, "Brendan Eich" <brendan at mozilla.org> wrote:
On Feb 22, 2011, at 2:00 PM, Jorge wrote:
On 22/02/2011, at 22:36, Brendan Eich wrote:
(...)
However, Math.random is a source of bugs as Amit Klein has shown, and these can't all be fixed by using a better non-CS PRNG underneath Math.random and still decimating to an IEEE double in [0, 1]. The use-cases Klein explored need both a CS-PRNG and more bits, IIRC. Security experts should correct amateur-me if I'm mistaken.
.replace( /1]/gm, '1)' ) ?
Right.
Reading more of Amit Klein's papers, the rounding to IEEE double also seems problematic. Again, I'm not the crypto-droid you are looking for.
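(A sketch of the fixed-exponent trick Erik describes above, using a DataView over an 8-byte buffer; cryptoRandomBytes() is a hypothetical source of crypto-quality bytes, used only for illustration.)

function randomUnitDouble() {
    var view = new DataView(new ArrayBuffer(8));
    var bytes = cryptoRandomBytes(8); // hypothetical: 8 crypto-quality random bytes
    for (var i = 0; i < 8; i++)
        view.setUint8(i, bytes[i]);
    // Force sign 0 and exponent 0x3FF (the big-endian first 16 bits) so the
    // double lies in [1, 2), leaving the 52 random mantissa bits intact.
    view.setUint16(0, (view.getUint16(0) & 0x000F) | 0x3FF0);
    return view.getFloat64(0) - 1; // evenly distributed in [0, 1)
}

// e.g. 32 bits of randomness: Math.floor(randomUnitDouble() * 0x100000000)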
On 2/22/11 at 1:36 PM, brendan at mozilla.com (Brendan Eich) wrote:
However, Math.random is a source of bugs as Amit Klein has shown, and these can't all be fixed by using a better non-CS PRNG underneath Math.random and still decimating to an IEEE double in [0, 1]. The use-cases Klein explored need both a CS-PRNG and more bits, IIRC. Security experts should correct amateur-me if I'm mistaken.
I'll see if the security expert hat fits. :-)
The random() function in many languages has a useful property which is incompatible with security. By setting its seed, you can get deterministic execution of a Monte Carlo algorithm. IANAJSE, but I didn't see a way to set the seed of Math.random(), so the ECMAScript/Javascript version lacks this useful property. But, having both a repeatable random function and a secure random function in a language is certainly reasonable.
Cheers - Bill
Bill Frantz |The nice thing about standards| Periwinkle (408)356-8506 |is there are so many to choose| 16345 Englewood Ave www.pwpconsult.com |from. - Andrew Tannenbaum | Los Gatos, CA 95032
On Feb 22, 2011, at 2:49 PM, Erik Corry wrote:
I can find Klein's complaints that the implementation of Math.random is insecure but not his complaints about the API. Do you have a link?
In the paper linked from seclists.org/bugtraq/2010/Dec/13 section 3 ("3. The non-uniformity bug"), viz:
"Due to issues with rounding when converting the 54 bit quantity to a double precision number (as explained in www.trusteer.com/sites/default/files/Temporary_User_Tracking_in_Major_Browsers.pdf section 2.1, x2 may not accurately represent the state bits if the whole double precision number is ≥0.5."
but that link dangles, and I haven't had time to read more.
The general concern about the API arises because Adam's API returns a typed array result that could have length > 1, i.e., not a random result that fits in at most 32 (or even 53) bits.
I didn't see a way to set the seed of Math.random(), so the ECMAScript/Javascript version lacks this useful property.
Butting in, for a moment... I thought JavaScript seeded the PRNG with the timestamp by default? Perhaps I'm totally off base, but that's long been my assumption, and the reason why I never questioned not having a seed() function to set it, since the timestamp was fine for most non-crypto needs. I also recall that being one of the main complaints about Math.random for crypto needs.
But, having both a repeatable random function and a secure random function in a language is certainly reasonable.
If it's a choice between repeatable-random and actual random... I vote actual random with all my fingers and toes. Test harnesses can override Math.random() with fake repeatable sequences for that purpose. In the real world (non-testing), repeatable-random is far less desirable, at least I would think.
(butting back out)
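(A toy sketch of the test-harness override suggested above; the tiny LCG constants are classic but emphatically not crypto-quality, so this is for repeatable tests only.)

function seedMathRandom(seed) {
    var state = seed % 233280;
    Math.random = function () {
        state = (state * 9301 + 49297) % 233280; // small LCG: repeatable, insecure
        return state / 233280;
    };
}

// seedMathRandom(42); // Math.random() now replays the same sequence every run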
On Feb 22, 2011, at 3:26 PM, Bill Frantz wrote:
On 2/22/11 at 1:36 PM, brendan at mozilla.com (Brendan Eich) wrote:
However, Math.random is a source of bugs as Amit Klein has shown, and these can't all be fixed by using a better non-CS PRNG underneath Math.random and still decimating to an IEEE double in [0, 1]. The use-cases Klein explored need both a CS-PRNG and more bits, IIRC. Security experts should correct amateur-me if I'm mistaken.
I'll see if the security expert hat fits. :-)
Thanks.
The random() function in many languages has a useful property which is incompatible with security. By setting its seed, you can get deterministic execution of a Monte Carlo algorithm. IANAJSE, but I didn't see a way to set the seed of Math.random(), so the ECMAScript/Javascript version lacks this useful property. But, having both a repeatable random function and a secure random function in a language is certainly reasonable.
Browsers have, based on Amit's work, added some automatic reseeding and (before that) switched from a singleton hidden state to state-per-window/iframe.
The "API issue" as Erik put it is this: do we need an array of bytes/shorts/ints, potentially a lot of random values; or would the fractional bits of a single IEEE 64-bit double precision result be "good enough".
Thanks for the link. Having read the section in question I am satisfied that the author has no problem with the API.
On Feb 23, 2011 12:34 AM, "Brendan Eich" <brendan at mozilla.org> wrote:
On Feb 22, 2011, at 2:49 PM, Erik Corry wrote:
I can find Klein's complaints that the implementation of Math.random is insecure but not his complaints about the API. Do you have a link?
In the paper linked from seclists.org/bugtraq/2010/Dec/13 section 3 ("3. The non-uniformity bug"), viz:
"Due to issues with rounding when converting the 54 bit quantity to a
double precision number (as explained in www.trusteer.com/sites/default/files/Temporary_User_Tracking_in_Major_Browsers.pdfsection 2.1, x2 may not accurately represent the state bits if the whole double precision number is ≥0.5."
but that link dangles, and I haven't had time to read more.
The general concern about the API arises because Adam's API returns a typed array result that could have length > 1, i.e., not a random result that fits in at most 32 (or even 53) bits.
On Tue, Feb 22, 2011 at 6:26 PM, Bill Frantz <frantz at pwpconsult.com> wrote:
On 2/22/11 at 1:36 PM, brendan at mozilla.com (Brendan Eich) wrote:
However, Math.random is a source of bugs as Amit Klein has shown, and these can't all be fixed by using a better non-CS PRNG underneath Math.random and still decimating to an IEEE double in [0, 1]. The use-cases Klein explored need both a CS-PRNG and more bits, IIRC. Security experts should correct amateur-me if I'm mistaken.
I'll see if the security expert hat fits. :-)
The random() function in many languages has a useful property which is incompatible with security. By setting its seed, you can get deterministic execution of a Monte Carlo algorithm.
I don't agree that being able to query and set the state of a PRNG is incompatible with security. Most comprehensive, secure PRNG APIs support doing that, e.g. LTC's "XXX_import" and "XXX_export" functions. In fact, you can even do that with /dev/urandom in Linux (only as root, since it's shared across users). Neither Math.random nor any of the APIs discussed so far attempts to be a comprehensive API, of course.
Using Math.random's API for secure random numbers wouldn't make sense, though. Secure random number sources from OSes give bytes, and most (all?) real-world crypto algorithms want bytes (or groups of bytes), so this would just be converting from bytes to floating point and back to bytes anyway. That's just introducing a lot of new ways to get things wrong.
On 2/22/11 at 3:39 PM, brendan at mozilla.com (Brendan Eich) wrote:
The "API issue" as Erik put it is this: do we need an array of bytes/shorts/ints, potentially a lot of random values; or would the fractional bits of a single IEEE 64-bit double precision result be "good enough".
When doing crypto, mostly what you want is bit banging. For example, if you are implementing the RC4/ARC4 key schedule (from en.wikipedia.org/wiki/Rc4), you are coding:
for i from 0 to 255
    S[i] := i
endfor
j := 0
for i from 0 to 255
    j := (j + S[i] + key[i mod keylength]) mod 256
    swap values of S[i] and S[j]
endfor
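(Transcribed into JavaScript, the same key schedule shows the kind of byte-level bit banging meant here; key is assumed to be an array of byte values.)

function rc4KeySchedule(key) {
    var S = [], i, j = 0, tmp;
    for (i = 0; i < 256; i++)
        S[i] = i;
    for (i = 0; i < 256; i++) {
        j = (j + S[i] + key[i % key.length]) % 256;
        tmp = S[i]; S[i] = S[j]; S[j] = tmp; // swap S[i] and S[j]
    }
    return S;
}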
The AES process is similar (from: en.wikipedia.org/wiki/Advanced_Encryption_Standard):
High-level description of the algorithm:
1. KeyExpansion: round keys are derived from the cipher key using Rijndael's key schedule.
2. Initial Round:
   1. AddRoundKey: each byte of the state is combined with the round key using bitwise xor.
3. Rounds:
   1. SubBytes: a non-linear substitution step where each byte is replaced with another according to a lookup table.
   2. ShiftRows: a transposition step where each row of the state is shifted cyclically a certain number of steps.
   3. MixColumns: a mixing operation which operates on the columns of the state, combining the four bytes in each column.
   4. AddRoundKey
4. Final Round (no MixColumns):
   1. SubBytes
   2. ShiftRows
   3. AddRoundKey
I always think of these operations as shifting and masking, but I'm really an Assembler/OS guy who likes being close to the hardware. :-)
Other uses will be packaging it into network messages, as in SSL/TLS key generation.
Are there other, non-crypto uses for secure random numbers?
The question for the Javascript experts is, what form will make implementing this kind of code easiest?
Cheers - Bill
Bill Frantz | The first thing you need when | Periwinkle (408)356-8506 | using a perimeter defense is a | 16345 Englewood Ave www.pwpconsult.com | perimeter. | Los Gatos, CA 95032
On Feb 22, 2011, at 3:45 PM, Erik Corry wrote:
Thanks for the link. Having read the section in question I am satisfied that the author has no problem with the API.
In theory, sure. Bits are bits.
The practical issue is usability, where less usable interfaces tend to breed more bugs, as I argued was a hazard of the proposal to return a plain old Array containing uint16 values as elements. Glenn Maynard's point that there is more to go wrong with IEEE doubles seems to be validated by the IE9 preview release Math.random bugs that Amit Klein found. From the crypto-hacker point of view, anything that makes it harder than necessary to get random uint{8,16,32} values seems that much less good.
If we have only the number type for the result, then Math.random is the API template to match. Given typed arrays / binary data, Adam's API looks more usable, even counting the cost of differing from Math.random in its API signature.