Mark S. Miller (2013-11-11T18:12:50.000Z)
On Mon, Nov 11, 2013 at 9:25 AM, Jason Orendorff <jason.orendorff at gmail.com> wrote:

> JS users do not want RPC systems where one process's memory usage
> depends on getting per-object callbacks from an untrusted peer's GC
> implementation.

Some will. I do. See <http://research.google.com/pubs/pub40673.html>.

Why do you believe manual deallocation decisions will be easier in distributed systems than they are locally? If anything, local manual deallocation should be easier, and even those decisions have proven hard enough that people (except C++ programmers) have turned to local GC.

You are correct that a distributed mutually suspicious system must support manual deallocation as well. Your Erlang example is quite telling: Erlang does have strong cross-process references, the process ids. However, because they are forgeable, processes cannot be garbage collected. The decision to terminate a process is the decision to preemptively terminate service to clients that may still exist. Sometimes this needs to be done, even with GC, because a client causes the service to retain more memory than the service wishes to continue to devote to that client. As the Erlang example also indicates, the natural unit for such preemptive manual deallocation decisions is the vat/worker/process. However, many clients will engage in honest GC to keep their demands on service memory low, and many services will not need to cut such clients off for excessive resource demands.

E/CapTP and Cap'n Proto have an additional form of manual deallocation decision besides vat termination. Between each pair of vats there are both "offline references", which survive partition, and "live references", which last only until partition. Because offline references can be reconnected after a partition, they are not subject to GC. Instead, E provides three hooks for manually deallocating them. From Chapter 17.4 "Persistence" of http://erights.org/talks/thesis/markm-thesis.pdf:

> The operations for making an offline capability provide three options for ending this obligation: It can expire at a chosen future date, giving the association a time-to-live. It can expire when explicitly cancelled, making the association revocable. And it can expire when the hosting vat incarnation crashes, making the association transient. An association which is not transient is durable.

Since vats must be prepared for inter-vat partition, a vat can preemptively induce a partition with a counterparty vat in order to sever the live references between them, forcing reconnection to rely on the offline references, which are subject to the above manual policies. In E/CapTP and Cap'n Proto, the distributed GC governs only these transient live refs, which substantially reduces the pressure to preemptively sever these connections. (The NodeKen system on which Dr. SES will be built does not make this split, forcing it to rely on manual deallocation of vats rather than connections in order to manually reclaim memory. There is a place in the world for each failure model. Here I am arguing only for the CapTP/Cap'n Proto failure model.)

Ultimately, the only principled solution for distributed storage management among mutually suspicious machines is some form of quid pro quo, such as the [market-sweep algorithms](http://e-drexler.com/d/09/00/AgoricsPapers/agoricpapers/ie/ie3.html). But even after 25 years, these still seem premature. Distributed GC + preemptive deallocation for extreme conditions, either of vats or connections, is a great compromise in the meantime.
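
To make the live-reference half of the argument concrete, here is a minimal sketch (in TypeScript, using today's FinalizationRegistry as a stand-in for the post-mortem finalization under discussion) of the per-object callback a client vat might send when one of its local proxies is collected. The names RemoteConnection, ImportTable, and the "drop" message are invented for illustration; they are not CapTP's or Cap'n Proto's actual wire protocol.

```ts
// Sketch only: a client vat participating in distributed GC of *live* references.
type ExportId = number;

interface RemoteConnection {
  // Ask the hosting vat to retire an entry in its export table.
  send(message: { type: "drop"; exportId: ExportId }): void;
}

class ImportTable {
  // When a local proxy becomes unreachable, tell the hosting vat so it can
  // release the corresponding export. The hosting vat's memory thus depends on
  // the client honestly reporting its collections -- the per-object callback
  // Jason objects to, and also exactly what an honest client does to keep its
  // demands on service memory low.
  private registry = new FinalizationRegistry<ExportId>((exportId) => {
    this.conn.send({ type: "drop", exportId });
  });

  constructor(private conn: RemoteConnection) {}

  registerProxy(proxy: object, exportId: ExportId): void {
    this.registry.register(proxy, exportId);
  }
}
```

The point of the compromise above is that a hosting vat is never hostage to a client that withholds these drops: it can still preemptively sever the connection (or terminate the vat), reclaiming every live reference that client held.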
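And here is a hedged sketch of the three expiration hooks for offline references quoted from the thesis above (time-to-live, revocation, transience). OfflineTable, makeOfflineRef, and Expiry are hypothetical names chosen for illustration; the real E operations differ, but the policy shape is the same.

```ts
// Sketch only: manual-deallocation policies for *offline* references,
// which survive partition and therefore cannot be governed by GC.
interface Expiry {
  expiresAt?: number;   // time-to-live: drop the association after this epoch-ms date
  transient?: boolean;  // drop it when this vat incarnation crashes; otherwise durable
}

class OfflineTable {
  private table = new Map<string, { target: object; expiry: Expiry }>();

  // Returns an unguessable identifier that survives partition, plus a revoker
  // (explicit cancellation). Assumes a runtime providing crypto.randomUUID().
  makeOfflineRef(target: object, expiry: Expiry): { swissNum: string; revoke: () => void } {
    const swissNum = crypto.randomUUID();
    this.table.set(swissNum, { target, expiry });
    return { swissNum, revoke: () => this.table.delete(swissNum) };
  }

  // Called on lookup (and periodically): enforce time-to-live.
  lookup(swissNum: string, now: number = Date.now()): object | undefined {
    const entry = this.table.get(swissNum);
    if (!entry) return undefined;
    if (entry.expiry.expiresAt !== undefined && now >= entry.expiry.expiresAt) {
      this.table.delete(swissNum);
      return undefined;
    }
    return entry.target;
  }

  // Called when a vat incarnation restarts after a crash: transient
  // associations do not survive; durable ones do.
  pruneTransient(): void {
    for (const [key, entry] of this.table) {
      if (entry.expiry.transient) this.table.delete(key);
    }
  }
}
```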