Identifying pure (or "pure within a scope") JavaScript functions?
How would "purity" be determined? Would it depend on whether the function is a closure over any outer scope? Would it depend on whether the function mutates any objects passed to it? What counts as mutation?
It seems like it might be better, instead of having isPure, being more barebones. There are a lot of attributes a function can have. isClosure might be useful, etc.
So… is the following function pure?
let env = 1; function tryMe() { if (false) { env = 2; } }
Only an extremely small subset of functions can be proven to be pure. And I suppose that these functions are already optimized by engines.
e.g.
notPure = (a, b) => a + b; // implicit conversion with side effects can happen
notPure = (a) => a && a.b; // getter can be called
notPure = (foo, bar) => Reflect.has(foo, bar); // proxy trap can be called. Someone could overwrite Reflect.has
etc.
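To make those three cases concrete, here is a small runnable sketch. The names (log, add, readB, has, sneaky, withGetter, proxied) are made up for illustration; the point is only that each innocent-looking function ends up running code supplied by the caller.

```javascript
// Each function below looks pure, yet calling it can run caller-supplied code.
const log = [];

const add = (a, b) => a + b;
const readB = (a) => a && a.b;
const has = (foo, bar) => Reflect.has(foo, bar);

// 1. Implicit conversion: `+` calls valueOf() on an object operand.
const sneaky = { valueOf() { log.push("valueOf"); return 1; } };
add(sneaky, 2);

// 2. Property access: reading `a.b` runs the getter.
const withGetter = { get b() { log.push("getter"); return 42; } };
readB(withGetter);

// 3. Proxy trap: Reflect.has invokes the `has` trap.
const proxied = new Proxy({}, {
  has(target, key) { log.push("has trap"); return key in target; },
});
has(proxied, "x");

console.log(log); // [ 'valueOf', 'getter', 'has trap' ]
```

None of the three functions mentions log, yet all three end up writing to it through their arguments.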
It would be a better idea to have a built-in decorator @pure that allows the engine to remove or reorganize function calls (and a valid implementation can treat it as a no-op).
Relevant discussions:
groups.google.com/d/msg/mozilla.dev.tech.js-engine/aSKg4LujHuM/2Y9ORBwCIQAJ
and:
OK, OK, clearly, pure functions are a fantasy in JavaScript. Once again, a great idea and a lengthy essay of a post falls to a real-world situation. Nuts.
What if there's a middle ground? In the case I was originally talking about, the developer calling on my membrane controls how the callback function executes. I just want to ensure that, when the callback is passed controlled and trusted arguments (including possibly "this"), the function doesn't refer to anything outside those arguments and local variables derived from them. Is that reasonable to ask for?
Call this check "is relatively pure", or "is locally pure", if you desire. If the inputs themselves manipulate globals, "absolutely pure" can never be guaranteed, as was just demonstrated...
Seems like you're asking for a check for whether the function is a closure, i.e. whether it maintains references to variables outside the scope of the function.
So would an isClosure function work for at least a little of what you want?
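As a rough illustration of what such a check would and would not tell you, here is a sketch with made-up names (seen, leaky, viaGlobal, LIMIT, harmless). It only shows that "references an outer variable" and "lets an argument escape" are not quite the same question.

```javascript
const seen = []; // outer variable

// A closure that lets its argument escape into outer state.
const leaky = (node) => { seen.push(node); return true; };

// References no variable from an enclosing block or function scope,
// but still stores the argument outside itself through a global.
const viaGlobal = (node) => { globalThis.lastNode = node; return true; };

// A closure too, yet it only reads an outer constant and keeps nothing.
const LIMIT = 10;
const harmless = (node) => node.size < LIMIT;

const node = { size: 3 };
leaky(node);
viaGlobal(node);
console.log(seen.length, globalThis.lastNode === node, harmless(node)); // 1 true true
```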
On Thu, Dec 7, 2017 at 1:11 PM, Alex Vincent <ajvincent at gmail.com> wrote:
OK, OK, clearly, pure functions are a fantasy in JavaScript. Once again, a great idea and a lengthy essay of a post falls to a real-world situation. Nuts.
What if there's a middle ground? In the case I was originally talking about, the developer calling on my membrane controls how the callback function executes. I just want to ensure that, when the callback is passed controlled and trusted arguments (including possibly "this"), the function doesn't refer to anything outside those arguments and local variables derived from them. Is that reasonable to ask for?
Do you want the same kind of thing that Joe-E provides for Java? www.cs.berkeley.edu/~daw/papers/pure-ccs08.pdf
Proving that particular methods within a code base are functionally pure—deterministic and side-effect free—would aid verification of security properties including function invertibility, reproducibility of computation, and safety of untrusted code execution. Until now it has not been possible to automatically prove a method is functionally pure within a high-level imperative language in wide use, such as Java.
Nothing that is well established really fits for JS, but there are some proposals and experimental features.
The frozen realms proposal tc39/proposal-frozen-realms might also be of interest.
Though initially separate, realms can be brought into intimate contact with each other via host-provided APIs.
In a browser setting, WebAssembly webassembly.org/docs/js/#webassemblyinstance-constructor might be your best bet. WebAssembly.instantiate(module, importsObj) runs a sandboxed binary that has access to the functions and values in importsObj, and takes care to prevent following backwards edges in the object graph (__proto__, .constructor and the like).
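For what it's worth, here is a minimal sketch of the imports-object idea. It uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors rather than WebAssembly.instantiate so that it runs top to bottom without promises. The byte array is a tiny hand-assembled module, and the names env, log, run, received and wasmModule are all just choices made for this example.

```javascript
// Hand-assembled module, equivalent to this text format:
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "run") i32.const 42 call $log))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import module "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   field "log", func of type 0
  0x03, 0x02, 0x01, 0x01,                                     // one defined function, type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // export "run" = function 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // body: i32.const 42; call 0; end
]);

// The only outside function the module can call is the one handed to it here.
const received = [];
const importsObj = { env: { log: (n) => received.push(n) } };

const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule, importsObj);
instance.exports.run();
console.log(received); // [ 42 ]
```

The module only ever sees numbers and the imported function, so it has no object references to walk backwards from.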
This led down a long, rambling path [1] where I then realized: if the callback function is a pure function, then for the purposes of that callback, the arguments probably do not need to be wrapped in proxies at all. The big catch is that the callback can't store any of the arguments in variables external to the callback (a classic side effect). If the function really is pure, though, I can avoid the memory leak.
-1 users generally should not be encouraged to micro-manage/optimize the gc in javascript (javascript is not c). that is the responsibility of engine-implementers. making users think they can outsmart the gc (which they don't) will only encourage them to waste time writing “high-performance” code that is bike-shedding at best and tech-debt at worst (and my suspicion is membrane/proxy will suffer either of those fates with this feature).
TLDR: -1 is much more useful in mathematics than in expressing an opinion.
Mr. Zhu,
In this case I am the implementer, and I generally know what I'm doing after well over fifteen years of working with the language, including the membrane library we're talking about, and including writing an 1100+ page book on the language. (The latter taught me a lot, by the way. You should try it sometime. It's a very humbling experience, both at the time for what you thought you knew that's actually quite wrong, and much later in life, when you realize what you still got wrong. I only hope Martin Honnen is still around for me to reiterate my thanks to...)
I shouldn't be worried about the garbage collector, and I'm not.
But I do know how my membrane can (a) create and (b) hold onto references. This wasn't about the garbage collector at all. This was about avoiding creating the proxy and thus, the reference, in the first place. I am dealing with the concept of "when is wrapping something in the membrane counter-productive in some way".
I said "essentially a memory leak", not meaning to be taken entirely literally. It is a memory leak in the sense that unless somebody visits that proxy again -- which no one will ever know if it's necessary or not -- the reference to the proxy will remain alive. For that proxy to exist because of a simple filter function is possibly less than ideal, to put it politely. Especially when said filter function can be called many times for many different objects.
So yes, maybe it's premature optimization, and I'm always willing to admit I'm wrong... when it's clear that I am. In my case, I'm not looking at micromanagement of the gc, but to avoid certain work in the first place.
Now, parsing a function's source code to find out if that function will, say, set a global variable that it shouldn't, is probably pretty expensive as well... except that the JavaScript engine has already done that parsing and might have retained some useful metadata that the specification doesn't expose. Which is why I'm not particularly interested in adding esprima and a bunch of code I don't understand for a small bit of metadata. It's why I came to this mailing list today, to ask the experts, "Hey, is this a good idea to look into? I actually have a use-case, and it's not something that can be easily done in raw JS." That's my understanding of the general requirements for adding new API.
I already explored at least two other ideas that turned into dead ends, and it's quite possible (probable, given the feedback) that this one will be a dead end too. I'm fine with that.
So sure, asking end users to micromanage or optimize their code is irresponsible. But I'm writing a library for developers to use, and if I can provide advice on the best way to use that library, I think it's worth the time to think about how that library works, and how to be nice to the machines that it will run on. So I'd better think a lot about optimizing *my* code, and about helping my users get the best reactions from it.
apologies. reading the github issue more closely, i'm not an expert on this matter.
TLDR: I'm asking for the group to consider adding a Function.isPure(func) or isPureWithinScope() to the ECMAScript specification, to help code detect pure functions.
This morning, I had a thought experiment about membranes, and in particular, callback functions passed into them. If the callback function is called repeatedly for objects that will not be needed again, that means one extra proxy for each object, which the membrane must remember. That's effectively a memory leak.
This led down a long, rambling path [1] where I then realized: if the callback function is a pure function, then for the purposes of that callback, the arguments probably do not need to be wrapped in proxies at all. The big catch is that the callback can't store any of the arguments in variables external to the callback (a classic side effect). If the function really is pure, though, I can avoid the memory leak.
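A toy sketch of the situation, to make the proxy bookkeeping visible. This is not the es7-membrane API: ToyMembrane, wrap, forEach and the trustedPure flag are all hypothetical, and a plain Map is used (instead of whatever the real library does) purely so the entries can be counted.

```javascript
// Toy stand-in for a membrane, just enough to count the proxies it remembers.
class ToyMembrane {
  constructor() {
    this.proxies = new Map(); // original object -> proxy
  }
  wrap(obj) {
    if (!this.proxies.has(obj)) {
      this.proxies.set(obj, new Proxy(obj, {}));
    }
    return this.proxies.get(obj);
  }
  // Calls `callback` once per item. `trustedPure` is a flag the developer
  // sets by hand, since there is no Function.isPure to ask.
  forEach(items, callback, trustedPure = false) {
    for (const item of items) {
      callback(trustedPure ? item : this.wrap(item));
    }
  }
}

const items = Array.from({ length: 1000 }, (_, i) => ({ id: i }));
const isEven = (item) => item.id % 2 === 0; // keeps no reference to `item`

const wrapped = new ToyMembrane();
wrapped.forEach(items, isEven);
console.log(wrapped.proxies.size); // 1000

const unwrapped = new ToyMembrane();
unwrapped.forEach(items, isEven, true);
console.log(unwrapped.proxies.size); // 0
```

In the first run the membrane ends up remembering one proxy per object even though the callback never kept any of them; in the second it remembers nothing.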
But how can I tell from JavaScript, easily, whether a function is pure or not?
For reference, a quick search on es-discuss found [2], which is probably worth re-reading.
Determining whether a function is pure or not (or "only uses these objects") is probably best left up to JavaScript parsers. Although it could be done using esprima, I suppose, that's a lot to ask of a membrane.
So the first question I have is, would this group actively consider adding a built-in Function.isPure(func) method to the ECMAScript language? (Or Function.prototype.isPure(), but I prefer the former.)
Further, in the case where a function isn't pure but its side effects are confined to local variables that won't live longer than the callback, that might also be safe for not wrapping in the membrane. I'm thinking of a DOM NodeFilter object where the acceptNode method modifies the filter, but the filter is defined via a let statement in a small scope statement-block. Since I don't know what to call this kind of function, I'll temporarily call it "pure within a scope" until someone corrects me. What I mean by "pure within a scope" is that the function's only side effects involve objects within that set.
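Roughly the shape I have in mind, sketched with plain objects standing in for DOM nodes so it runs without a document. countElements and the accepted counter are invented for this example; the numeric constants match Node.ELEMENT_NODE, NodeFilter.FILTER_ACCEPT and NodeFilter.FILTER_SKIP.

```javascript
function countElements(nodes) {
  let total;
  {
    // `filter` lives only inside this block.
    let filter = {
      accepted: 0,
      acceptNode(node) {
        if (node.nodeType === 1) { // 1 === Node.ELEMENT_NODE
          this.accepted++;         // side effect, but only on `filter` itself
          return 1;                // NodeFilter.FILTER_ACCEPT
        }
        return 3;                  // NodeFilter.FILTER_SKIP
      },
    };
    for (const node of nodes) filter.acceptNode(node);
    total = filter.accepted;
  }
  return total;
}

console.log(countElements([{ nodeType: 1 }, { nodeType: 3 }, { nodeType: 1 }])); // 2
```

acceptNode is not pure, since it mutates filter, but the only object it changes never outlives the block.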
The second question then is, would this group actively consider a built-in function which takes a target function to test as its first argument and an array of objects that are considered safe for that target function to work with, and returns true if the target function has no side effects besides directly impacting the objects in the array? (No, I don't know what to call the built-in function. Maybe Function.isPureWithinScope(func, safeObjectsArray).)
Alex Vincent
Hayward, CA, U.S.A.
[1] ajvincent/es7-membrane#141
[2] esdiscuss.org/topic/pure