Explicit Memory Management

# David Semeria (16 years ago)

Hi, first post, so please excuse any clumsiness.

As a developer of large-scale in-browser web apps, I frequently face memory management issues. These issues apply both to managing DOM elements (which I appreciate is beyond the scope of ECMA) and to managing objects within the script context.

I also appreciate that memory management is implementation specific, but I would also assume that any direct calls to the memory management system would be part of the language syntax.

I believe the problems I am faced with are common to many other developers, and, as web applications become more complex, they will become increasingly prevalent.

Basically I would like to suggest three methods: obj.kill(), obj.swapOut() and obj.swapIn().

These are pretty self-explanatory.

obj.kill() would erase the object and immediately release any memory used by the object

obj.swapOut() would swap out and release any memory used by the obj

obj.swapIn() would bring the object back.

References to swapped-out objects would automatically trigger their swapIn() method.

The necessity for these last two methods arises when the application has downloaded a large quantity of information which it does not currently need – but may well do in the future. The built-in garbage collector has no way of knowing the space occupied by these objects could be temporarily freed-up.

It makes more sense to 'store' this information locally as a memory map than either using some local storage service (database, gears etc) if available or, as developers are currently forced to do, de-reference the object and re-request it again later (incurring both a latency and bandwidth cost).

The requirement for obj.kill() arises when the developer/application knows for sure that a given object is no longer required. It's true that if the object has been completely de-referenced it will be picked-up (eventually) by the garbage collector, but it would be nice to be able to force this behavior.

I hope the above makes some sense, and, apart from re-emphasizing the practical utility of the above suggestions, I think I'll leave it there.

Thanks,

David Semeria

# Ash Berlin (16 years ago)

On 22 May 2009, at 16:21, David Semeria wrote:

Hi, first post, so please excuse any clumsiness.

As a developer of large-scale in-browser web apps, I frequently face memory management issues. These issues apply both to managing DOM elements (which I appreciate is beyond the scope of ECMA) and to managing objects within the script context.

I also appreciate that memory management is implementation specific,
but I would also assume that any direct calls to the memory management system would be part of the language syntax.

I believe the problems I am faced with are common to many other developers, and, as web applications become more complex, they will become increasingly prevalent.

Basically I would like to suggest three methods: obj.kill(), obj.swapOut() and obj.swapIn().

These are pretty self-explanatory.

obj.kill() would erase the object and immediately release any memory used by the object

obj.swapOut() would swap out and release any memory used by the obj

obj.swapIn() would bring the object back.

References to swapped-out objects would automatically trigger their swapIn() method.

The necessity for these last two methods arises when the application
has downloaded a large quantity of information which it does not currently need – but may well do in the future. The built-in garbage collector
has no way of knowing the space occupied by these objects could be temporarily freed-up.

It makes more sense to 'store' this information locally as a memory
map than either using some local storage service (database, gears etc) if available or, as developers are currently forced to do, de-reference
the object and re-request it again later (incurring both a latency and bandwidth cost).

The requirement for obj.kill() arises when the developer/application knows for sure that a given object is no longer required. It's true
that if the object has been completely de-referenced it will be picked-up (eventually) by the garbage collector, but it would be nice to be able to force this behavior.

I hope the above makes some sense, and, apart from re-emphasizing the practical utility of the above suggestions, I think I'll leave it
there.

Thanks,

David Semeria

I appreciate the need, but the question arises:

swapOut to where? To some space that will still use memory, so doesn't
actually help....?

Also obj.kill() would be almost impossible to actually implement since
what would happen if you happen to kill an object that still has
references somewhere?

What you actually want is just a gc() method available somewhere that
will do a normal GC run, just on demand, isn't it? This is certainly
more likely to happen.

I guess obj.kill() could just see if the object is reachable and if it
still is then do nothing, or throw, but still.

# Brendan Eich (16 years ago)

As Ash points out, there's no memory-safe way to implement obj.kill()
short of a full GC.

There is no way JS will lose memory safety, ever!

What's more, we shouldn't expose a gc() call. It would only be abused
over time. Even if it were used well at first for a given browser
engine (or version), the tendency over time would be for it to be
called too much, and for too many (or all) user agents. It would be
symptom-treating at best.

Explicit memory management is sometimes important, e.g. using a
preallocated array or native buffer when pushing pixels. This is not a
case where gc() or obj.kill() helps, though.
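As a rough illustration of the preallocation idea, the sketch below allocates one pixel buffer up front and overwrites it in place for every frame, so the per-frame path creates no garbage. It assumes typed arrays (`Uint8ClampedArray`), which are a later addition to the language than this thread; the frame size and the `fillFrame` helper are hypothetical.

```javascript
// Hypothetical frame size; any fixed dimensions would do.
var WIDTH = 4;
var HEIGHT = 2;

// Allocate once, outside the per-frame path: 4 bytes (RGBA) per pixel.
var pixels = new Uint8ClampedArray(WIDTH * HEIGHT * 4);

// Overwrite the shared buffer in place with one solid colour.
// Nothing is allocated per call, so there is nothing new for the GC to trace.
function fillFrame(buffer, r, g, b) {
  for (var i = 0; i < buffer.length; i += 4) {
    buffer[i] = r;
    buffer[i + 1] = g;
    buffer[i + 2] = b;
    buffer[i + 3] = 255; // fully opaque
  }
  return buffer;
}

var first = fillFrame(pixels, 255, 0, 0);  // red frame
var second = fillFrame(pixels, 0, 0, 255); // blue frame, same memory
console.log(first === second); // true
```

The point is only that buffer reuse is something the program does itself; neither gc() nor obj.kill() is involved.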

If you find poor GC performance, file a bug with the browser vendor or
open source project. If you are using too much memory, reifying large
external data sets in JS, don't do that. There's usually a way to
avoid taking the all-in-memory hit. SAX-style parsing rather than all-or-nothing blocking parse calls, etc.
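One way to picture the SAX-style approach is an incremental parser that hands each record to a callback as soon as it is complete and keeps only the unfinished tail of the input. This is a minimal sketch under assumptions: the newline-delimited record format and the `createLineParser` name are invented for illustration and are not from any library.

```javascript
// Hypothetical incremental parser for newline-delimited records.
function createLineParser(onRecord) {
  var pending = ""; // only the unfinished tail of the input is retained
  return {
    // Feed the next chunk of input as it arrives.
    push: function (chunk) {
      var lines = (pending + chunk).split("\n");
      pending = lines.pop(); // the last piece may be an incomplete record
      for (var i = 0; i < lines.length; i++) {
        if (lines[i] !== "") onRecord(lines[i]);
      }
    },
    // Flush whatever is left once the input is exhausted.
    end: function () {
      if (pending !== "") onRecord(pending);
      pending = "";
    }
  };
}

// Usage: aggregate over the records without ever storing all of them.
var stats = { count: 0, longest: 0 };
var parser = createLineParser(function (record) {
  stats.count += 1;
  if (record.length > stats.longest) stats.longest = record.length;
});

parser.push("alpha\nbe");   // "be" is held back as an incomplete record
parser.push("ta\ngamma");
parser.end();
console.log(stats.count, stats.longest); // 3 5
```

Each record becomes garbage as soon as the callback returns, so peak memory depends on the largest record rather than on the whole data set.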

I hate to generalize, but on the other hand the post asking for
explicit memory management was short on specific details -- including
ones that might lead to specific browser bugs that could be fixed to
the benefit of all developers.

# David Semeria (16 years ago)

On Fri, 2009-05-22 at 18:34 +0100, Ash Berlin wrote:

I appreciate the need, but the question arises:

swapOut to where? To some space that will still use memory, so doesn't
actually help....?

Swap out to disk - sorry I thought that was implicit

Also obj.kill() would be almost [im]possible to actually implement since
what would happen if you happen to kill an object that still has
references somewhere?

I would say the killed object would just equate to 'null'. If a programmer kills an object and then tries to reference it, then that's his/her problem.

What you actually want is just a gc() method available somewhere that
will do a normal GC run, just on demand isn't it? This is certainly
more likely to happen.

Actually, no.

I would assume the work involved in going through every DOM and script object and working out whether they are islands is non-trivial, whereas de-allocating the actual memory for the resulting object list would be trivial.

obj.kill() would actually do the GC a favour, by explicitly referencing an object which could be deleted. I don't see a huge issue with 'dangling' references to deleted objects.

In fact, these dangling references are frequently hard to spot, and can lead to major leakage over time.

I know exactly which objects I don't need any more, and I would rather be able to explicitly kill them than make a generic (and expensive) GC call which may not even guarantee those items will be zapped.

# Brendan Eich (16 years ago)

On May 22, 2009, at 1:05 PM, David Semeria wrote:

Swap out to disk - sorry I thought that was implicit

There's no disk on some devices. Anyway, that's up to the OS.

Advisory calls could help but we need some real examples, or better:
evidence from real apps, before inventing advice APIs here.

Also obj.kill() would be almost [im]possible to actually implement
since what would happen if you happen to kill an object that still has references somewhere?

I would say the killed object would just equate to 'null'

Then set obj = null.

If a programmer kills an object and then tries to reference it, then that's his/her problem

No. You're ignoring the possibility of other references:

obj = frobj = gobj = new Object; ...; obj.kill()

Now what happens when frobj or gobj is used?
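The aliasing point can be seen with plain assignment. In this small sketch (the `payload` property is made up for illustration), nulling one variable drops only that reference; the object itself stays alive and usable through the other two.

```javascript
var obj = { payload: "large data" };
var frobj = obj; // second reference to the same object
var gobj = obj;  // third reference to the same object

obj = null; // drops only this one reference

console.log(obj === null);   // true
console.log(frobj === gobj); // true: both still name the one object
console.log(frobj.payload);  // "large data"
```

A hypothetical obj.kill() would have to decide what frobj and gobj see afterwards, which is exactly the question that assignment to null never has to answer.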

I would assume the work involved in going through every DOM and script object and working out whether they are islands is non-trivial,
whereas de-allocating the actual memory for the resulting object list would
be trivial.

See above -- you can't assume there are no other references to that
memory. Each variable or property that can be assigned anywhere that
could connect to obj (the object, not the reference) could save a ref.

# David Semeria (16 years ago)

On Fri, 2009-05-22 at 11:46 -0700, Brendan Eich wrote:

As Ash points out, there's no memory-safe way to implement obj.kill()
short of a full GC.

There is no way JS will lose memory safety, ever!

Why can't the deleted object still exist, with its value set to 'null'?

What's more, we shouldn't expose a gc() call. It would only be abused
over time. Even if it were used well at first for a given browser
engine (or version), the tendency over time would be for it to be
called too much, and for too many (or all) user agents. It would be
symptom-treating at best.

I never suggested the GC be exposed

Explicit memory management is sometimes important, e.g. using a
preallocated array or native buffer when pushing pixels. This is not a
case where gc() or obj.kill() helps, though.

If you find poor GC performance, file a bug with the browser vendor or
open source project.

I'm not suggesting GC is not well-implemented. I'm suggesting there could be a case for the programmer giving the GC a hand.

If you are using too much memory, reifying large
external data sets in JS, don't do that. There's usually a way to
avoid taking the all-in-memory hit. SAX-style parsing rather than all-or-nothing blocking parse calls, etc.

This goes back to my initial point regarding complex web apps. There are many instances where it makes sense to store content in js objects and then flow it into the DOM using a given template. If the user changes the template then the information is simply reflowed.

I hate to generalize, but on the other hand the post asking for
explicit memory management was short on specific details -- including
ones that might lead to specific browser bugs that could be fixed to
the benefit of all developers.

Again, I'm not implying the presence of bugs. I'm suggesting more complex use cases can lead to situations in which the existing GC scheme may not be optimal.

I intentionally didn't go into specifics, but I can provide some real world examples if you think they would be helpful.

# Brendan Eich (16 years ago)

On May 22, 2009, at 1:27 PM, David Semeria wrote:

On Fri, 2009-05-22 at 11:46 -0700, Brendan Eich wrote:

As Ash points out, there's no memory-safe way to implement obj.kill() short of a full GC.

There is no way JS will lose memory safety, ever!

Why can't the deleted object still exist, with its value set to
'null'?

First, null is not a "value" in the sense of object value. An object
takes up arbitrary space associating property names with values.
You're possibly confusing reference and referent again.

But say you safely "nulled" all the memory associated with the object,
making it equivalent to ({}) but preserving its identity. Then you
haven't necessarily saved much memory (it could be small with only
primitive values) and you still will booby-trap other code that saved
aliasing references.

If you want a kill that works on user-defined objects, you can write it
now:

function kill(obj) { for (var i in obj) delete obj[i]; }

But you'd do better to set obj = null to kill the reference, instead
of wasting VM cycles deleting enumerable properties but leaving the
empty object alive.
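To make the trade-off concrete, here is the same kill function run against an object that has an alias. The `data` and `alias` names are illustrative only. The properties disappear for every holder of the reference, while the (now empty) object itself remains allocated.

```javascript
// Same helper as in the message above: delete every enumerable property.
function kill(obj) {
  for (var i in obj) delete obj[i];
}

var data = { a: 1, b: 2 };
var alias = data; // some other code saved a reference

kill(data);

var remaining = Object.keys(alias).length; // properties visible via the alias
var sameObject = alias === data;           // identity is preserved
console.log(remaining, sameObject); // 0 true
```

The alias now points at an empty object, which is the booby trap described above; setting the variable to null instead would have left the alias untouched.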

If you are using too much memory, reifying large external data sets in JS, don't do that. There's usually a way to avoid taking the all-in-memory hit. SAX-style parsing rather than
all-or-nothing blocking parse calls, etc.

This goes back to my initial point regarding complex web apps. There
are many instances where it makes sense to store content in js objects and then flow it into the DOM using a given template. If the user changes the template then the information is simply reflowed.

What's the problem? GC runs sooner or later. If you see a GC bug,
please report it.

Yes, GC uses more memory on average than if you gave explicit advice
that was reliable. But the advice can't be trusted without walking the
object graph.

I intentionally didn't go into specifics, but I can provide some real world examples if you think they would be helpful.

Sure.

# Michael Haufe (16 years ago)

David Semeria Wrote:

Why can't the deleted object still exist, with its value set to 'null'?

If you have some large data structure in memory that you need to remove temporarily, why not serialize it and stick it in client storage? I believe every major browser back to IE 5.5 supports this natively in some form.
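A minimal sketch of that serialize-and-store idea follows. It assumes a storage object with `setItem`, `getItem` and `removeItem` methods, and uses an in-memory stand-in so the example is self-contained; `createMemoryStorage`, `swapOut` and `swapIn` are hypothetical names, and JSON serialization only covers plain data (no functions, no cyclic structures).

```javascript
// In-memory stand-in for a browser storage object (an assumption, for illustration).
function createMemoryStorage() {
  var items = {};
  return {
    setItem: function (key, value) { items[key] = String(value); },
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null;
    },
    removeItem: function (key) { delete items[key]; }
  };
}

// Serialize the data under a key; the caller then drops its own reference.
function swapOut(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Rebuild a fresh copy from storage; returns null if nothing was stored.
function swapIn(storage, key) {
  var text = storage.getItem(key);
  if (text === null) return null;
  storage.removeItem(key);
  return JSON.parse(text);
}

var storage = createMemoryStorage();
var tweets = { t1: { text: "hello" }, t2: { text: "world" } };

swapOut(storage, "tweets", tweets);
tweets = null; // the in-memory structure is now unreferenced and collectable

var restored = swapIn(storage, "tweets");
console.log(restored.t2.text); // "world"
```

Note that what comes back is a copy: object identity is not preserved, so any other code still holding the old reference keeps the old object alive.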

# David Semeria (16 years ago)

On Fri, 2009-05-22 at 13:37 -0700, Brendan Eich wrote:

On May 22, 2009, at 1:27 PM, David Semeria wrote:

On Fri, 2009-05-22 at 11:46 -0700, Brendan Eich wrote:

As Ash points out, there's no memory-safe way to implement obj.kill() short of a full GC.

There is no way JS will lose memory safety, ever!

Why can't the deleted object still exist, with its value set to
'null'?

First, null is not a "value" in the sense of object value. An object
takes up arbitrary space associating property names with values.
You're possibly confusing reference and referent again.

That's very likely - I have only a very rudimentary idea of how the language is implemented in the browser.

Yes, GC uses more memory on average than if you gave explicit advice
that was reliable. But the advice can't be trusted without walking the
object graph.

Ok. I assume object references are implemented bi-directionally, otherwise the GC would take a lifetime to run. If that's the case, couldn't obj.kill() check whether the object is killable, and throw if it isn't? That would be a massive debugging aid.

I intentionally didn't go into specifics, but I can provide some real world examples if you think they would be helpful.

Sure.

I'll do that then. It'll take a while to prepare.

# Mike Shaver (16 years ago)

On Fri, May 22, 2009 at 4:53 PM, David Semeria <david at lmframework.com> wrote:

Ok. I assume object references are implemented bi-directionally, otherwise the GC would take a lifetime to run.

I don't know of any that are implemented bidirectionally, since it would be a waste of space; it's certainly not required for any fast GC I know of.

Mike
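For readers wondering how a collector manages without back-references, the toy model below may help. It is a sketch of the mark phase of a tracing collector over a hand-built graph, not a description of any real engine: `node`, `markReachable` and `isReachable` are invented names, and real heaps are not arrays of `refs`. The traversal only ever follows forward references from the roots.

```javascript
// Toy heap: each node lists the nodes it refers to (forward references only).
function node(name) {
  return { name: name, refs: [] };
}

// Mark phase: collect everything reachable from the roots.
function markReachable(roots) {
  var marked = new Set();
  var stack = roots.slice();
  while (stack.length > 0) {
    var current = stack.pop();
    if (marked.has(current)) continue; // already visited, also handles cycles
    marked.add(current);
    for (var i = 0; i < current.refs.length; i++) {
      stack.push(current.refs[i]);
    }
  }
  return marked;
}

// A checked "kill" could only be answered by running the same traversal.
function isReachable(roots, target) {
  return markReachable(roots).has(target);
}

var root = node("root");
var a = node("a");
var b = node("b");
var island = node("island");
root.refs.push(a);
a.refs.push(b);
b.refs.push(a);      // cycle between a and b
island.refs.push(b); // island points in, but nothing points at island

console.log(isReachable([root], b));      // true
console.log(isReachable([root], island)); // false
```

Anything left unmarked is garbage by definition, so no node needs to know who points at it. The flip side is that asking whether one particular object is still referenced costs a walk of the graph.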

# Brendan Eich (16 years ago)

On May 22, 2009, at 1:53 PM, David Semeria wrote:

First, null is not a "value" in the sense of object value. An object takes up arbitrary space associating property names with values. You're possibly confusing reference and referent again.

That's very likely - I have only a very rudimentary idea of how the language is implemented in the browser.

It's not only an implementation issue. The language has reference
types -- objects -- and mutation. Sometimes you must know about how
more than one reference to an object being modified can affect the
different reference holders, even if you are a language user only and
not a language implementor.

# Brendan Eich (16 years ago)

On May 22, 2009, at 2:06 PM, Brendan Eich wrote:

On May 22, 2009, at 1:53 PM, David Semeria wrote:

First, null is not a "value" in the sense of object value. An object takes up arbitrary space associating property names with values. You're possibly confusing reference and referent again.

That's very likely - I have only a very rudimentary idea of how the language is implemented in the browser.

It's not only an implementation issue. The language has reference
types -- objects -- and mutation. Sometimes you must know about how
more than one reference to an object being modified can affect the
different reference holders, even if you are a language user only
and not a language implementor.

The simplest case:

    var obj = {hi: "there"};
    var frobj = obj;    // did we copy obj?
    frobj.hi = "bye";
    alert(obj.hi);      // no, we copied the reference

Notice that string is apparently a value type. You can't mutate
primitive strings, so implementations can and do use references to
shared immutable strings under the hood (even to shared mutables, with
reference-counting or other tricks to avoid exposing the optimization).
Of course number, boolean, and the hidden types of null
and undefined are value types.

Because object is a reference type, people sometimes want something like

    function copy(obj) {
      var clone = {};
      for (var i in obj) clone[i] = obj[i];
      return clone;
    }

This assumes obj is an Object instance, without getters or setters,
etc. See the Object.extend thread for more:

esdiscuss/2008-July/006709

and jresig's super-duper implementation at

ejohn.org/files/object-extend.js

This could use ES5's Object meta-programming methods instead of
lookupProperty, but then it wouldn't work in extant browsers.
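For comparison, a shallow copy built on the ES5 property-descriptor methods might look like the sketch below. This is not the implementation linked above; `copyWithDescriptors` and the `shout` getter are hypothetical. It uses only `Object.getOwnPropertyNames`, `Object.getOwnPropertyDescriptor`, `Object.defineProperty`, `Object.getPrototypeOf` and `Object.create`.

```javascript
// Shallow copy that preserves getters, setters and property attributes.
function copyWithDescriptors(obj) {
  var clone = Object.create(Object.getPrototypeOf(obj));
  var names = Object.getOwnPropertyNames(obj); // includes non-enumerable own properties
  for (var i = 0; i < names.length; i++) {
    var desc = Object.getOwnPropertyDescriptor(obj, names[i]);
    Object.defineProperty(clone, names[i], desc);
  }
  return clone;
}

var source = {
  hi: "there",
  get shout() { return this.hi.toUpperCase(); } // accessor, not a stored value
};

var copied = copyWithDescriptors(source);
copied.hi = "bye";

console.log(copied.shout); // "BYE": the getter itself was copied
console.log(source.shout); // "THERE": the original is unaffected
```

With the simple for-in copy, `shout` would have been read once and stored as a plain string. The copy is still shallow, so nested objects remain shared between source and clone.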

# David Semeria (16 years ago)

On Fri, 2009-05-22 at 14:13 -0700, Brendan Eich wrote:

Because object is a reference type, people sometimes want something like

function copy(obj) { var clone = {}; for (var i in obj) clone[i] = obj[i]; return clone; }

Yeah, I had a lot of fun implementing a generic multi-type deep clone myself. I think I understand the concepts, I just don't know their proper names (mutable?).

Anyway, here is a real-world example as promised:


First of all, thanks for everyone's tolerance for a chap who's on far from solid ground here.

I really wanted to avoid referencing my own work, but it seems I don't have much choice, so here goes:

The description below refers to a demo app which was built using a framework of my own creation. The app functions as a Twitter client, and captures many of the issues I've been referring to.

The application downloads new tweets at regular intervals and adds them to a reservoir which is implemented as a fixed-length FIFO queue. After every update a variable number of streams are re-populated from the reservoir using user-defined filters. For reasons which would take too long to explain, both the reservoir and streams are implemented as simple objects, which function as hash tables.

Hence:

    +++++++
    +     +  --> [F1] -> [strm 1]
    +     +  --> [F2] -> [strm 2]
    + RES +  --> [F3] -> [strm 3]
    +     +  ...
    +     +  ...
    +     +  --> [Fn] -> [strm n]
    +     +
    +     +
    +++++++

Notes

The streams do not contain copies of the tweets, but references to items in the reservoir.

The size of the reservoir can be varied, but I generally use 5,000 items.

The streams are stored as objects because the user can flow and reflow them using different HTML templates.

All of the structures in the framework use 1:1 references which always point down.

Operation

  1. To make sure the new tweets appear at the top of the global reservoir object, I need to create a new object, add the new tweets in chronological order, then copy over the old contents of the reservoir up to the reservoir's max length. I then assign the new object to the global reservoir object.

  2. To populate the streams, I set each stream object to {} and then cycle through the new reservoir and add any objects (tweets) that pass the relevant filter to the stream object.
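The two steps above can be sketched roughly as follows. This is a guess at the shape of the code, not the framework's actual implementation: `updateReservoir`, `populateStreams`, the tweet fields and the `"t"` key prefix are all assumptions. The prefix is there because plain objects list integer-like keys in ascending numeric order regardless of insertion order, which would defeat the "new tweets at the top" requirement. New tweets are assumed to arrive newest first.

```javascript
// Step 1: build a new reservoir with new tweets first, then old ones, capped at maxLength.
function updateReservoir(oldReservoir, newTweets, maxLength) {
  var next = {};
  var count = 0;
  var i, key;
  for (i = 0; i < newTweets.length && count < maxLength; i++) {
    next["t" + newTweets[i].id] = newTweets[i];
    count++;
  }
  var oldKeys = Object.keys(oldReservoir);
  for (i = 0; i < oldKeys.length && count < maxLength; i++) {
    key = oldKeys[i];
    if (!Object.prototype.hasOwnProperty.call(next, key)) {
      next[key] = oldReservoir[key]; // copies the reference, not the tweet
      count++;
    }
  }
  return next;
}

// Step 2: rebuild every stream from scratch; streams hold references, not copies.
function populateStreams(reservoir, filters) {
  var streams = {};
  var keys = Object.keys(reservoir);
  Object.keys(filters).forEach(function (name) {
    var stream = {};
    keys.forEach(function (key) {
      if (filters[name](reservoir[key])) stream[key] = reservoir[key];
    });
    streams[name] = stream;
  });
  return streams;
}

var reservoir = {
  t2: { id: 2, text: "lunch time" },
  t1: { id: 1, text: "old news" }
};
reservoir = updateReservoir(reservoir, [{ id: 3, text: "more js" }], 2);

var streams = populateStreams(reservoir, {
  js: function (tweet) { return tweet.text.indexOf("js") !== -1; }
});

console.log(Object.keys(reservoir));  // [ 't3', 't2' ]
console.log(Object.keys(streams.js)); // [ 't3' ]
```

In this shape, the tweet that fell off the end (`t1`) and the old container objects become unreachable as soon as the variables are reassigned, provided nothing else (a closure, a DOM node property, an old stream) still refers to them.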

Issues

When left to its own devices, the application (only tested under FF so far) will generally eat up memory, reaching over 1GB if left overnight. Whilst it's possible that there is a mistake in my code, I can't understand where it could possibly be. I'm not creating loads of new object definitions, just overwriting old ones.

But this is my point: before overwriting the old reservoir I could save the GC a lot of trouble and kill it manually. The same goes for each stream before it is overwritten.

Notes

It is also possible that the memory leaks (especially considering the size) are coming from the DOM, since the same principle applies: the nodes containing the old tweets are separated from their parentNode, and the new tweets are flowed under the parentNode. Again, I could tell the GC that those orphaned elements could be safely deleted.

I would just like to emphasize that the issue is not why the system is leaking - it's that if I could kill the objects which I know are no longer needed I could make the GC's job easier.

If anyone is really interested, here is a video of the system in operation: lmframework.com/page.php?id=vd_twig_short_1

The video is potentially relevant because there is the additional issue of windows (some of which have quite large objects associated with them) being opened and hidden. When they are hidden I don't destroy them because it seems unnecessary. This is the context in which I suggested obj.swapOut() and obj.swapIn().

I hope this info now makes the reasons for my suggestions a bit clearer,

David Semeria

# David Semeria (16 years ago)

On Fri, 2009-05-22 at 15:46 -0700, Brendan Eich wrote:

What version of Firefox? Have you tried 3.5 beta 4?

What about other browsers? Same or similar memory growth overnight?

The 1GB figure refers to tests I did a while back using IceWeasel (which is pretty old) under Debian Etch.

If you're interested I could run it overnight on FF 3.0.10 XP/64

Since this is only a demo the actual leakage is not a big deal for me, but future applications under the framework will be even larger, so for those the whole memory management issue is a big deal.

D.