Weak Graph

# Jussi Kalliokoski (10 years ago)

Usually my first assumption when I think I need weak references is that I should use WeakMaps instead. Today I came across an interesting use case and was wrong for the first time. However it wasn't weak references I needed either.

I'm trying to come up with a solution to the problem of rendering lists that's often used as a counter argument for using a framework / view library instead of direct DOM manipulation, where - even with React - updating (even appending to) a list is O(n) at best.

My idea for a solution is that the lists are immutable, contain a reference to their parent and a changeset / diff compared to their parent. This would allow rendering the whole list initally, then just applying the diffs on subsequent renders by walking the graph up to the last known ancestor and combining the changesets. This would make subsequent renders O(1) which is a great improvement even with small lists.

The biggest problem is that this will leak memory like crazy; every revision of the list will be preserved. Let's say we have the following implementation:

function WeakGraph () {
    const parentByNode = new WeakMap();

    this.getLineage = function (node, ancestor) {
        const lineage = [];

        let currentNode = node;
        do {
            lineage.push(currentNode);
            if ( !parentByNode.has(currentNode) ) { throw new Error("node
is not a descendant of ancestor"); }
            currentNode = parentByNode.get(currentNode);
        } while ( currentNode !== ancestor );

        return lineage;
    };

    this.addNode = function (node, parent) {
        parentByNode.set(node, parent);
    };
}

It provides the needed interface and the unused child revisions get cleaned up properly. However:

This is a complete nightmare for GC performance because of cyclical weak references.
Any reference to a child will maintain references to all its parents.

However this doesn't necessarily need to be the case because the stored ancestry is not observable to anything that creates a WeakGraph, except to the oldest ancestor that has a reference elsewhere.

I'm not sure if this use case alone warrants adding a new feature to the language, or if I'm just missing something and it can be implemented with existing constructs or if there should be some other lower level primitive that would allow building a WeakGraph on the user level.

Usually my first assumption when I think I need weak references is that I
should use WeakMaps instead. Today I came across an interesting use case
and was wrong for the first time. However it wasn't weak references I
needed either.

I'm trying to come up with a solution to the problem of rendering lists
that's often used as a counter argument for using a framework / view
library instead of direct DOM manipulation, where - even with React -
updating (even appending to) a list is O(n) at best.

My idea for a solution is that the lists are immutable, contain a reference
to their parent and a changeset / diff compared to their parent. This would
allow rendering the whole list initally, then just applying the diffs on
subsequent renders by walking the graph up to the last known ancestor and
combining the changesets. This would make subsequent renders O(1) which is
a great improvement even with small lists.

The biggest problem is that this will leak memory like crazy; every
revision of the list will be preserved. Let's say we have the following
implementation:

```JS
function WeakGraph () {
    const parentByNode = new WeakMap();

    this.getLineage = function (node, ancestor) {
        const lineage = [];

        let currentNode = node;
        do {
            lineage.push(currentNode);
            if ( !parentByNode.has(currentNode) ) { throw new Error("node
is not a descendant of ancestor"); }
            currentNode = parentByNode.get(currentNode);
        } while ( currentNode !== ancestor );

        return lineage;
    };

    this.addNode = function (node, parent) {
        parentByNode.set(node, parent);
    };
}
```

It provides the needed interface and the unused child revisions get cleaned
up properly. However:

* This is a complete nightmare for GC performance because of cyclical weak
references.
* Any reference to a child will maintain references to all its parents.

However this doesn't necessarily need to be the case because the stored
ancestry is not observable to anything that creates a WeakGraph, except to
the oldest ancestor that has a reference elsewhere.

I'm not sure if this use case alone warrants adding a new feature to the
language, or if I'm just missing something and it can be implemented with
existing constructs or if there should be some other lower level primitive
that would allow building a WeakGraph on the user level.

- Jussi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20151104/3b3031a3/attachment.html>

# Isiah Meadows (10 years ago)

Would this be possible with a mixture of weak references and weak collections?

Would this be possible with a mixture of weak references and weak
collections?

On Wed, Nov 4, 2015, 11:09 Jussi Kalliokoski <jussi.kalliokoski at gmail.com>
wrote:

> Usually my first assumption when I think I need weak references is that I
> should use WeakMaps instead. Today I came across an interesting use case
> and was wrong for the first time. However it wasn't weak references I
> needed either.
>
> I'm trying to come up with a solution to the problem of rendering lists
> that's often used as a counter argument for using a framework / view
> library instead of direct DOM manipulation, where - even with React -
> updating (even appending to) a list is O(n) at best.
>
> My idea for a solution is that the lists are immutable, contain a
> reference to their parent and a changeset / diff compared to their parent.
> This would allow rendering the whole list initally, then just applying the
> diffs on subsequent renders by walking the graph up to the last known
> ancestor and combining the changesets. This would make subsequent renders
> O(1) which is a great improvement even with small lists.
>
> The biggest problem is that this will leak memory like crazy; every
> revision of the list will be preserved. Let's say we have the following
> implementation:
>
> ```JS
> function WeakGraph () {
>     const parentByNode = new WeakMap();
>
>     this.getLineage = function (node, ancestor) {
>         const lineage = [];
>
>         let currentNode = node;
>         do {
>             lineage.push(currentNode);
>             if ( !parentByNode.has(currentNode) ) { throw new Error("node
> is not a descendant of ancestor"); }
>             currentNode = parentByNode.get(currentNode);
>         } while ( currentNode !== ancestor );
>
>         return lineage;
>     };
>
>     this.addNode = function (node, parent) {
>         parentByNode.set(node, parent);
>     };
> }
> ```
>
> It provides the needed interface and the unused child revisions get
> cleaned up properly. However:
>
> * This is a complete nightmare for GC performance because of cyclical weak
> references.
> * Any reference to a child will maintain references to all its parents.
>
> However this doesn't necessarily need to be the case because the stored
> ancestry is not observable to anything that creates a WeakGraph, except to
> the oldest ancestor that has a reference elsewhere.
>
> I'm not sure if this use case alone warrants adding a new feature to the
> language, or if I'm just missing something and it can be implemented with
> existing constructs or if there should be some other lower level primitive
> that would allow building a WeakGraph on the user level.
>
> - Jussi
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20151104/11983ebe/attachment-0001.html>

# Jussi Kalliokoski (10 years ago)

On Wed, Nov 4, 2015 at 6:19 PM, Isiah Meadows <isiahmeadows at gmail.com>

wrote:

Would this be possible with a mixture of weak references and weak collections?

I don't think so - the only potential implementation I can think of would make a weak reference to the parent which would allow the parent to be GCed, but then all the nodes between the latest revision and the ancestor would require strong references somewhere in order to be maintained, making the whole thing kinda pointless because you couldn't rely on it working. Also this use case doesn't require making GC observable, which I think is pretty much the only feature of weak references that WeakMaps don't provide.

On Wed, Nov 4, 2015 at 6:19 PM, Isiah Meadows <isiahmeadows at gmail.com>
wrote:

> Would this be possible with a mixture of weak references and weak
> collections?
>

I don't think so - the only potential implementation I can think of would
make a weak reference to the parent which would allow the parent to be
GCed, but then all the nodes between the latest revision and the ancestor
would require strong references somewhere in order to be maintained, making
the whole thing kinda pointless because you couldn't rely on it working.
Also this use case doesn't require making GC observable, which I think is
pretty much the only feature of weak references that WeakMaps don't provide.

- Jussi


>
> On Wed, Nov 4, 2015, 11:09 Jussi Kalliokoski <jussi.kalliokoski at gmail.com>
> wrote:
>
>> Usually my first assumption when I think I need weak references is that I
>> should use WeakMaps instead. Today I came across an interesting use case
>> and was wrong for the first time. However it wasn't weak references I
>> needed either.
>>
>> I'm trying to come up with a solution to the problem of rendering lists
>> that's often used as a counter argument for using a framework / view
>> library instead of direct DOM manipulation, where - even with React -
>> updating (even appending to) a list is O(n) at best.
>>
>> My idea for a solution is that the lists are immutable, contain a
>> reference to their parent and a changeset / diff compared to their parent.
>> This would allow rendering the whole list initally, then just applying the
>> diffs on subsequent renders by walking the graph up to the last known
>> ancestor and combining the changesets. This would make subsequent renders
>> O(1) which is a great improvement even with small lists.
>>
>> The biggest problem is that this will leak memory like crazy; every
>> revision of the list will be preserved. Let's say we have the following
>> implementation:
>>
>> ```JS
>> function WeakGraph () {
>>     const parentByNode = new WeakMap();
>>
>>     this.getLineage = function (node, ancestor) {
>>         const lineage = [];
>>
>>         let currentNode = node;
>>         do {
>>             lineage.push(currentNode);
>>             if ( !parentByNode.has(currentNode) ) { throw new Error("node
>> is not a descendant of ancestor"); }
>>             currentNode = parentByNode.get(currentNode);
>>         } while ( currentNode !== ancestor );
>>
>>         return lineage;
>>     };
>>
>>     this.addNode = function (node, parent) {
>>         parentByNode.set(node, parent);
>>     };
>> }
>> ```
>>
>> It provides the needed interface and the unused child revisions get
>> cleaned up properly. However:
>>
>> * This is a complete nightmare for GC performance because of cyclical
>> weak references.
>> * Any reference to a child will maintain references to all its parents.
>>
>> However this doesn't necessarily need to be the case because the stored
>> ancestry is not observable to anything that creates a WeakGraph, except to
>> the oldest ancestor that has a reference elsewhere.
>>
>> I'm not sure if this use case alone warrants adding a new feature to the
>> language, or if I'm just missing something and it can be implemented with
>> existing constructs or if there should be some other lower level primitive
>> that would allow building a WeakGraph on the user level.
>>
>> - Jussi
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss at mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20151104/634ecd98/attachment.html>

# Isiah Meadows (10 years ago)

I don't think that's possible with traditional GC. If you can find an existing implementation in a GC language (like Java) that has the behavior you want, I'll be interested. Otherwise, the only way is explicit garbage collection.

I don't think that's possible with traditional GC. If you can find an
existing implementation in a GC language (like Java) that has the behavior
you want, I'll be interested. Otherwise, the only way is explicit garbage
collection.

On Wed, Nov 4, 2015, 11:34 Jussi Kalliokoski <jussi.kalliokoski at gmail.com>
wrote:

> On Wed, Nov 4, 2015 at 6:19 PM, Isiah Meadows <isiahmeadows at gmail.com>
> wrote:
>
>> Would this be possible with a mixture of weak references and weak
>> collections?
>>
>
> I don't think so - the only potential implementation I can think of would
> make a weak reference to the parent which would allow the parent to be
> GCed, but then all the nodes between the latest revision and the ancestor
> would require strong references somewhere in order to be maintained, making
> the whole thing kinda pointless because you couldn't rely on it working.
> Also this use case doesn't require making GC observable, which I think is
> pretty much the only feature of weak references that WeakMaps don't provide.
>
> - Jussi
>
>
>>
>> On Wed, Nov 4, 2015, 11:09 Jussi Kalliokoski <jussi.kalliokoski at gmail.com>
>> wrote:
>>
>>> Usually my first assumption when I think I need weak references is that
>>> I should use WeakMaps instead. Today I came across an interesting use case
>>> and was wrong for the first time. However it wasn't weak references I
>>> needed either.
>>>
>>> I'm trying to come up with a solution to the problem of rendering lists
>>> that's often used as a counter argument for using a framework / view
>>> library instead of direct DOM manipulation, where - even with React -
>>> updating (even appending to) a list is O(n) at best.
>>>
>>> My idea for a solution is that the lists are immutable, contain a
>>> reference to their parent and a changeset / diff compared to their parent.
>>> This would allow rendering the whole list initally, then just applying the
>>> diffs on subsequent renders by walking the graph up to the last known
>>> ancestor and combining the changesets. This would make subsequent renders
>>> O(1) which is a great improvement even with small lists.
>>>
>>> The biggest problem is that this will leak memory like crazy; every
>>> revision of the list will be preserved. Let's say we have the following
>>> implementation:
>>>
>>> ```JS
>>> function WeakGraph () {
>>>     const parentByNode = new WeakMap();
>>>
>>>     this.getLineage = function (node, ancestor) {
>>>         const lineage = [];
>>>
>>>         let currentNode = node;
>>>         do {
>>>             lineage.push(currentNode);
>>>             if ( !parentByNode.has(currentNode) ) { throw new
>>> Error("node is not a descendant of ancestor"); }
>>>             currentNode = parentByNode.get(currentNode);
>>>         } while ( currentNode !== ancestor );
>>>
>>>         return lineage;
>>>     };
>>>
>>>     this.addNode = function (node, parent) {
>>>         parentByNode.set(node, parent);
>>>     };
>>> }
>>> ```
>>>
>>> It provides the needed interface and the unused child revisions get
>>> cleaned up properly. However:
>>>
>>> * This is a complete nightmare for GC performance because of cyclical
>>> weak references.
>>> * Any reference to a child will maintain references to all its parents.
>>>
>>> However this doesn't necessarily need to be the case because the stored
>>> ancestry is not observable to anything that creates a WeakGraph, except to
>>> the oldest ancestor that has a reference elsewhere.
>>>
>>> I'm not sure if this use case alone warrants adding a new feature to the
>>> language, or if I'm just missing something and it can be implemented with
>>> existing constructs or if there should be some other lower level primitive
>>> that would allow building a WeakGraph on the user level.
>>>
>>> - Jussi
>>> _______________________________________________
>>> es-discuss mailing list
>>> es-discuss at mozilla.org
>>> https://mail.mozilla.org/listinfo/es-discuss
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20151105/bee55d61/attachment.html>

# Jason Orendorff (10 years ago)

On Wed, Nov 4, 2015 at 10:09 AM, Jussi Kalliokoski <jussi.kalliokoski at gmail.com> wrote:

I'm trying to come up with a solution to the problem of rendering lists [...] My idea for a solution is that the lists are immutable, contain a reference to their parent and a changeset / diff compared to their parent. [...]

Good problem, interesting idea.

The biggest problem is that this will leak memory like crazy; every revision of the list will be preserved.

OK. Perhaps obviously, the only way around this is to mutate the list, breaking the chain at a point where nobody cares about the rest of it anymore.

The approach you've outlined is to have the GC tell you when to do the mutation, but why is that a good idea? You can do it deterministically in getLineage().

Maybe the concepts here would be clearer if we limited the graph to a single linked list. Then it looks a lot like a stream, in the functional reactive programming sense. Let the user (in this case, the renderer) buffer the diffs as needed; it knows when to reset the list. And no need for fancy data structures: it could just be an Array.

On Wed, Nov 4, 2015 at 10:09 AM, Jussi Kalliokoski
<jussi.kalliokoski at gmail.com> wrote:
> I'm trying to come up with a solution to the problem of rendering lists [...]
> My idea for a solution is that the lists are immutable, contain a reference
> to their parent and a changeset / diff compared to their parent. [...]

Good problem, interesting idea.

> The biggest problem is that this will leak memory like crazy; every revision
> of the list will be preserved.

OK. Perhaps obviously, the only way around this is to mutate the list,
breaking the chain at a point where nobody cares about the rest of it
anymore.

The approach you've outlined is to have the GC tell you when to do the
mutation, but why is that a good idea? You can do it deterministically
in getLineage().

Maybe the concepts here would be clearer if we limited the graph to a
single linked list. Then it looks a lot like a stream, in the
functional reactive programming sense. Let the user (in this case, the
renderer) buffer the diffs as needed; it knows when to reset the list.
And no need for fancy data structures: it could just be an Array.

-j

# Steve Fink (10 years ago)

On 11/04/2015 08:09 AM, Jussi Kalliokoski wrote:

It provides the needed interface and the unused child revisions get cleaned up properly. However:

This is a complete nightmare for GC performance because of cyclical weak references.

Not necessarily. Current Spidermonkey should handle it ok, using a linear-time algorithm for weakmap marking. (I think an inverted representation would also be good for this? I haven't thought it through.)

Any reference to a child will maintain references to all its parents.

However this doesn't necessarily need to be the case because the stored ancestry is not observable to anything that creates a WeakGraph, except to the oldest ancestor that has a reference elsewhere.

I'm not sure if this use case alone warrants adding a new feature to the language, or if I'm just missing something and it can be implemented with existing constructs or if there should be some other lower level primitive that would allow building a WeakGraph on the user level.

The brute-force approach would be to give each node its own weakmap, and add an entry for every ancestor:

this.getLineage() = function (node, ancestor) {
   const lineage = [];
   while (node) {
     lineage.push(node);
     if (node == ancestor)
       return lineage;
     node = node.parentForAncestor.get(ancestor);
   }
}

this.addNode = function (node, parent, ancestor) {
   node.parentForAncestor = new WeakMap();
   for (let key of this.getLineage(parent, ancestor))
     node.parentForAncestor.set(key, parent);
};

...but notice that addNode now requires an ancestor to be specified, and it will only work as far back as that. And of course, this defeats the whole point of the exercise, in that it requires space quadratic in the number of live nodes. And insertion is linear in the distance from the node to the chosen ancestor. Which could still be a win in rare cases, I guess, but my point is only to show that the leak could be "fixed".

I should note that in your example, you have a WeakMap with basically infinite lifetime. That means that keys always hold values live, in which case you might as well store them directly as properties on the key (node.parent = parent).

I also pondered what language primitives would fix this case. The straightforward approach is to expose something tailored specifically to this use case -- call it a WeakList. You cannot look up elements by index. You can call WeakList.prototype.get(begin, end) and it will return an Array of elements from begin..end (where 'begin' and 'end' are actual elements of the list), or undefined if either begin or end is not present in the list. Internally, the implementation would allowed to discard all leading and trailing dead elements. It would be stored as a dag to share space with overlapping lists.

A totally different option is something I'm calling a VagueMap. I should mention here that I don't know of any prior art, nor have I looked for any, so my apologies if this is known by another name. A VagueMap is sort of the dual of a WeakMap, where the value keeps the entry (and key) alive rather than the key keeping the value alive. But to avoid exposing GC behavior, you cannot just look up values by key (since then if you wanted to know when something gets GC'ed, you'd just stick it in a VagueMap under a well-known key and look it up periodically.) Instead, get(key) returns a "vague value", which can only be used in two ways: for equality testing, and as a VagueMap key. VagueMap lookups would treat vague values and their equivalent non-vague values identically. VagueMap would also support has().

Note that VagueMap does not automatically keep its keys live (as in, only the ones with live values will be kept live.) So you still can't iterate over the keys.

This still doesn't fix the original example. The simple replacement of WeakMap with VagueMap will "work", but getLineage() will return an array of vague values, which aren't much use. So I'll need to add another funky feature to VagueMap: if you give it a (non-vague) value, it will hand you back the non-vague, guaranteed live, key. I'll call it VagueMap.prototype.getKey(vkey, value) where vkey is a possibly-vague key that maps to value.

We get back very close to the original code, with an added loop to reify the vague nodes (oh, and I include the ancestor in the lineage -- which means the do/while loop could now easily be a for loop, but I'll stick close to the original):

function WeakGraph () {
     const parentByNode = new VagueMap();

     this.getLineage = function (node, ancestor) {
         const lineage = [];

         let currentNode = node;
         do {
             lineage.push(currentNode);
             if ( !parentByNode.has(currentNode) ) { throw new 
Error("node is not a descendant of ancestor"); }
             currentNode = parentByNode.get(currentNode);
         } while ( currentNode !== ancestor );
         lineage.push(ancestor);

         for (let i = lineage.length; i > 1; i--) {
             lineage[i-2] = parentByNode.getKey(lineage[i-2], lineage[i-1]);
         }

         return lineage;
     };

     this.addNode = function (node, parent) {
         parentByNode.set(node, parent);
     };
}

Given the number of failed attempts I've taken at this today, there's probably some very obvious problem with all of this. But I think this all checks out now?

On 11/04/2015 08:09 AM, Jussi Kalliokoski wrote:
> It provides the needed interface and the unused child revisions get 
> cleaned up properly. However:
>
> * This is a complete nightmare for GC performance because of cyclical 
> weak references.

Not necessarily. Current Spidermonkey should handle it ok, using a 
linear-time algorithm for weakmap marking. (I think an inverted 
representation would also be good for this? I haven't thought it through.)

> * Any reference to a child will maintain references to all its parents.
>
> However this doesn't necessarily need to be the case because the 
> stored ancestry is not observable to anything that creates a 
> WeakGraph, except to the oldest ancestor that has a reference elsewhere.
>
> I'm not sure if this use case alone warrants adding a new feature to 
> the language, or if I'm just missing something and it can be 
> implemented with existing constructs or if there should be some other 
> lower level primitive that would allow building a WeakGraph on the 
> user level.

The brute-force approach would be to give each node its own weakmap, and 
add an entry for *every* ancestor:

```JS
this.getLineage() = function (node, ancestor) {
   const lineage = [];
   while (node) {
     lineage.push(node);
     if (node == ancestor)
       return lineage;
     node = node.parentForAncestor.get(ancestor);
   }
}

this.addNode = function (node, parent, ancestor) {
   node.parentForAncestor = new WeakMap();
   for (let key of this.getLineage(parent, ancestor))
     node.parentForAncestor.set(key, parent);
};
```

...but notice that addNode now requires an ancestor to be specified, and 
it will only work as far back as that. And of course, this defeats the 
whole point of the exercise, in that it requires space quadratic in the 
number of live nodes. And insertion is linear in the distance from the 
node to the chosen ancestor. Which could still be a win in rare cases, I 
guess, but my point is only to show that the leak could be "fixed".

I should note that in your example, you have a WeakMap with basically 
infinite lifetime. That means that keys always hold values live, in 
which case you might as well store them directly as properties on the 
key (node.parent = parent).

I also pondered what language primitives would fix this case. The 
straightforward approach is to expose something tailored specifically to 
this use case -- call it a WeakList. You cannot look up elements by 
index. You can call WeakList.prototype.get(begin, end) and it will 
return an Array of elements from begin..end (where 'begin' and 'end' are 
actual elements of the list), or undefined if either begin or end is not 
present in the list. Internally, the implementation would allowed to 
discard all leading and trailing dead elements. It would be stored as a 
dag to share space with overlapping lists.

A totally different option is something I'm calling a VagueMap. I should 
mention here that I don't know of any prior art, nor have I looked for 
any, so my apologies if this is known by another name. A VagueMap is 
sort of the dual of a WeakMap, where the value keeps the entry (and key) 
alive rather than the key keeping the value alive. But to avoid exposing 
GC behavior, you cannot just look up values by key (since then if you 
wanted to know when something gets GC'ed, you'd just stick it in a 
VagueMap under a well-known key and look it up periodically.) Instead, 
get(key) returns a "vague value", which can only be used in two ways: 
for equality testing, and as a VagueMap key. VagueMap lookups would 
treat vague values and their equivalent non-vague values identically. 
VagueMap would also support has().

Note that VagueMap does not automatically keep its keys live (as in, 
only the ones with live values will be kept live.) So you still can't 
iterate over the keys.

This still doesn't fix the original example. The simple replacement of 
WeakMap with VagueMap will "work", but getLineage() will return an array 
of vague values, which aren't much use. So I'll need to add another 
funky feature to VagueMap: if you give it a (non-vague) value, it will 
hand you back the non-vague, guaranteed live, key. I'll call it 
VagueMap.prototype.getKey(vkey, value) where vkey is a possibly-vague 
key that maps to value.

We get back very close to the original code, with an added loop to reify 
the vague nodes (oh, and I include the ancestor in the lineage -- which 
means the do/while loop could now easily be a for loop, but I'll stick 
close to the original):

```JS
function WeakGraph () {
     const parentByNode = new VagueMap();

     this.getLineage = function (node, ancestor) {
         const lineage = [];

         let currentNode = node;
         do {
             lineage.push(currentNode);
             if ( !parentByNode.has(currentNode) ) { throw new 
Error("node is not a descendant of ancestor"); }
             currentNode = parentByNode.get(currentNode);
         } while ( currentNode !== ancestor );
         lineage.push(ancestor);

         for (let i = lineage.length; i > 1; i--) {
             lineage[i-2] = parentByNode.getKey(lineage[i-2], lineage[i-1]);
         }

         return lineage;
     };

     this.addNode = function (node, parent) {
         parentByNode.set(node, parent);
     };
}
```

Given the number of failed attempts I've taken at this today, there's 
probably some very obvious problem with all of this. But I *think* this 
all checks out now?

# Jussi Kalliokoski (10 years ago)

On Fri, 6 Nov 2015 at 17:31 Jason Orendorff <jason.orendorff at gmail.com>

wrote:

On Wed, Nov 4, 2015 at 10:09 AM, Jussi Kalliokoski <jussi.kalliokoski at gmail.com> wrote:

I'm trying to come up with a solution to the problem of rendering lists [...] My idea for a solution is that the lists are immutable, contain a reference to their parent and a changeset / diff compared to their parent. [...]

Good problem, interesting idea.

The biggest problem is that this will leak memory like crazy; every revision of the list will be preserved.

OK. Perhaps obviously, the only way around this is to mutate the list, breaking the chain at a point where nobody cares about the rest of it anymore.

The approach you've outlined is to have the GC tell you when to do the mutation, but why is that a good idea? You can do it deterministically in getLineage().

I'm not sure I follow.

Maybe the concepts here would be clearer if we limited the graph to a single linked list.

Perhaps graph as a concept is too wide for expressing this, but it surely is not a linked list either. It may look like one in some cases though, when there's only one lineage and no branching. However that is not the use case here as when application state gets transformed to a view representation it may have various transformations applied to it, such as sorting, mapping or filtering.

Then it looks a lot like a stream, in the functional reactive programming sense. Let the user (in this case, the renderer) buffer the diffs as needed; it knows when to reset the list. And no need for fancy data structures: it could just be an Array.

That misses the point, to say the least. The idea of React is that you can represent the UI as a declarative function of a snapshot of the state, so the view doesn't have to care about async. With FRP you subscribe/unsubscribe to asynchronous streams which, not unlike the idea here, can be transformed just like normal data structures and forked (to fork an immutable data structure is to just pass the reference). The difference is that streams are an inherently async structure, while what I'm trying to do is not. The idea being not only that the easy case of insert without transforms is O(1), but almost every use case can be further better optimized by knowing the previous state of the data structure.

Consider this: You have a large list of items as an array, unsorted, as your state. The view is a paged listing of the items sorted by different criteria. So basically:

list
  .sort(ascendingBy("something"))
  .slice(FIRST_INDEX_ON_PAGE, LAST_INDEX_ON_PAGE)
  .map(item => <Item item={item} />)

Now something gets added to the middle of the list. Let's look at what we can do with this information at each stage, if we know the previous list and the diff to that:

We can perform the sort in linear time because we just have to find where the added item belongs in the list.
The slice now has a different diff (only the insert position is different), and

if the added item is in view, we can make the insertion index relative to our view and add a remove for the last item in the list.
if the added item is before the view, we return an insert for the item coming to view and a remove for the item leaving the view.
if the added item is after the view, the diff is empty, so we can stop here.

We can map just the diffs at the last stage.

You can implement this with streams, but that will just be an unnecessary abstraction level offering no simplification whatsoever while making the concept needlessly async. Another significant difference between this and FRP is that streams require imperative subscribe / unsubscribe, which is basically just sophisticated reference counting, while having the same issues (user after free -> update after unmount, leaks). What I have in

mind can also be implemented using reference counting, and in fact will be to in my initial version, but having a WeakGraph data structure would make this nasty artifact and source of easy bugs (use after free -> use after

unsubscribe, leaks) go away, just like WeakMap and WeakSet are designed to do for certain other cases.

On Fri, 6 Nov 2015 at 17:31 Jason Orendorff <jason.orendorff at gmail.com>
wrote:

> On Wed, Nov 4, 2015 at 10:09 AM, Jussi Kalliokoski
> <jussi.kalliokoski at gmail.com> wrote:
> > I'm trying to come up with a solution to the problem of rendering lists
> [...]
> > My idea for a solution is that the lists are immutable, contain a
> reference
> > to their parent and a changeset / diff compared to their parent. [...]
>
> Good problem, interesting idea.
>
> > The biggest problem is that this will leak memory like crazy; every
> revision
> > of the list will be preserved.
>
> OK. Perhaps obviously, the only way around this is to mutate the list,
> breaking the chain at a point where nobody cares about the rest of it
> anymore.
>
> The approach you've outlined is to have the GC tell you when to do the
> mutation, but why is that a good idea? You can do it deterministically
> in getLineage().
>

I'm not sure I follow.

> Maybe the concepts here would be clearer if we limited the graph to a
> single linked list.

Perhaps graph as a concept is too wide for expressing this, but it surely
is not a linked list either. It may look like one in some cases though,
when there's only one lineage and no branching. However that is not the use
case here as when application state gets transformed to a view
representation it may have various transformations applied to it, such as
sorting, mapping or filtering.

> Then it looks a lot like a stream, in the
> functional reactive programming sense. Let the user (in this case, the
> renderer) buffer the diffs as needed; it knows when to reset the list.
> And no need for fancy data structures: it could just be an Array.
>

That misses the point, to say the least. The idea of React is that you can
represent the UI as a declarative function of a snapshot of the state, so
the view doesn't have to care about async. With FRP you
subscribe/unsubscribe to asynchronous streams which, not unlike the idea
here, can be transformed just like normal data structures and forked (to
fork an immutable data structure is to just pass the reference). The
difference is that streams are an inherently async structure, while what
I'm trying to do is not. The idea being not only that the easy case of
insert without transforms is O(1), but almost every use case can be further
better optimized by knowing the previous state of the data structure.

Consider this: You have a large list of items as an array, unsorted, as
your state. The view is a paged listing of the items sorted by different
criteria. So basically:

```JS
list
  .sort(ascendingBy("something"))
  .slice(FIRST_INDEX_ON_PAGE, LAST_INDEX_ON_PAGE)
  .map(item => <Item item={item} />)
```
Now something gets added to the middle of the list. Let's look at what we
can do with this information at each stage, if we know the previous list
and the diff to that:
* We can perform the sort in linear time because we just have to find where
the added item belongs in the list.
* The slice now has a different diff (only the insert position is
different), and
 - if the added item is in view, we can make the insertion index relative
to our view and add a remove for the last item in the list.
 - if the added item is before the view, we return an insert for the item
coming to view and a remove for the item leaving the view.
 - if the added item is after the view, the diff is empty, so we can stop
here.
* We can map just the diffs at the last stage.

You can implement this with streams, but that will just be an unnecessary
abstraction level offering no simplification whatsoever while making the
concept needlessly async. Another significant difference between this and
FRP is that streams require imperative subscribe / unsubscribe, which is
basically just sophisticated reference counting, while having the same
issues (user after free -> update after unmount, leaks). What I have in
mind can also be implemented using reference counting, and in fact will be
to in my initial version, but having a WeakGraph data structure would make
this nasty artifact and source of easy bugs (use after free -> use after
unsubscribe, leaks) go away, just like WeakMap and WeakSet are designed to
do for certain other cases.

- Jussi

>
> -j
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20151108/609f245c/attachment.html>

# Nelo Mitranim (10 years ago)

I was just recently working on the same problem as Jussy: subscribe/unsubscribe pattern for view components, and immutable data tree for efficient updates.

(Working demo here; the subs/unsubs are completely implicit.)

Concluded that weakrefs offer no help with this whatsoever. In fact, they're a red herring. You always want to unsubscribe your listeners deterministically, exactly when the corresponding view is destroyed. Letting them hang around and attempt to update nonexistent views until the next GC phase is no good. In fact, React will give you warnings when you try to update unmounted components. It also provides the "will mount" / "will unmount" lifecycle methods to clean things up.

Pretty sure weakrefs are harmful rather than helpful in this particular use case. But I may have missed the point of the sentiment.

I was just recently working on the same problem as Jussy: subscribe/unsubscribe pattern for view components, and immutable data tree for efficient updates.

(Working demo [here](https://github.com/Mitranim/chat); the subs/unsubs are completely implicit.)

Concluded that weakrefs offer no help with this whatsoever. In fact, they're a red herring. You always want to unsubscribe your listeners deterministically, exactly when the corresponding view is destroyed. Letting them hang around and attempt to update nonexistent views until the next GC phase is no good. In fact, React will give you warnings when you try to update unmounted components. It also provides the "will mount" / "will unmount" lifecycle methods to clean things up.

Pretty sure weakrefs are harmful rather than helpful in this particular use case. But I may have missed the point of the sentiment.

# Jussi Kalliokoski (10 years ago)

On Sun, 8 Nov 2015 at 13:29 Nelo Mitranim <me at mitranim.com> wrote:

I was just recently working on the same problem as Jussy: subscribe/unsubscribe pattern for view components, and immutable data tree for efficient updates.

(Working demo here; the subs/unsubs are completely implicit.)

Concluded that weakrefs offer no help with this whatsoever. In fact, they're a red herring. You always want to unsubscribe your listeners deterministically, exactly when the corresponding view is destroyed. Letting them hang around and attempt to update nonexistent views until the next GC phase is no good. In fact, React will give you warnings when you try to update unmounted components. It also provides the "will mount" / "will unmount" lifecycle methods to clean things up.

Pretty sure weakrefs are harmful rather than helpful in this particular use case. But I may have missed the point of the sentiment.

I think I may have miscommunicated my meaning here. In the idea I've been working on, the view has no subscription (except at top level) that can push data to it, so there's nothing to unsubscribe from. The idea being that the data structures are modeled purely synchronously (current version

diff to previous version). The reason for needing reference counting is different from FRP streams: making sure resources can be collected instead of leaving an infinite trail. So basically FRP in this comparison is a push model that needs reference counting to free the leaf nodes whereas the model I have in mind is pull where the reference counting is needed to free the root nodes.

Jussi

P.S. Thanks for the demo, will check it out!

On Sun, 8 Nov 2015 at 13:29 Nelo Mitranim <me at mitranim.com> wrote:

>
> I was just recently working on the same problem as Jussy:
> subscribe/unsubscribe pattern for view components, and immutable data tree
> for efficient updates.
>
> (Working demo [here](https://github.com/Mitranim/chat); the subs/unsubs
> are completely implicit.)
>
> Concluded that weakrefs offer no help with this whatsoever. In fact,
> they're a red herring. You always want to unsubscribe your listeners
> deterministically, exactly when the corresponding view is destroyed.
> Letting them hang around and attempt to update nonexistent views until the
> next GC phase is no good. In fact, React will give you warnings when you
> try to update unmounted components. It also provides the "will mount" /
> "will unmount" lifecycle methods to clean things up.
>
> Pretty sure weakrefs are harmful rather than helpful in this particular
> use case. But I may have missed the point of the sentiment.
>

I think I may have miscommunicated my meaning here. In the idea I've been
working on, the view has no subscription (except at top level) that can
push data to it, so there's nothing to unsubscribe from. The idea being
that the data structures are modeled purely synchronously (current version
+ diff to previous version). The reason for needing reference counting is
different from FRP streams: making sure resources can be collected instead
of leaving an infinite trail. So basically FRP in this comparison is a push
model that needs reference counting to free the leaf nodes whereas the
model I have in mind is pull where the reference counting is needed to free
the root nodes.

- Jussi


P.S. Thanks for the demo, will check it out!


> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20151108/3ccae88e/attachment-0001.html>

# Jason Orendorff (10 years ago)

On Sun, Nov 8, 2015 at 4:45 AM, Jussi Kalliokoski <jussi.kalliokoski at gmail.com> wrote:

Perhaps graph as a concept is too wide for expressing this, but it surely is not a linked list either. It may look like one in some cases though, when there's only one lineage and no branching. However that is not the use case here as when application state gets transformed to a view representation it may have various transformations applied to it, such as sorting, mapping or filtering.

The last sentence here doesn't seem connected to the rest. A linear list of versions can go through a data transformation pipeline of sorts, maps, and filters just fine.

I don't see where lineage or branching comes into any of your examples. I can see how it could, but I think I must be missing something here.

On Fri, 6 Nov 2015 at 17:31 Jason Orendorff <jason.orendorff at gmail.com> wrote:

Then it looks a lot like a stream, in the functional reactive programming sense. Let the user (in this case, the renderer) buffer the diffs as needed; it knows when to reset the list. And no need for fancy data structures: it could just be an Array.

That misses the point, to say the least. The idea of React is that you can represent the UI as a declarative function of a snapshot of the state, so the view doesn't have to care about async. With FRP you subscribe/unsubscribe to asynchronous streams which, not unlike the idea here, can be transformed just like normal data structures and forked (to fork an immutable data structure is to just pass the reference). The difference is that streams are an inherently async structure, [...]

FRP is not inherently async. As a programming model, I'd say it's inherently about user code not having to care about time.

Anyway, my point here is not "you should drop what you're doing and do FRP instead" but rather "if FRP can handle this without new GC primitives, maybe you can too".

while what I'm trying to do is not. The idea being not only that the easy case of insert without transforms is O(1), but almost every use case can be further better optimized by knowing the previous state of the data structure.

Right, the state is a left fold over a series of events. Like in FRP.

You can implement this with streams, but that will just be an unnecessary abstraction level offering no simplification whatsoever while making the concept needlessly async.

I dunno, you sort of lost me I guess. Streams come to mind because of what you're doing: taking a series of events, transforming them, then applying them one by one, as they arrive, to a mutable data sink.

Another significant difference between this and FRP is that streams require imperative subscribe / unsubscribe, which is basically just sophisticated reference counting, while having the same issues (user after free -> update after unmount, leaks).

This is true, and I think it's your strongest point. Furthermore in support of your side here, even without WeakGraph, I think leaks in FRP are more wasteful than they would be in your model, because in FRP the "deltas" are pushed through the system until somebody turns them off, rather than pulled on demand.

Three alternative models:

In Elm, the data flow graph is static. The system manages subscriptions. Drawback: the user can't mutate the graph procedurally.
A system could allow the data flow graph to change, not procedurally exactly, but when a diff is applied that changes the UI. Since the system knows about diffs, it could then automatically manage subscriptions based on what's actually in the UI. Subscribe on insert, unsubscribe on delete. No manual subscription management for the end user; within the framework, you can use reference counting etc.
Implement your system using a "WeakGraph"-like object that keeps strong references to all revisions in a fixed-sized window of recent history. Periodically clear old patches and revisions from this cache. Instead of getLineage(a, b), we have a method diff(a, b) that typically does exactly the same thing, returning an array of patches. But diff may find that the user is trying to diff two versions that are not both in the history. Then it simply does a full diff of the two states, effectively generating a fake lineage. Note that the performance is not amortized constant time: diff() is always fast except when diffing something quite old, which shouldn't happen. And the memory usage of the differ is more predictable than "WeakGraph". Drawback: fake lineages. But so what?

On Sun, Nov 8, 2015 at 4:45 AM, Jussi Kalliokoski
<jussi.kalliokoski at gmail.com> wrote:
> Perhaps graph as a concept is too wide for expressing this, but it surely
> is not a linked list either. It may look like one in some cases though,
> when there's only one lineage and no branching. However that is not
> the use case here as when application state gets transformed to a view
> representation it may have various transformations applied to it, such as
> sorting, mapping or filtering.

The last sentence here doesn't seem connected to the rest. A linear
list of versions can go through a data transformation pipeline of
sorts, maps, and filters just fine.

I don't see where lineage or branching comes into any of your
examples. I can see how it could, but I think I must be missing
something here.

> On Fri, 6 Nov 2015 at 17:31 Jason Orendorff <jason.orendorff at gmail.com>
> wrote:
>> Then it looks a lot like a stream, in the
>> functional reactive programming sense. Let the user (in this case, the
>> renderer) buffer the diffs as needed; it knows when to reset the list.
>> And no need for fancy data structures: it could just be an Array.
>
> That misses the point, to say the least. The idea of React is that you can
> represent the UI as a declarative function of a snapshot of the state, so
> the view doesn't have to care about async. With FRP you
> subscribe/unsubscribe to asynchronous streams which, not unlike the idea
> here, can be transformed just like normal data structures and forked (to
> fork an immutable data structure is to just pass the reference). The
> difference is that streams are an inherently async structure, [...]

FRP is not inherently async. As a programming model, I'd say it's
inherently about user code not having to care about time.

Anyway, my point here is not "you should drop what you're doing and do
FRP instead" but rather "if FRP can handle this without new GC
primitives, maybe you can too".

> while what I'm
> trying to do is not. The idea being not only that the easy case of insert
> without transforms is O(1), but almost every use case can be further better
> optimized by knowing the previous state of the data structure.

Right, the state is a left fold over a series of events. Like in FRP.

> You can implement this with streams, but that will just be an unnecessary
> abstraction level offering no simplification whatsoever while making the
> concept needlessly async.

I dunno, you sort of lost me I guess. Streams come to mind because of
what you're doing: taking a series of events, transforming them, then
applying them one by one, as they arrive, to a mutable data sink.

> Another significant difference between this and
> FRP is that streams require imperative subscribe / unsubscribe, which is
> basically just sophisticated reference counting, while having the same
> issues (user after free -> update after unmount, leaks).

This is true, and I think it's your strongest point. Furthermore in
support of your side here, even without WeakGraph, I think leaks in
FRP are more wasteful than they would be in your model, because in FRP
the "deltas" are pushed through the system until somebody turns them
off, rather than pulled on demand.

Three alternative models:

1.  In Elm, the data flow graph is static. The system manages
subscriptions. Drawback: the user can't mutate the graph procedurally.

2.  A system could allow the data flow graph to change, not
*procedurally* exactly, but when a diff is applied that changes the
UI. Since the system knows about diffs, it could then automatically
manage subscriptions based on what's actually in the UI. Subscribe on
insert, unsubscribe on delete. No manual subscription management for
the end user; within the framework, you can use reference counting
etc.

3.  Implement your system using a "WeakGraph"-like object that keeps
strong references to all revisions in a fixed-sized window of recent
history. Periodically clear old patches and revisions from this cache.
Instead of `getLineage(a, b)`, we have a method `diff(a, b)` that
typically does exactly the same thing, returning an array of patches.
But `diff` may find that the user is trying to diff two versions that
are not both in the history. Then it simply does a full diff of the
two states, effectively generating a fake lineage. Note that the
performance is not *amortized* constant time: `diff()` is *always*
fast except when diffing something quite old, which shouldn't happen.
And the memory usage of the differ is more predictable than
"WeakGraph". Drawback: fake lineages. But so what?

-j