Array.prototype.remove(item)

# Isiah Meadows (6 years ago)

I find myself frequently using this code to remove an item from an array:

var index = array.indexOf(item)
if (index >= 0) array.splice(index, 1)

That's nice, but there's a couple problems:

  1. It only removes the first element, not all occurrences in the array (which is usually what you want).
  2. I know of no engine that fuses those calls, and it'd be rather difficult to do so.

My proposal is this: add a new Array.prototype.remove(item) method that basically does this:

  • Accept a list of items to remove, passed as arguments.
  • Remove all instances of those items from the array.
  • Return true if any entries were removed, false otherwise.

And in the process, it should take care to do it in an optimized fashion, avoiding the overhead of allocation, excess reads/writes, and excess branching.

In case you're curious why I'm not using Set, here's why that's ill-equipped in many cases:

  • Sets take up a lot more memory than arrays, by necessity.
  • Sets are stored typically in tree-like structures, making for poor iteration performance.
  • If you're frequently iterating, but infrequently adding/removing, these two will quickly kill your performance.

It rarely comes up in day-to-day coding, but the difference is substantial when you're writing lower-level data structures and performance-sensitive code. It's probably down a similar vein to Array.prototype.copyWithin, which was pretty much designed for memcpy or memmove.

Here would be a polyfill of what I'd like to propose (it uses a minimum of writes, by design, and it does everything in one pass):

if (typeof Array.prototype.remove !== "function") {
    // Private utility
    const includes = (items, itemLength, entry) => {
        for (let j = 0; j < itemLength; j++) {
            if (items[j] === entry) return true
        }

        return false
    }

    Array.prototype.remove = function (item) {
        const oldLength = this.length
        const newLength = arguments.length
        // This is technically observably no different than
        const items = [...arguments]
        let newLength = 0

        for (let i = 0; i < oldLength; i++) {
            const entry = this[i]
            if (includes(items, newLength, entry)) {
                let newLength = i++

                while (i !== list.length) {
                    const entry = this[i]
                    if (!includes(items, newLength, entry)) {
                        this[newLength++] = entry
                    }
                    i++
                }

                this.length = newLength
                for (let i = newLength; i < oldLength; i++) delete this[i]
                return true
            }
        }

        return false
    }
}

If the variadic version wasn't accepted, there's also the non-variadic version:

if (typeof Array.prototype.remove !== "function") {
    Array.prototype.remove = function (item) {
        const oldLength = this.length
        let newLength = 0

        for (let i = 0; i < oldLength; i++) {
            const entry = this[i]
            if (entry === item) {
                let newLength = i++

                while (i !== list.length) {
                    const entry = this[i]
                    if (entry !== item) this[newLength++] = entry
                    i++
                }

                this.length = newLength
                for (let i = newLength; i < oldLength; i++) delete this[i]
                return true
            }
        }

        return false
    }
}

Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# T.J. Crowder (6 years ago)

On Fri, Nov 10, 2017 at 10:38 AM, Isiah Meadows <isiahmeadows at gmail.com>

wrote:

My proposal is this: add a new Array.prototype.remove(item) method...

So not just a "would be nice," but you've provided a concrete reason for putting it in the standard lib: Providing an optimized version vs. what can be done in userland. Excellent. :-)

In a similar vein, is there a specific reason you're modifying the array in place vs. creating a new one via filter? Memory churn? Callback cost (though that's amazingly low in modern implementations as I'm sure you know)? Lots of independent references to it? Are any/all things you've had a real-world problem with around this? (That's a question, not veiled criticism.)

FWIW, if it removes all occurrences of the items, I'd call it removeAll. For some reason -- I'm not exactly sure why -- I'd expect remove to remove only the first. Perhaps because indexOf/findIndex finds only the first index, find finds only the first match, etc.? And it leaves the door open for a remove that doesn't remove all (though that seems hard to justify if a minimalist standard lib prevails). Web safety of the name would need checking in any case. (Quick check says neither MooTools nor Prototype adds either remove or removeAll to Array.prototype.)

-- T.J. Crowder

# Bob Myers (6 years ago)

What's wrong with this?

function removeFromArray(array, pred) {
  let changed = false;
  let j = 0;

  for (elt of array) {
    if (pred(elt)) changed = true;
    else array[j++] = elt;
  }

  array.length = j;
  return changed;
}

Bob

# T.J. Crowder (6 years ago)

On Fri, Nov 10, 2017 at 11:41 AM, Bob Myers <rtm at gol.com> wrote:

What's wrong with this?

I had the impression he was trying to avoid callbacks, just using ===. But other than a missing const on the for-of, it looks nice and efficient -- except that it doesn't seem like for-of on arrays with the default iterator is much optimized yet. FWIW:

function removeFromArray(array, item) {
    let changed = false;
    let j, i, len, elt;

    for (j = i = 0, len = array.length; i < len; ++i) {
        elt = array[i];
        if (elt === item) {
            changed = true;
        } else {
            array[j++] = elt;
        }
    }

    array.length = j;
    return changed;
}

Clunkier but apparently we're optimizing for speed...

-- T.J. Crowder

# Bob Myers (6 years ago)

Thanks for your optimization. In one of my library routines I further optimize this with

  if (elt === item) {
    changed = true;
  } else {
    if (changed) { array[j] = elt; }
    j++;
  }

To avoid unnecessary assignments (which might be expensive--I don't know, are they?) while you're still in the portion of the array before the first element to be removed.

# Isiah Meadows (6 years ago)

Yeah...I probably should've named it removeAll like I did in my latest use case, although I still feel it's a good idea to add a simple remove to replace the common splice + indexOf idiom, removing only the first to match.

Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# Isiah Meadows (6 years ago)

Also, most generic Array methods are specialized to array-like objects, not iterables. My proposed semantics go along with this.


Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# Isiah Meadows (6 years ago)

My proposed semantics does similar, except I just read until it matches, then I skip it and incrementally copy the rest in a separate loop (avoiding an extra branch). I delete the remaining parts in the polyfill, but you wouldn't need to do that with normal arrays (just non-array array-likes).

As for performance, branch prediction matters a lot more than local variable assignment, and sequential array iteration usually hits the CPU cache line in any JIT.

Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# Isiah Meadows (6 years ago)

Inline. (Hindsight, should've included this in the previous email)

On Fri, Nov 10, 2017 at 5:57 AM, T.J. Crowder <tj.crowder at farsightsoftware.com> wrote:

On Fri, Nov 10, 2017 at 10:38 AM, Isiah Meadows <isiahmeadows at gmail.com> wrote:

My proposal is this: add a new Array.prototype.remove(item) method...

So not just a "would be nice," but you've provided a concrete reason for putting it in the standard lib: Providing an optimized version vs. what can be done in userland. Excellent. :-)

Oh, and engines could totally use SIMD vectorization in this routine for dense, homogenous integer, float, or non-numeric arrays, and they could just emit raw machine code for it. In the case of a first-item remove with integers or objects, you could search N at a time until you find the first occurrence, then from there, do an overlapping memmove (which is a good candidate for raw assembly optimization), and then just chop the last entry off. In the case of a multi-item removeAll, you could just do that first step, search N at a time, repeatedly.

In a similar vein, is there a specific reason you're modifying the array in place vs. creating a new one via filter? Memory churn? Callback cost (though that's amazingly low in modern implementations as I'm sure you know)? Lots of independent references to it? Are any/all things you've had a real-world problem with around this? (That's a question, not veiled criticism.)

As stated before, it's in lower-level, more performance-sensitive code. In these cases, memory churn is a very real thing you have to contend with, and in some cases, the allocation of an array or object is itself a potential problem. In nearly every virtual DOM library, in most , the core involves

If you want to see a case where simply fixing type checking and reducing event handler allocation (including multiple C++ round-trips) basically doubled performance in a particular common case in a benchmark, check this diff out:

MithrilJS/mithril.js/pull/1949/files#diff-000bdfae56251b0715111d2b18c9de3cL579

If you want to see a case where every branch, every function call has to be carefully monitored, check this out:

MithrilJS/mithril.js/blob/next/render/render.js#L164-L255

If you want to see another case where I've run into issues with branching and string allocation (in the former) - strings aren't as cheap as you might expect in a number-heavy context - check this out (pardon the TypeScript):

isiahmeadows/enigma/blob/master/src/scanner/string.ts#L240-L282, isiahmeadows/enigma/blob/master/src/scanner/seek.ts

It happens, just not on a regular basis with application-level code. Kind of like Array.prototype.copyWithin - it's a case where in theory, you could do it in-language, but the engine could do it immensely faster.

FWIW, if it removes all occurrences of the items, I'd call it removeAll. For some reason -- I'm not exactly sure why -- I'd expect remove to remove only the first. Perhaps because indexOf/findIndex finds only the first index, find finds only the first match, etc.? And it leaves the door open for a remove that doesn't remove all (though that seems hard to justify if a minimalist standard lib prevails). Web safety of the name would need checking in any case. (Quick check says neither MooTools nor Prototype adds either remove or removeAll to Array.prototype.)

-- T.J. Crowder


Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# T.J. Crowder (6 years ago)

On Fri, Nov 10, 2017 at 2:17 PM, Isiah Meadows <isiahmeadows at gmail.com> wrote:

Inline. (Hindsight, should've included this in the previous email)

Good deal, that's the kind of thing I was thinking would strengthen the argument.

I think you may have meant to have more after "In nearly every virtual DOM library, in most , the core involves..." ? (Ignore this if the rest of that sentence wasn't important.)

-- T.J. Crowder

# Isiah Meadows (6 years ago)

Yeah...oops. I meant to say something to tge effect of "in nearly ebery virtual DOM library, the core involves a lot of hot loops and array access, and every line of diff code is critical." I just forgot to properly proofread my reply before sending it. :-(

# Alexander Jones (6 years ago)

Small Sets could maybe be represented without the tree?

And everything you say about SIMD could be done too when the set has a viable homogeneous type.

# Jeremy Martin (6 years ago)

I'm not sure where the "don't break the web" threshold is set at (or how that gets evaluated), but I do see a lot of matches for "Array.prototype.remove" on GitHub, at least:

search?l=JavaScript&q="Array.prototype.remove"&type=Code&utf8=✓

There's also an old blog post by John Resig with an example implementation ( johnresig.com/blog/javascript-array-remove), so I'm not sure if that gave rise to any significant adoption or not.

I like the proposal in general, though - just throwing this out there in case other names should be considered (maybe discard?).

# Man Hoang (6 years ago)

There are cases that you only want to remove the first occurrence of an item (because you know there is at most one occurrence of that item) and there are cases that you want to remove more than one item. Therefore, I propose adding the following methods (written in TypeScript here for static typing support), optimizing for each use case.

class Array<E> {
    /**
     * Removes the first occurrence of [item] from this array.
     *
     * Returns `true` if [item] was in this array and `false` otherwise.
     */
    remove(item: E): boolean {
        const i = this.indexOf(item);
        if (i >= 0) {
            this.splice(i, 1);
            return true;
        }
        return false;
    }

    /**
     * Removes all items that satisfies the predicate [test].
     * @returns The number of removed items.
     */
    removeWhere(test: (item: E) => boolean): number {
        let count = 0;
        for (let i = this.length - 1; i >= 0; i--) {
            if (test(this[i])) {
                this.splice(i, 1);
                count++;
            }
        }
        return count;
    }
}

References

# Jordan Harband (6 years ago)

What's the benefit of adding more array mutator methods? The general idioms of the community have moved towards approaches that create a new object, rather than changing the old one - ie, .filter would be appropriate here.

# Man Hoang (6 years ago)

The benefits are

  • efficient (memory & speed)
  • clear intent
  • concise

There are always use cases where you want to mutate arrays.

How would you rewrite the following deselect method using filter?

export class Selector<T> {
    private _values: T[] = [];

    get values(): ReadonlyArray<T> {
        return this._values;
    }

    /**
     * Removes [value] from the list of selected items.
     *
     * Returns `true` if [value] was previously selected, `false` otherwise.
     */
    deselect(value: T): boolean {
        if (this._values.remove(value)) {
            this.selectionChanged([], [value]);
            return true;
        }
        return false;
    }

    /**
     * Adds [value] to the list of selected items.
     *
     * Returns `true` if [value] was not previously selected, `false` otherwise.
     */
    select(value: T): boolean {
        if (this._values.pushIfAbsent(value)) {
            this.selectionChanged([value], []);
            return true;
        }
        return false;
    }

    protected selectionChanged(addedValues, removedValues) {
        // Do something such as firing an event.
    }
}
# Claude Pache (6 years ago)

For that specific example, I think that a Set is more appropriate:

export class Selector<T> {
    private _set = new Set;
 
    get values(): ReadonlyArray<T> {
        return Array.from(this._set);
        // although it might be better to return an iterator: return this._set.values();
    }

    deselect(value: T): boolean {
        if (this._set.delete(value)) {
            this.selectionChanged([], [value]);
            return true;
        }
        return false;
    }

    // etc.
}

More generally, each time you are tempted to use an add()/remove()/toggle()/contains()/etc() method on an Array, you should ask yourself if it would not be better (more efficient, clearer intent, more concise, ...) to use a Set instead.

# Man Hoang (6 years ago)

The problem with Set is that its add method returns this instead of boolean. If Set.prototype.add returned boolean, I would have used Set.

That’s why in the select method of my sample code, I use a custom defined method named pushIfAbsent. The actual type of _values is not Array but a subclass of Array.

export class MyArray<E> extends Array<E> {
    /** Whether this array is empty. */
    get isEmpty(): boolean {
        return this.length === 0;
    }

    /**
     * Adds [item] to the end of this array if it's not already in this array.
     *
     * Returns `true` is [item] was added, `false` otherwise.
     */
    pushIfAbsent(item: E): boolean {
        if (!this.includes(item)) {
            this.push(item);
            return true;
        }
        return false;
    }

    /**
     * Removes the first occurrence of [item] from this array.
     *
     * Returns `true` if [item] was in this array, `false` otherwise.
     */
    remove(item: E): boolean {
        const i = this.indexOf(item);
        if (i >= 0) {
            this.splice(i, 1);
            return true;
        }
        return false;
    }
}
# kai zhu (6 years ago)

i don’t have strong opinion on Array.prototype.remove, but i have strong opinion against your use-case to subclass/extend Array. from a product-development perspective, you're creating unnecessary integration-headaches by having the client <-> server system pass around <MyArray> instances instead of plain JSON arrays.

i remain unconvinced of any real-world benefit to subclassing Array, that justifies the cost-added to already painful task of debugging end-to-end communication-issues between client <-> server. your web-project will have a less-painful integration/qa process, if you stick with plain JSON arrays employing static-functions instead:

/*jslint devel*/
(function () {
    "use strict";
    var myArray1;
    function arrayRemoveItem(array, item) {
    /**
     * This static-function will:
     * Remove the first occurrence of [item] from this array.
     * Return `true` if [item] was in this array, `false` otherwise.
     */
        var i = array.indexOf(item);
        if (i >= 0) {
            array.splice(i, 1);
            return true;
        }
        return false;
    }
    myArray1 = [1, 2, 3, 4]; // [1, 2, 3, 4]
    console.log(arrayRemoveItem(myArray1, 2)); // true
    console.log(myArray1); // [1, 3, 4]
}());

kai zhu kaizhu256 at gmail.com

# Claude Pache (6 years ago)

Le 10 oct. 2018 à 23:17, kai zhu <kaizhu256 at gmail.com> a écrit :

hi Man, i don’t have strong opinion on Array.prototype.remove, but i have strong opinion against your use-case to subclass/extend Array. from a product-development perspective, you're creating unnecessary integration-headaches by having the client <-> server system pass around <MyArray> instances instead of plain JSON arrays.

Not every object in JS is intended to travel between client and server, or to be stored in JSON. And if/when the class want to share some information, it can easily translate its data to and from a JSON-friendly format. (Sorry, I’m not motivated to read the rest of your e-mail.)

# Jordan Harband (6 years ago)

Man: add doesn't need to return a boolean, because it always results in the item being in the collection after the fact. You could subclass Set, and make .add do that, though, if you like! Alternatively, you could use .has prior to calling .add, to get your boolean value.

# kai zhu (6 years ago)

Not every object in JS is intended to travel between client and server, or to be stored in JSON. And if/when the class want to share some information, it can easily translate its data to and from a JSON-friendly format.

hi Claude, agree to disagree. its impossible to predict/design what should and shouldn’t travel between client <-> server in javascript product-development.

the best you can do as an “architect” is to accept that anything of-consequence in javascript will eventually need to "leak" itself to the "client <-> server <-> db JSON/urlstring dance", over the normal-flow of UX-features getting added to a product; and that much of the tech-debt in these projects arise from needless class constructor/serialization hacks to make that dance happen, when it wasn’t expected (and could’ve been avoided by sticking with plain JSON data-structures).

even in esoteric, seemingly non-web-related cases like using duktape in c++: why would anyone want to have embedded js-capability in such scenario? the only use-case that comes to mind is for manipulating JSON data-structures, and again, passing plain JSON-data between "c++ program" <-> “external web-system"

kai zhu kaizhu256 at gmail.com

# Sanford Whiteman (6 years ago)

You need to cite your sources for the claim that "most of the tech debt" in JavaScript product development is due to accidentally using types other than 20-year-old built-ins and having to figure out the daunting task of JSON serialization.

—— Sandy

# Ron Buckton (6 years ago)

That depends entirely on what the return value means. Returning Boolean from add doesn’t mean, “does the value now exist in the set”, but rather means “was the set modified as a result of this operation”.

To avoid any possible performance cost for calling has before add, I usually have to do something like:

function setAdd(set, value) {
  const size = set.size;
  set.add(value);
  return size !== set.size;
}

From: es-discuss <es-discuss-bounces at mozilla.org> On Behalf Of Jordan Harband

Sent: Wednesday, October 10, 2018 7:19 PM To: jolleekin at outlook.com Cc: es-discuss <es-discuss at mozilla.org>

Subject: Re: Array.prototype.remove(item)

Man: add doesn't need to return a boolean, because it always results in the item being in the collection after the fact. You could subclass Set, and make .add do that, though, if you like! Alternatively, you could use .has prior to calling .add, to get your boolean value.

On Wed, Oct 10, 2018 at 1:01 AM Man Hoang <jolleekin at outlook.com<mailto:jolleekin at outlook.com>> wrote:

The problem with Set is that its add method returns this instead of boolean. If Set.prototype.add returned boolean, I would have used Set.

That’s why in the select method of my sample code, I use a custom defined method named pushIfAbsent. The actual type of _values is not Array but a subclass of Array.

export class MyArray<E> extends Array<E> {
    /**
     * Adds [item] to the end of this array if it's not already in this array.
     *
     * Returns `true` is [item] was added, `false` otherwise.
     */
    pushIfAbsent(item: E): boolean {
        if (!this.includes(item)) {
            this.push(item);
            return true;
        }
        return false;
    }

    /**
     * Removes the first occurrence of [item] from this array.
     *
     * Returns `true` if [item] was in this array, `false` otherwise.
     */
    remove(item: E): boolean {
        const i = this.indexOf(item);
        if (i >= 0) {
            this.splice(i, 1);
            return true;
        }
        return false;
    }
}
# kai zhu (6 years ago)

You need to cite your sources

hi Sandy, sure hear are 2 sources:

  1. (80% reduction in tech-debt from 5000 -> 1000 sloc) - commit to remove [class-based] Backbone, Handlebars, and other dependencies from fork of swagger-ui [1]. everything was eventually converted to use pure JSON data-structures, to make client <-> server JSON-serialization as painless as possible.

  2. (90% reduction in tech-debt from 11,000 -> 1000 sloc) - 3 commits to remove class-dependencies from fork of nedb [2] and rely on pure JSON data-structures to allow easier import/export of persistent db-data

[1] github commit to remove class-dependencies from swagger-ui kaizhu256/node-swgg/commit/95d7c357df083cdf80ba595ae22bf76cb90c5a8c, kaizhu256/node-swgg/commit/95d7c357df083cdf80ba595ae22bf76cb90c5a8c

[2] github commits to remove class-dependencies from nedb kaizhu256/node-db-lite/compare/c6fd675ceaf1104d513183b622a3c6227133a1c7...a30e0b67485f16bed813015adbb6349c481b438d, kaizhu256/node-db-lite/compare/c6fd675ceaf1104d513183b622a3c6227133a1c7...a30e0b67485f16bed813015adbb6349c481b438d

kai zhu kaizhu256 at gmail.com

# Sanford Whiteman (6 years ago)

You need to cite your sources

hi Sandy, sure hear are 2 sources:

So you believe these 2 patches prove "most of the tech debt" across all JavaScript product development is due to this factor.

Huh.

Well, to each their own as far as how "proof" works. I prefer the classical definition. You know, the one where you use the actual population you're generalizing about.

# Man Hoang (6 years ago)

With Array.prototype.remove(item)

Code is clean and concise.

if (array.remove(item)) {
    // [item] was removed, do something here.
}

 

Without Array.prototype.remove(item)

Code is verbose or duplicated (everyone has their own version of removeArrayItem) => WASTE.

// Using `indexOf` and `splice` directly.

const i = array.indexOf(item);
if (i >= 0) {
    array.splice(i, 1);
    // [item] was removed, do something here.
}

// Using a helper function.

if (removeArrayItem(array, item)) {
    // [item] was removed, do something here.
}

   

With Set.prototype.add returning a boolean

Code is clean and concise. Together, add and delete are a perfect pair.

if (set.add(item)) {
    // [item] was added, do something here.
}

if (set.delete(item)) {
    // [item] was deleted, do something here.
}

 

With Set.prototype.add returning this (almost, if not always, useless)

Code is verbose or duplicated (everyone has their own version of addItemToSetIfAbsent) => WASTE.

// Using `has` and `add`.

if (!set.has(item)) {
    set.add(item);
    // [item] was added, do something here.
}

// Using `size` and `add`.

const oldSize = set.size;
set.add(item);
if (set.size > oldSize) {
    // [item] was added, do something here.
}

// Using a helper function.

if (addItemToSetIfAbsent(set, item)) {
    // [item] was added, do something here.
}

    My point of view is JS and the web itself were not designed for app development. They have constantly been hacked to do so. If they had been, people wouldn’t have wasted too many resources creating workarounds (not solutions, not technologies) such as jQuery, Ember, Knockout, Polymer, Angular, React, Vue, GWT, CoffeeScript, TypeScript, and so on. No matter how we change JS, its inherent problems remain because we cannot introduce breaking changes. (Please do not argue about this as it will lead to nowhere)

Having said that, let's get back to the main point of this thread and this whole website: proposing ideas to make JS better (or less terrible). For me, changes such as adding Array.prototype.remove(item) and modifying Set.prototype.add to return a boolean may seem small but would have a huge impact when multiplied by the number of usages.

One small improvement repeated many times makes a big improvement.

Many small improvements combined make a big improvement.

# Andrea Giammarchi (6 years ago)

TBH, for consistency sake with everything else in JS, I'd call the method delete, since it's the one that returns true or false in pretty much every other case.

# T.J. Crowder (6 years ago)

On Thu, Oct 11, 2018 at 10:59 AM Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:

TBH, for consistency sake with everything else in JS, I'd call the method delete, since it's the one that returns true or false in pretty much every other case.

E.g., in Maps and Sets (and their weak cousins).

-- T.J. Crowder

# Andrea Giammarchi (6 years ago)

but also delete obj[key] ... not identical as operation, but with a boolean return.

That leaves room for eventual Array#drop(item) that returns the item after optionally mutating the array ^_^;;