Array.prototype.remove(item)

# Isiah Meadows (9 months ago)

I find myself frequently using this code to remove an item from an array:

var index = array.indexOf(item)
if (index >= 0) array.splice(index, 1)

That's nice, but there's a couple problems:

  1. It only removes the first element, not all occurrences in the array (which is usually what you want).
  2. I know of no engine that fuses those calls, and it'd be rather difficult to do so.

My proposal is this: add a new Array.prototype.remove(item) method that basically does this:

  • Accept a list of items to remove, passed as arguments.
  • Remove all instances of those items from the array.
  • Return true if any entries were removed, false otherwise.

And in the process, it should take care to do it in an optimized fashion, avoiding the overhead of allocation, excess reads/writes, and excess branching.

In case you're curious why I'm not using Set, here's why that's ill-equipped in many cases:

  • Sets take up a lot more memory than arrays, by necessity.
  • Sets are stored typically in tree-like structures, making for poor iteration performance.
  • If you're frequently iterating, but infrequently adding/removing, these two will quickly kill your performance.

It rarely comes up in day-to-day coding, but the difference is substantial when you're writing lower-level data structures and performance-sensitive code. It's probably down a similar vein to Array.prototype.copyWithin, which was pretty much designed for memcpy or memmove.

Here would be a polyfill of what I'd like to propose (it uses a minimum of writes, by design, and it does everything in one pass):

if (typeof Array.prototype.remove !== "function") {
    // Private utility
    const includes = (items, itemLength, entry) => {
        for (let j = 0; j < itemLength; j++) {
            if (items[j] === entry) return true
        }

        return false
    }

    Array.prototype.remove = function (item) {
        const oldLength = this.length
        const newLength = arguments.length
        // This is technically observably no different than
        const items = [...arguments]
        let newLength = 0

        for (let i = 0; i < oldLength; i++) {
            const entry = this[i]
            if (includes(items, newLength, entry)) {
                let newLength = i++

                while (i !== list.length) {
                    const entry = this[i]
                    if (!includes(items, newLength, entry)) {
                        this[newLength++] = entry
                    }
                    i++
                }

                this.length = newLength
                for (let i = newLength; i < oldLength; i++) delete this[i]
                return true
            }
        }

        return false
    }
}

If the variadic version wasn't accepted, there's also the non-variadic version:

if (typeof Array.prototype.remove !== "function") {
    Array.prototype.remove = function (item) {
        const oldLength = this.length
        let newLength = 0

        for (let i = 0; i < oldLength; i++) {
            const entry = this[i]
            if (entry === item) {
                let newLength = i++

                while (i !== list.length) {
                    const entry = this[i]
                    if (entry !== item) this[newLength++] = entry
                    i++
                }

                this.length = newLength
                for (let i = newLength; i < oldLength; i++) delete this[i]
                return true
            }
        }

        return false
    }
}

Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# T.J. Crowder (9 months ago)

On Fri, Nov 10, 2017 at 10:38 AM, Isiah Meadows <isiahmeadows at gmail.com>

wrote:

My proposal is this: add a new Array.prototype.remove(item) method...

So not just a "would be nice," but you've provided a concrete reason for putting it in the standard lib: Providing an optimized version vs. what can be done in userland. Excellent. :-)

In a similar vein, is there a specific reason you're modifying the array in place vs. creating a new one via filter? Memory churn? Callback cost (though that's amazingly low in modern implementations as I'm sure you know)? Lots of independent references to it? Are any/all things you've had a real-world problem with around this? (That's a question, not veiled criticism.)

FWIW, if it removes all occurrences of the items, I'd call it removeAll. For some reason -- I'm not exactly sure why -- I'd expect remove to remove only the first. Perhaps because indexOf/findIndex finds only the first index, find finds only the first match, etc.? And it leaves the door open for a remove that doesn't remove all (though that seems hard to justify if a minimalist standard lib prevails). Web safety of the name would need checking in any case. (Quick check says neither MooTools nor Prototype adds either remove or removeAll to Array.prototype.)

-- T.J. Crowder

# Bob Myers (9 months ago)

What's wrong with this?

function removeFromArray(array, pred) {
  let changed = false;
  let j = 0;

  for (elt of array) {
    if (pred(elt)) changed = true;
    else array[j++] = elt;
  }

  array.length = j;
  return changed;
}

Bob

# T.J. Crowder (9 months ago)

On Fri, Nov 10, 2017 at 11:41 AM, Bob Myers <rtm at gol.com> wrote:

What's wrong with this?

I had the impression he was trying to avoid callbacks, just using ===. But other than a missing const on the for-of, it looks nice and efficient -- except that it doesn't seem like for-of on arrays with the default iterator is much optimized yet. FWIW:

function removeFromArray(array, item) {
    let changed = false;
    let j, i, len, elt;

    for (j = i = 0, len = array.length; i < len; ++i) {
        elt = array[i];
        if (elt === item) {
            changed = true;
        } else {
            array[j++] = elt;
        }
    }

    array.length = j;
    return changed;
}

Clunkier but apparently we're optimizing for speed...

-- T.J. Crowder

# Bob Myers (9 months ago)

Thanks for your optimization. In one of my library routines I further optimize this with

  if (elt === item) {
    changed = true;
  } else {
    if (changed) { array[j] = elt; }
    j++;
  }

To avoid unnecessary assignments (which might be expensive--I don't know, are they?) while you're still in the portion of the array before the first element to be removed.

# Isiah Meadows (9 months ago)

Yeah...I probably should've named it removeAll like I did in my latest use case, although I still feel it's a good idea to add a simple remove to replace the common splice + indexOf idiom, removing only the first to match.

Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# Isiah Meadows (9 months ago)

Also, most generic Array methods are specialized to array-like objects, not iterables. My proposed semantics go along with this.


Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# Isiah Meadows (9 months ago)

My proposed semantics does similar, except I just read until it matches, then I skip it and incrementally copy the rest in a separate loop (avoiding an extra branch). I delete the remaining parts in the polyfill, but you wouldn't need to do that with normal arrays (just non-array array-likes).

As for performance, branch prediction matters a lot more than local variable assignment, and sequential array iteration usually hits the CPU cache line in any JIT.

Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# Isiah Meadows (9 months ago)

Inline. (Hindsight, should've included this in the previous email)

On Fri, Nov 10, 2017 at 5:57 AM, T.J. Crowder <tj.crowder at farsightsoftware.com> wrote:

On Fri, Nov 10, 2017 at 10:38 AM, Isiah Meadows <isiahmeadows at gmail.com> wrote:

My proposal is this: add a new Array.prototype.remove(item) method...

So not just a "would be nice," but you've provided a concrete reason for putting it in the standard lib: Providing an optimized version vs. what can be done in userland. Excellent. :-)

Oh, and engines could totally use SIMD vectorization in this routine for dense, homogenous integer, float, or non-numeric arrays, and they could just emit raw machine code for it. In the case of a first-item remove with integers or objects, you could search N at a time until you find the first occurrence, then from there, do an overlapping memmove (which is a good candidate for raw assembly optimization), and then just chop the last entry off. In the case of a multi-item removeAll, you could just do that first step, search N at a time, repeatedly.

In a similar vein, is there a specific reason you're modifying the array in place vs. creating a new one via filter? Memory churn? Callback cost (though that's amazingly low in modern implementations as I'm sure you know)? Lots of independent references to it? Are any/all things you've had a real-world problem with around this? (That's a question, not veiled criticism.)

As stated before, it's in lower-level, more performance-sensitive code. In these cases, memory churn is a very real thing you have to contend with, and in some cases, the allocation of an array or object is itself a potential problem. In nearly every virtual DOM library, in most , the core involves

If you want to see a case where simply fixing type checking and reducing event handler allocation (including multiple C++ round-trips) basically doubled performance in a particular common case in a benchmark, check this diff out:

MithrilJS/mithril.js/pull/1949/files#diff-000bdfae56251b0715111d2b18c9de3cL579

If you want to see a case where every branch, every function call has to be carefully monitored, check this out:

MithrilJS/mithril.js/blob/next/render/render.js#L164-L255

If you want to see another case where I've run into issues with branching and string allocation (in the former) - strings aren't as cheap as you might expect in a number-heavy context - check this out (pardon the TypeScript):

isiahmeadows/enigma/blob/master/src/scanner/string.ts#L240-L282, isiahmeadows/enigma/blob/master/src/scanner/seek.ts

It happens, just not on a regular basis with application-level code. Kind of like Array.prototype.copyWithin - it's a case where in theory, you could do it in-language, but the engine could do it immensely faster.

FWIW, if it removes all occurrences of the items, I'd call it removeAll. For some reason -- I'm not exactly sure why -- I'd expect remove to remove only the first. Perhaps because indexOf/findIndex finds only the first index, find finds only the first match, etc.? And it leaves the door open for a remove that doesn't remove all (though that seems hard to justify if a minimalist standard lib prevails). Web safety of the name would need checking in any case. (Quick check says neither MooTools nor Prototype adds either remove or removeAll to Array.prototype.)

-- T.J. Crowder


Isiah Meadows me at isiahmeadows.com

Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com

# T.J. Crowder (9 months ago)

On Fri, Nov 10, 2017 at 2:17 PM, Isiah Meadows <isiahmeadows at gmail.com> wrote:

Inline. (Hindsight, should've included this in the previous email)

Good deal, that's the kind of thing I was thinking would strengthen the argument.

I think you may have meant to have more after "In nearly every virtual DOM library, in most , the core involves..." ? (Ignore this if the rest of that sentence wasn't important.)

-- T.J. Crowder

# Isiah Meadows (9 months ago)

Yeah...oops. I meant to say something to tge effect of "in nearly ebery virtual DOM library, the core involves a lot of hot loops and array access, and every line of diff code is critical." I just forgot to properly proofread my reply before sending it. :-(

# Alexander Jones (9 months ago)

Small Sets could maybe be represented without the tree?

And everything you say about SIMD could be done too when the set has a viable homogeneous type.

# Jeremy Martin (9 months ago)

I'm not sure where the "don't break the web" threshold is set at (or how that gets evaluated), but I do see a lot of matches for "Array.prototype.remove" on GitHub, at least:

search?l=JavaScript&q="Array.prototype.remove"&type=Code&utf8=✓

There's also an old blog post by John Resig with an example implementation ( johnresig.com/blog/javascript-array-remove), so I'm not sure if that gave rise to any significant adoption or not.

I like the proposal in general, though - just throwing this out there in case other names should be considered (maybe discard?).