Array.prototype.remove(item)
On Fri, Nov 10, 2017 at 10:38 AM, Isiah Meadows <isiahmeadows at gmail.com>
wrote:
My proposal is this: add a new
Array.prototype.remove(item)
method...
So not just a "would be nice," but you've provided a concrete reason for putting it in the standard lib: Providing an optimized version vs. what can be done in userland. Excellent. :-)
In a similar vein, is there a specific reason you're modifying the array in
place vs. creating a new one via filter
? Memory churn? Callback cost
(though that's amazingly low in modern implementations as I'm sure you
know)? Lots of independent references to it? Are any/all things you've had
a real-world problem with around this? (That's a question, not veiled
criticism.)
FWIW, if it removes all occurrences of the items, I'd call it removeAll
.
For some reason -- I'm not exactly sure why -- I'd expect remove
to
remove only the first. Perhaps because indexOf
/findIndex
finds only the
first index, find
finds only the first match, etc.? And it leaves the
door open for a remove
that doesn't remove all (though that seems hard to
justify if a minimalist standard lib prevails). Web safety of the name
would need checking in any case. (Quick check says neither MooTools nor
Prototype adds either remove
or removeAll
to Array.prototype
.)
-- T.J. Crowder
What's wrong with this?
function removeFromArray(array, pred) {
let changed = false;
let j = 0;
for (elt of array) {
if (pred(elt)) changed = true;
else array[j++] = elt;
}
array.length = j;
return changed;
}
Bob
On Fri, Nov 10, 2017 at 11:41 AM, Bob Myers <rtm at gol.com> wrote:
What's wrong with this?
I had the impression he was trying to avoid callbacks, just using ===
.
But other than a missing const
on the for-of
, it looks nice and
efficient -- except that it doesn't seem like for-of
on arrays with the
default iterator is much optimized yet. FWIW:
function removeFromArray(array, item) {
let changed = false;
let j, i, len, elt;
for (j = i = 0, len = array.length; i < len; ++i) {
elt = array[i];
if (elt === item) {
changed = true;
} else {
array[j++] = elt;
}
}
array.length = j;
return changed;
}
Clunkier but apparently we're optimizing for speed...
-- T.J. Crowder
Thanks for your optimization. In one of my library routines I further optimize this with
if (elt === item) {
changed = true;
} else {
if (changed) { array[j] = elt; }
j++;
}
To avoid unnecessary assignments (which might be expensive--I don't know, are they?) while you're still in the portion of the array before the first element to be removed.
Yeah...I probably should've named it removeAll
like I did in my
latest use case, although I still feel it's a good idea to add a
simple remove
to replace the common splice
+ indexOf
idiom,
removing only the first to match.
Isiah Meadows me at isiahmeadows.com
Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com
Also, most generic Array methods are specialized to array-like objects, not iterables. My proposed semantics go along with this.
Isiah Meadows me at isiahmeadows.com
Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com
My proposed semantics does similar, except I just read until it
matches, then I skip it and incrementally copy the rest in a separate
loop (avoiding an extra branch). I delete
the remaining parts in the
polyfill, but you wouldn't need to do that with normal arrays (just
non-array array-likes).
As for performance, branch prediction matters a lot more than local variable assignment, and sequential array iteration usually hits the CPU cache line in any JIT.
Isiah Meadows me at isiahmeadows.com
Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com
Inline. (Hindsight, should've included this in the previous email)
On Fri, Nov 10, 2017 at 5:57 AM, T.J. Crowder <tj.crowder at farsightsoftware.com> wrote:
On Fri, Nov 10, 2017 at 10:38 AM, Isiah Meadows <isiahmeadows at gmail.com> wrote:
My proposal is this: add a new
Array.prototype.remove(item)
method...So not just a "would be nice," but you've provided a concrete reason for putting it in the standard lib: Providing an optimized version vs. what can be done in userland. Excellent. :-)
Oh, and engines could totally use SIMD vectorization in this routine
for dense, homogenous integer, float, or non-numeric arrays, and they
could just emit raw machine code for it. In the case of a first-item
remove
with integers or objects, you could search N at a time until
you find the first occurrence, then from there, do an overlapping
memmove
(which is a good candidate for raw assembly optimization),
and then just chop the last entry off. In the case of a multi-item
removeAll
, you could just do that first step, search N at a time,
repeatedly.
In a similar vein, is there a specific reason you're modifying the array in place vs. creating a new one via
filter
? Memory churn? Callback cost (though that's amazingly low in modern implementations as I'm sure you know)? Lots of independent references to it? Are any/all things you've had a real-world problem with around this? (That's a question, not veiled criticism.)
As stated before, it's in lower-level, more performance-sensitive code. In these cases, memory churn is a very real thing you have to contend with, and in some cases, the allocation of an array or object is itself a potential problem. In nearly every virtual DOM library, in most , the core involves
If you want to see a case where simply fixing type checking and reducing event handler allocation (including multiple C++ round-trips) basically doubled performance in a particular common case in a benchmark, check this diff out:
MithrilJS/mithril.js/pull/1949/files#diff-000bdfae56251b0715111d2b18c9de3cL579
If you want to see a case where every branch, every function call has to be carefully monitored, check this out:
MithrilJS/mithril.js/blob/next/render/render.js#L164-L255
If you want to see another case where I've run into issues with branching and string allocation (in the former) - strings aren't as cheap as you might expect in a number-heavy context - check this out (pardon the TypeScript):
isiahmeadows/enigma/blob/master/src/scanner/string.ts#L240-L282, isiahmeadows/enigma/blob/master/src/scanner/seek.ts
It happens, just not on a regular basis with application-level code.
Kind of like Array.prototype.copyWithin
- it's a case where in
theory, you could do it in-language, but the engine could do it
immensely faster.
FWIW, if it removes all occurrences of the items, I'd call it
removeAll
. For some reason -- I'm not exactly sure why -- I'd expectremove
to remove only the first. Perhaps becauseindexOf
/findIndex
finds only the first index,find
finds only the first match, etc.? And it leaves the door open for aremove
that doesn't remove all (though that seems hard to justify if a minimalist standard lib prevails). Web safety of the name would need checking in any case. (Quick check says neither MooTools nor Prototype adds eitherremove
orremoveAll
toArray.prototype
.)-- T.J. Crowder
Isiah Meadows me at isiahmeadows.com
Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com
On Fri, Nov 10, 2017 at 2:17 PM, Isiah Meadows <isiahmeadows at gmail.com> wrote:
Inline. (Hindsight, should've included this in the previous email)
Good deal, that's the kind of thing I was thinking would strengthen the argument.
I think you may have meant to have more after "In nearly every virtual DOM library, in most , the core involves..." ? (Ignore this if the rest of that sentence wasn't important.)
-- T.J. Crowder
Yeah...oops. I meant to say something to tge effect of "in nearly ebery virtual DOM library, the core involves a lot of hot loops and array access, and every line of diff code is critical." I just forgot to properly proofread my reply before sending it. :-(
Small Sets could maybe be represented without the tree?
And everything you say about SIMD could be done too when the set has a viable homogeneous type.
I'm not sure where the "don't break the web" threshold is set at (or how that gets evaluated), but I do see a lot of matches for "Array.prototype.remove" on GitHub, at least:
search?l=JavaScript&q="Array.prototype.remove"&type=Code&utf8=✓
There's also an old blog post by John Resig with an example implementation ( johnresig.com/blog/javascript-array-remove), so I'm not sure if that gave rise to any significant adoption or not.
I like the proposal in general, though - just throwing this out there in
case other names should be considered (maybe discard
?).
There are cases that you only want to remove the first occurrence of an item (because you know there is at most one occurrence of that item) and there are cases that you want to remove more than one item. Therefore, I propose adding the following methods (written in TypeScript here for static typing support), optimizing for each use case.
class Array<E> {
/**
* Removes the first occurrence of [item] from this array.
*
* Returns `true` if [item] was in this array and `false` otherwise.
*/
remove(item: E): boolean {
const i = this.indexOf(item);
if (i >= 0) {
this.splice(i, 1);
return true;
}
return false;
}
/**
* Removes all items that satisfies the predicate [test].
* @returns The number of removed items.
*/
removeWhere(test: (item: E) => boolean): number {
let count = 0;
for (let i = this.length - 1; i >= 0; i--) {
if (test(this[i])) {
this.splice(i, 1);
count++;
}
}
return count;
}
}
References
- [C#] List.Remove docs.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1.remove?view=netframework-4.7.2
- [C#] List.RemoveAll docs.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1.removeall?view=netframework-4.7.2
- [Dart] List.remove api.dartlang.org/stable/2.0.0/dart-core/List/remove.html
- [Dart] List.removeWhere api.dartlang.org/stable/2.0.0/dart-core/List/removeWhere.html
What's the benefit of adding more array mutator methods? The general idioms of the community have moved towards approaches that create a new object, rather than changing the old one - ie, .filter would be appropriate here.
The benefits are
- efficient (memory & speed)
- clear intent
- concise
There are always use cases where you want to mutate arrays.
How would you rewrite the following deselect
method using filter
?
export class Selector<T> {
private _values: T[] = [];
get values(): ReadonlyArray<T> {
return this._values;
}
/**
* Removes [value] from the list of selected items.
*
* Returns `true` if [value] was previously selected, `false` otherwise.
*/
deselect(value: T): boolean {
if (this._values.remove(value)) {
this.selectionChanged([], [value]);
return true;
}
return false;
}
/**
* Adds [value] to the list of selected items.
*
* Returns `true` if [value] was not previously selected, `false` otherwise.
*/
select(value: T): boolean {
if (this._values.pushIfAbsent(value)) {
this.selectionChanged([value], []);
return true;
}
return false;
}
protected selectionChanged(addedValues, removedValues) {
// Do something such as firing an event.
}
}
For that specific example, I think that a Set is more appropriate:
export class Selector<T> {
private _set = new Set;
get values(): ReadonlyArray<T> {
return Array.from(this._set);
// although it might be better to return an iterator: return this._set.values();
}
deselect(value: T): boolean {
if (this._set.delete(value)) {
this.selectionChanged([], [value]);
return true;
}
return false;
}
// etc.
}
More generally, each time you are tempted to use an add()/remove()/toggle()/contains()/etc() method on an Array, you should ask yourself if it would not be better (more efficient, clearer intent, more concise, ...) to use a Set instead.
The problem with Set
is that its add
method returns this
instead of boolean
. If Set.prototype.add
returned boolean
, I would have used Set
.
That’s why in the select
method of my sample code, I use a custom defined method named pushIfAbsent
. The actual type of _values
is not Array
but a subclass of Array
.
export class MyArray<E> extends Array<E> {
/** Whether this array is empty. */
get isEmpty(): boolean {
return this.length === 0;
}
/**
* Adds [item] to the end of this array if it's not already in this array.
*
* Returns `true` is [item] was added, `false` otherwise.
*/
pushIfAbsent(item: E): boolean {
if (!this.includes(item)) {
this.push(item);
return true;
}
return false;
}
/**
* Removes the first occurrence of [item] from this array.
*
* Returns `true` if [item] was in this array, `false` otherwise.
*/
remove(item: E): boolean {
const i = this.indexOf(item);
if (i >= 0) {
this.splice(i, 1);
return true;
}
return false;
}
}
i don’t have strong opinion on Array.prototype.remove, but i have strong opinion against your use-case to subclass/extend Array. from a product-development perspective, you're creating unnecessary integration-headaches by having the client <-> server system pass around <MyArray> instances instead of plain JSON arrays.
i remain unconvinced of any real-world benefit to subclassing Array, that justifies the cost-added to already painful task of debugging end-to-end communication-issues between client <-> server. your web-project will have a less-painful integration/qa process, if you stick with plain JSON arrays employing static-functions instead:
/*jslint devel*/
(function () {
"use strict";
var myArray1;
function arrayRemoveItem(array, item) {
/**
* This static-function will:
* Remove the first occurrence of [item] from this array.
* Return `true` if [item] was in this array, `false` otherwise.
*/
var i = array.indexOf(item);
if (i >= 0) {
array.splice(i, 1);
return true;
}
return false;
}
myArray1 = [1, 2, 3, 4]; // [1, 2, 3, 4]
console.log(arrayRemoveItem(myArray1, 2)); // true
console.log(myArray1); // [1, 3, 4]
}());
kai zhu kaizhu256 at gmail.com
Le 10 oct. 2018 à 23:17, kai zhu <kaizhu256 at gmail.com> a écrit :
hi Man, i don’t have strong opinion on Array.prototype.remove, but i have strong opinion against your use-case to subclass/extend Array. from a product-development perspective, you're creating unnecessary integration-headaches by having the client <-> server system pass around <MyArray> instances instead of plain JSON arrays.
Not every object in JS is intended to travel between client and server, or to be stored in JSON. And if/when the class want to share some information, it can easily translate its data to and from a JSON-friendly format. (Sorry, I’m not motivated to read the rest of your e-mail.)
Man: add
doesn't need to return a boolean, because it always results in
the item being in the collection after the fact. You could subclass Set,
and make .add
do that, though, if you like! Alternatively, you could use
.has
prior to calling .add
, to get your boolean value.
Not every object in JS is intended to travel between client and server, or to be stored in JSON. And if/when the class want to share some information, it can easily translate its data to and from a JSON-friendly format.
hi Claude, agree to disagree. its impossible to predict/design what should and shouldn’t travel between client <-> server in javascript product-development.
the best you can do as an “architect” is to accept that anything of-consequence in javascript will eventually need to "leak" itself to the "client <-> server <-> db JSON/urlstring dance", over the normal-flow of UX-features getting added to a product; and that much of the tech-debt in these projects arise from needless class constructor/serialization hacks to make that dance happen, when it wasn’t expected (and could’ve been avoided by sticking with plain JSON data-structures).
even in esoteric, seemingly non-web-related cases like using duktape in c++: why would anyone want to have embedded js-capability in such scenario? the only use-case that comes to mind is for manipulating JSON data-structures, and again, passing plain JSON-data between "c++ program" <-> “external web-system"
kai zhu kaizhu256 at gmail.com
You need to cite your sources for the claim that "most of the tech debt" in JavaScript product development is due to accidentally using types other than 20-year-old built-ins and having to figure out the daunting task of JSON serialization.
—— Sandy
That depends entirely on what the return value means. Returning Boolean from add
doesn’t mean, “does the value now exist in the set”, but rather means “was the set modified as a result of this operation”.
To avoid any possible performance cost for calling has
before add
, I usually have to do something like:
function setAdd(set, value) {
const size = set.size;
set.add(value);
return size !== set.size;
}
From: es-discuss <es-discuss-bounces at mozilla.org> On Behalf Of Jordan Harband
Sent: Wednesday, October 10, 2018 7:19 PM To: jolleekin at outlook.com Cc: es-discuss <es-discuss at mozilla.org>
Subject: Re: Array.prototype.remove(item)
Man: add
doesn't need to return a boolean, because it always results in the item being in the collection after the fact. You could subclass Set, and make .add
do that, though, if you like! Alternatively, you could use .has
prior to calling .add
, to get your boolean value.
On Wed, Oct 10, 2018 at 1:01 AM Man Hoang <jolleekin at outlook.com<mailto:jolleekin at outlook.com>> wrote:
The problem with Set
is that its add
method returns this
instead of boolean
. If Set.prototype.add
returned boolean
, I would have used Set
.
That’s why in the select
method of my sample code, I use a custom defined method named pushIfAbsent
. The actual type of _values
is not Array
but a subclass of Array
.
export class MyArray<E> extends Array<E> {
/**
* Adds [item] to the end of this array if it's not already in this array.
*
* Returns `true` is [item] was added, `false` otherwise.
*/
pushIfAbsent(item: E): boolean {
if (!this.includes(item)) {
this.push(item);
return true;
}
return false;
}
/**
* Removes the first occurrence of [item] from this array.
*
* Returns `true` if [item] was in this array, `false` otherwise.
*/
remove(item: E): boolean {
const i = this.indexOf(item);
if (i >= 0) {
this.splice(i, 1);
return true;
}
return false;
}
}
You need to cite your sources
hi Sandy, sure hear are 2 sources:
-
(80% reduction in tech-debt from 5000 -> 1000 sloc) - commit to remove [class-based] Backbone, Handlebars, and other dependencies from fork of swagger-ui [1]. everything was eventually converted to use pure JSON data-structures, to make client <-> server JSON-serialization as painless as possible.
-
(90% reduction in tech-debt from 11,000 -> 1000 sloc) - 3 commits to remove class-dependencies from fork of nedb [2] and rely on pure JSON data-structures to allow easier import/export of persistent db-data
[1] github commit to remove class-dependencies from swagger-ui kaizhu256/node-swgg/commit/95d7c357df083cdf80ba595ae22bf76cb90c5a8c, kaizhu256/node-swgg/commit/95d7c357df083cdf80ba595ae22bf76cb90c5a8c
[2] github commits to remove class-dependencies from nedb kaizhu256/node-db-lite/compare/c6fd675ceaf1104d513183b622a3c6227133a1c7...a30e0b67485f16bed813015adbb6349c481b438d, kaizhu256/node-db-lite/compare/c6fd675ceaf1104d513183b622a3c6227133a1c7...a30e0b67485f16bed813015adbb6349c481b438d
kai zhu kaizhu256 at gmail.com
You need to cite your sources
hi Sandy, sure hear are 2 sources:
So you believe these 2 patches prove "most of the tech debt" across all JavaScript product development is due to this factor.
Huh.
Well, to each their own as far as how "proof" works. I prefer the classical definition. You know, the one where you use the actual population you're generalizing about.
With Array.prototype.remove(item)
Code is clean and concise.
if (array.remove(item)) {
// [item] was removed, do something here.
}
Without Array.prototype.remove(item)
Code is verbose or duplicated (everyone has their own version of removeArrayItem
) => WASTE.
// Using `indexOf` and `splice` directly.
const i = array.indexOf(item);
if (i >= 0) {
array.splice(i, 1);
// [item] was removed, do something here.
}
// Using a helper function.
if (removeArrayItem(array, item)) {
// [item] was removed, do something here.
}
With Set.prototype.add
returning a boolean
Code is clean and concise. Together, add
and delete
are a perfect pair.
if (set.add(item)) {
// [item] was added, do something here.
}
if (set.delete(item)) {
// [item] was deleted, do something here.
}
With Set.prototype.add
returning this
(almost, if not always, useless)
Code is verbose or duplicated (everyone has their own version of addItemToSetIfAbsent
) => WASTE.
// Using `has` and `add`.
if (!set.has(item)) {
set.add(item);
// [item] was added, do something here.
}
// Using `size` and `add`.
const oldSize = set.size;
set.add(item);
if (set.size > oldSize) {
// [item] was added, do something here.
}
// Using a helper function.
if (addItemToSetIfAbsent(set, item)) {
// [item] was added, do something here.
}
My point of view is JS and the web itself were not designed for app development. They have constantly been hacked to do so. If they had been, people wouldn’t have wasted too many resources creating workarounds (not solutions, not technologies) such as jQuery, Ember, Knockout, Polymer, Angular, React, Vue, GWT, CoffeeScript, TypeScript, and so on. No matter how we change JS, its inherent problems remain because we cannot introduce breaking changes. (Please do not argue about this as it will lead to nowhere)
Having said that, let's get back to the main point of this thread and this whole website: proposing ideas to make JS better (or less terrible). For me, changes such as adding Array.prototype.remove(item)
and modifying Set.prototype.add
to return a boolean may seem small but would have a huge impact when multiplied by the number of usages.
One small improvement repeated many times makes a big improvement.
Many small improvements combined make a big improvement.
TBH, for consistency sake with everything else in JS, I'd call the method
delete
, since it's the one that returns true
or false
in pretty much
every other case.
On Thu, Oct 11, 2018 at 10:59 AM Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
TBH, for consistency sake with everything else in JS, I'd call the method
delete
, since it's the one that returnstrue
orfalse
in pretty much every other case.
E.g., in Maps and Sets (and their weak cousins).
-- T.J. Crowder
but also delete obj[key]
... not identical as operation, but with a
boolean return.
That leaves room for eventual Array#drop(item) that returns the item after optionally mutating the array ^_^;;
I find myself frequently using this code to remove an item from an array:
var index = array.indexOf(item) if (index >= 0) array.splice(index, 1)
That's nice, but there's a couple problems:
My proposal is this: add a new
Array.prototype.remove(item)
method that basically does this:true
if any entries were removed,false
otherwise.And in the process, it should take care to do it in an optimized fashion, avoiding the overhead of allocation, excess reads/writes, and excess branching.
In case you're curious why I'm not using
Set
, here's why that's ill-equipped in many cases:It rarely comes up in day-to-day coding, but the difference is substantial when you're writing lower-level data structures and performance-sensitive code. It's probably down a similar vein to
Array.prototype.copyWithin
, which was pretty much designed formemcpy
ormemmove
.Here would be a polyfill of what I'd like to propose (it uses a minimum of writes, by design, and it does everything in one pass):
if (typeof Array.prototype.remove !== "function") { // Private utility const includes = (items, itemLength, entry) => { for (let j = 0; j < itemLength; j++) { if (items[j] === entry) return true } return false } Array.prototype.remove = function (item) { const oldLength = this.length const newLength = arguments.length // This is technically observably no different than const items = [...arguments] let newLength = 0 for (let i = 0; i < oldLength; i++) { const entry = this[i] if (includes(items, newLength, entry)) { let newLength = i++ while (i !== list.length) { const entry = this[i] if (!includes(items, newLength, entry)) { this[newLength++] = entry } i++ } this.length = newLength for (let i = newLength; i < oldLength; i++) delete this[i] return true } } return false } }
If the variadic version wasn't accepted, there's also the non-variadic version:
if (typeof Array.prototype.remove !== "function") { Array.prototype.remove = function (item) { const oldLength = this.length let newLength = 0 for (let i = 0; i < oldLength; i++) { const entry = this[i] if (entry === item) { let newLength = i++ while (i !== list.length) { const entry = this[i] if (entry !== item) this[newLength++] = entry i++ } this.length = newLength for (let i = newLength; i < oldLength; i++) delete this[i] return true } } return false } }
Isiah Meadows me at isiahmeadows.com
Looking for web consulting? Or a new website? Send me an email and we can get started. www.isiahmeadows.com