RegExp.prototype.count
If performance is an issue, regular expressions are likely to be too slow to begin with. But you could always do this to count the number of lines in a particular string:
var count = 0
var re = /\n|\r\n?/g
while (re.test(str)) count++
console.log(count)
Given it's already this easy to iterate something with a regexp, I'm not convinced it's necessary to add this property/method.
benchmarked @isiah’s while-loop test-case vs str.split vs str.replace for regexp counting on jsperf.com [1], and the results were surprising (for me).
benchmarks using 1mb random ascii-string from fastest to slowest.
- (fastest - 1,700 runs/sec) regexp-counting with largeCode.split(/\n/).length - 1
- (40% slower - 1,000 runs/sec) regexp-counting with while-loop (/\n/g)
- (60% slower - 700 runs/sec) regexp-counting with largeCode.replace((/[^\n]+/g), "").length
looks like the go-to design-pattern for counting-regexp is str.split(<regexp>).length - 1
[1] regexp counting 2 jsperf.com/regexp
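As a sanity check on the three approaches compared above, here is a minimal sketch showing they agree on a small input. The helper names and the sample string are made up for illustration; they are not taken from the jsperf test-case, and this says nothing about relative speed.

```javascript
// Hypothetical helper names; each counts "\n" occurrences in a string.
function countBySplit(str) {
  // split yields one more piece than there are separators
  return str.split(/\n/).length - 1
}

function countByWhileLoop(str) {
  var count = 0
  var re = /\n/g // g flag is required so lastIndex advances between test() calls
  while (re.test(str)) count++
  return count
}

function countByReplace(str) {
  // delete every run of non-newline characters, leaving only the newlines
  return str.replace(/[^\n]+/g, "").length
}

var sample = "a\nbb\n\nccc\n"
console.log(countBySplit(sample), countByWhileLoop(sample), countByReplace(sample)) // 4 4 4
```

Note the while-loop variant silently loops forever if the g flag is dropped, since test() then always restarts from index 0.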
Nit: you should use .split(/\n/g) to get all parts.
I like the benchmarks here. That's much appreciated, and after further investigation, I found a giant WTF: jsperf.com/regexp-counting-2/8
TL;DR: for string character counting, prefer indexOf.
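For reference, a rough sketch of what indexOf-based counting can look like. The function name countSubstring is hypothetical, and this is not necessarily the exact code in the jsperf revision linked above.

```javascript
// Hypothetical helper: count non-overlapping occurrences of a plain substring.
function countSubstring(str, needle) {
  if (needle.length === 0) return 0 // guard: "" would otherwise never advance
  var count = 0
  var pos = str.indexOf(needle)
  while (pos !== -1) {
    count++
    // resume the search just past the match that was found
    pos = str.indexOf(needle, pos + needle.length)
  }
  return count
}

console.log(countSubstring("a\nb\nc", "\n")) // 2
```

This only works for literal substrings; anything that needs alternation or character classes still needs a regexp.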
For similar reasons to that JSPerf thing, I'd like it to be on the String prototype rather than the RegExp prototype, as in str.count(/\n/).
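To make the idea concrete, here is one way such a method could behave, written as a standalone function instead of touching String.prototype. This is only a sketch under my own assumptions (the name count, adding the g flag implicitly, and how zero-length matches are handled are all guesses, not anything specified).

```javascript
// Hypothetical stand-in for the proposed str.count(re); not a real built-in.
function count(str, re) {
  // Copy the regexp so the caller's lastIndex is untouched, adding "g" if absent.
  var flags = re.flags.indexOf("g") === -1 ? re.flags + "g" : re.flags
  var copy = new RegExp(re.source, flags)
  var n = 0
  var match
  while ((match = copy.exec(str)) !== null) {
    n++
    // Step past zero-length matches, otherwise exec would loop forever.
    if (match[0].length === 0) copy.lastIndex++
  }
  return n
}

console.log(count("a\nb\r\nc", /\n/)) // 2
```

A native version could presumably skip allocating match objects altogether, which is where any performance win over the userland loop would come from.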
Isiah Meadows contact at isiahmeadows.com, www.isiahmeadows.com
+1 for string.count
i don’t think the g-flag is necessary in str.split, so the original performance claims are still valid:
- for counting regexp - use split + length
- for counting substring - use while + indexOf
a common use-case i have is counting newlines in largish (> 200kb) embedded-js files, like this real-world example [1]. ultimately meant for line-number-preservation purposes in auto-lint/auto-prettify tasks (which have been getting slower due to complexity).
would a new RegExp count-method like (/\n/g).count(largeCode) be significantly more efficient than existing largeCode.split("\n").length - 1 or largeCode.replace((/[^\n]+/g), "").length?
-kai
[1] calculating and reproducing line-number offsets when linting/autofixing files
kaizhu256/node-utility2/blob/2018.12.30/lib.jslint.js#L7377
kaizhu256/node-utility2/blob/2018.12.30/lib.jslint.js#L7586