RegExp.prototype.count
If performance is an issue, regular expressions are likely to be too slow to begin with. But you could always do this to count the number of lines in a particular string:
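var count = 0
var re = /\n|\r\n?/g
// with the g flag, each test() call resumes from re.lastIndex,
// so the loop visits every line terminator exactly once
while (re.test(str)) count++
console.log(count)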
Given it's already this easy to iterate something with a regexp, I'm not convinced it's necessary to add this property/method.
benchmarked @isiah’s while-loop test-case vs str.split vs str.replace for regexp counting on jsperf.com [1], and the results were surprising (for me).
benchmarks using a 1mb random ascii-string, from fastest to slowest:
- (fastest - 1,700 runs/sec) regexp-counting with largeCode.split(/\n/).length - 1
- (40% slower - 1,000 runs/sec) regexp-counting with while-loop (/\n/g)
- (60% slower - 700 runs/sec) regexp-counting with largeCode.replace((/[^\n]+/g), "").length
looks like the go-to design-pattern for counting-regexp is str.split(<regexp>).length - 1
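for reference, the three approaches compared above might look roughly like this when written out as functions. this is only a sketch: the function names are made up for illustration, and the pattern is fixed to /\n/ as in the benchmark.

```javascript
// Three ways to count regexp matches (here: newlines).
// All function names are illustrative, not part of any API.

// split: n matches cut the string into n + 1 pieces
function countBySplit(str) {
  return str.split(/\n/).length - 1;
}

// while-loop: with the g flag, test() resumes from lastIndex on each call
function countByWhileLoop(str) {
  const re = /\n/g;
  let count = 0;
  while (re.test(str)) count++;
  return count;
}

// replace: delete every run of non-newline characters, then measure the rest
function countByReplace(str) {
  return str.replace(/[^\n]+/g, "").length;
}

const sample = "a\nb\n\nc";
console.log(countBySplit(sample), countByWhileLoop(sample), countByReplace(sample)); // 3 3 3
```

one caveat with the split version: split() also puts the contents of capturing groups into its result array, so the length - 1 trick only holds for patterns without capturing groups.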
[1] regexp counting 2 jsperf.com/regexp
Nit: you should use .split(/\n/g) to get all parts.
I like the benchmarks here. That's much appreciated, and after further investigation, I found a giant WTF: jsperf.com/regexp-counting-2/8
TL;DR: for string character counting, prefer indexOf.
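A minimal sketch of what indexOf-based counting could look like, assuming non-overlapping matches are wanted. The name countSubstring is hypothetical, and this is not necessarily the exact code from the JSPerf test case.

```javascript
// Count non-overlapping occurrences of a plain substring using indexOf.
// countSubstring is a hypothetical helper name.
function countSubstring(str, needle) {
  // An empty needle would never advance the search position, so bail out early.
  if (needle.length === 0) return 0;
  let count = 0;
  let pos = str.indexOf(needle);
  while (pos !== -1) {
    count++;
    // Continue searching just past the end of the previous match.
    pos = str.indexOf(needle, pos + needle.length);
  }
  return count;
}

console.log(countSubstring("a\nb\nc\n", "\n")); // 3
```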
For similar reasons to that JSPerf thing, I'd like it to be on the String prototype rather than the RegExp prototype, as in str.count(/\n/).
Isiah Meadows contact at isiahmeadows.com, www.isiahmeadows.com
+1 for string.count
i don’t think the g-flag is necessary in str.split, so the original performance claims are still valid:
- for counting regexp - use split + length
- for counting substring - use while + indexOf
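those two rules of thumb could be folded into one helper, roughly in the spirit of the str.count idea. this is only a sketch: countMatches is a hypothetical name, not a proposed or existing api, and the regexp branch assumes a pattern with no capturing groups that never matches the empty string.

```javascript
// Hypothetical helper combining both recommendations; not an existing API.
function countMatches(str, pattern) {
  if (pattern instanceof RegExp) {
    // regexp: split + length
    // (assumes no capturing groups and no empty-string matches)
    return str.split(pattern).length - 1;
  }
  // substring: while-style loop over indexOf, counting non-overlapping hits
  if (pattern === "") return 0;
  let count = 0;
  for (
    let pos = str.indexOf(pattern);
    pos !== -1;
    pos = str.indexOf(pattern, pos + pattern.length)
  ) {
    count++;
  }
  return count;
}

console.log(countMatches("a\nb\nc", /\n/)); // 2
console.log(countMatches("abcabc", "bc")); // 2
```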
a common use-case i have is counting newlines in largish (> 200kb) embedded-js files, like this real-world example [1]. ultimately meant for line-number-preservation purposes in auto-lint/auto-prettify tasks (which have been getting slower due to complexity).
would a new RegExp count-method like (/\n/g).count(largeCode) be significantly more efficient than existing largeCode.split("\n").length - 1 or largeCode.replace((/[^\n]+/g), "").length?
-kai
[1] calculating and reproducing line-number offsets when linting/autofixing files
https://github.com/kaizhu256/node-utility2/blob/2018.12.30/lib.jslint.js#L7377
https://github.com/kaizhu256/node-utility2/blob/2018.12.30/lib.jslint.js#L7586