String.prototype.split fixed fields extension
This seems special purpose enough (as you say, for legacy formats) and easy enough to implement, that it probably doesn't warrant being included in the language.
I just spent some time researching TLE and it appears that no data value will ever have a space within the value itself - making a space the delimiter.
Anyway, most languages include a string chunk function that returns an array.
(In the case of TLE the fields themselves don't contain space but some fields run into each other, e.g. "Mean Motion" and "Revolution Number", and so do the two "Checksums" -- if wikipedia is to be believed.)
This additional 'split' functionality is a sort of multi-slice, and String.prototype.slice is generally useful. It is easy enough to implement, but then so is isNaN, isFinite, Date.now, and Array.isArray (a careful toString == "[object Array]"), etc, etc.
Is it useful enough, worth rounding out the 'split' function's capabilities?
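To make the "multi-slice" idea concrete, here is a minimal sketch that uses only String.prototype.slice on the run-together tail of TLE line 2 from the ISS example quoted in this thread. The field widths (11, 5, 1) are an assumption taken from the Wikipedia description of the TLE layout, not something stated in this thread, and the variable names are purely illustrative.

```javascript
// Tail of TLE line 2 from the ISS example: three fields with no separators.
const tail = "15.72125391563537";

// Assumed widths (Wikipedia TLE layout): Mean Motion 11, Revolution Number 5, Checksum 1.
const meanMotion = tail.slice(0, 11);
const revNumber = tail.slice(11, 16);
const checksum = tail.slice(16, 17);

// Splitting on whitespace cannot separate them: the whole tail stays one field.
const bySpace = tail.split(/\s+/);

console.log(meanMotion, revNumber, checksum, bySpace.length);
```

This is the "long list of substrings" style that the proposal aims to shorten; it works today, but every offset has to be computed by hand.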
From: Rick Waldron
I just spent some time researching TLE and it appears that no data value will ever have a space within the value itself - making a space the delimiter.
Anyway, most languages include a string chunk function that returns an array.
-Rick
On Saturday, March 24, 2012 at 1:56 PM, Russell Leggett wrote:
This seems special purpose enough (as you say, for legacy formats) and easy enough to implement, that it probably doesn't warrant being included in the language.
- Russ
On Mar 23, 2012, at 5:36 PM, "Roger Andrews" <roger.andrews at mail104.co.uk> wrote:
String.prototype.split is good for cutting records into fields based on a
delimiter string or regexp. E.g.
rec.split( ',' ) // split CSV record (no commas in fields)
rec.split( /\s+/ ) // split into whitespace-separated fields
How about extending 'split', or inventing a new method 'splitlen', which
splits a record into defined-length fields? This simplifies a long list of
'substring's.
Old data formats invented in the days of punch-cards are still around. For
example NASA's two-line element set
(http://en.wikipedia.org/wiki/Two-line_element_set)
which records the orbital elements of Earth satellites.
E.g. here is the TLE for the International Space Station:
ISS (ZARYA)
1 25544U 98067A 08264.51782528 -.00002182 00000-0 -11606-4 0 2927
2 25544 51.6416 247.4627 0006703 130.5360 325.0288 15.72125391563537
Proposed design:
split( len1, len2, len3, len4, ..... ) // returns array of fields
where each numeric length argument either
(1) captures a field of the given length if positive, or
(2) ignores a field of the absolute given length if negative.
The special argument "*" could repeat the previous argument to the end of
the record.
Examples:
// chop into 5-char fields
rec.split( 5, "*" )
// capture a 1-char and a 5-char field and all chars from index 17 onward
rec.split( -7, 1, 5, -4, Infinity )
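The proposed semantics can be sketched in user code roughly as follows. This is only one possible reading of the proposal: the function name splitFixed is hypothetical, and the handling of edge cases (a short record, "*" with no usable previous length) is an assumption, since the proposal does not spell them out.

```javascript
// Hypothetical helper, not part of any standard or of the proposal itself.
// Positive length: capture a field. Negative length: skip |length| chars.
// "*": repeat the previous length until the record is exhausted.
function splitFixed(rec, ...lens) {
  const fields = [];
  let pos = 0;
  let prev; // last numeric length seen, reused by "*"

  // Consume one field of |len| chars; capture it only when len is positive.
  const take = (len) => {
    const end = Math.min(rec.length, pos + Math.abs(len));
    if (len > 0) fields.push(rec.slice(pos, end));
    pos = end;
  };

  for (const len of lens) {
    if (pos >= rec.length) break; // assumption: stop quietly on a short record
    if (len === "*") {
      if (prev === undefined || prev === 0) {
        throw new TypeError('"*" needs a preceding non-zero length');
      }
      while (pos < rec.length) take(prev);
    } else if (typeof len === "number" && !Number.isNaN(len)) {
      take(len);
      prev = len;
    } else {
      throw new TypeError("lengths must be numbers or \"*\"");
    }
  }
  return fields;
}

const chopped = splitFixed("abcdefghijkl", 5, "*");
const picked = splitFixed("0123456789ABCDEFGHIJKL", -7, 1, 5, -4, Infinity);
console.log(chopped, picked);
```

With these inputs, chopped is ["abcde", "fghij", "kl"] and picked is ["7", "89ABC", "HIJKL"], which matches the two examples above under this reading.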
_______________________________________________
es-discuss mailing list
es-discuss at mozilla.org
https://mail.mozilla.org/listinfo/es-discuss
I think being able to chop a string into fixed-length segments would be useful and general purpose enough to consider if there wasn’t already a simple way to do so. E.g., do this for length 4:
str.match(/[\s\S]{1,4}/g)
Note that the quantifier is greedy so it favors length 4, but allowing 1+ picks up the slack at the end.
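For an arbitrary segment length, the same pattern can be built at run time. The sketch below wraps it in a small function; the name chunk and the choice to return an empty array for an empty string are assumptions made for illustration.

```javascript
// Hypothetical helper: split str into pieces of at most `size` characters.
function chunk(str, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  // [\s\S] matches any character, including line terminators.
  const re = new RegExp("[\\s\\S]{1," + size + "}", "g");
  // match() returns null when nothing matches (e.g. the empty string).
  return str.match(re) || [];
}

const even = chunk("abcdefgh", 4);
const ragged = chunk("abcdefghij", 4);
const withNewline = chunk("ab\ncd", 2);
console.log(even, ragged, withNewline);
```

Here ragged ends with the short piece "ij", which is the slack picked up by the lower bound of 1. Note that lengths are counted in UTF-16 code units, so a surrogate pair can be cut in half at a chunk boundary.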
As for being able to specify any number of lengths and segments to skip, I think Russell Leggett was correct in saying that it’s too special purpose to justify.
-- Steven Levithan
From: Roger Andrews Sent: Sunday, March 25, 2012 7:52 PM To: Rick Waldron ; Russell Leggett Cc: es-discuss at mozilla.org Subject: Re: String.prototype.split fixed fields extension
(In the case of TLE the fields themselves don't contain space but some fields run into each other, e.g. "Mean Motion" and "Revolution Number", and so do the two "Checksums" -- if wikipedia is to be believed.)
This additional 'split' functionality is a sort of multi-slice, and String.prototype.slice is generally useful. It is easy enough to implement, but then so is isNaN, isFinite, Date.now, and Array.isArray (a careful toString == "[object Array]"), etc, etc.
Is it useful enough, worth rounding out the 'split' function's capabilities?
Thanks for that. I had neglected to think of the power of RegExps to capture chunks of a string arbitrarily.
BTW Is there a reason for using [\s\S] instead of [^] to mean: any char including CR & LF? ([^] being the inverse of the empty set, and 'dot' not matching CR & LF, of course.)
From: Steven Levithan Sent: Monday, March 26, 2012 6:45 PM To: Roger Andrews ; Rick Waldron ; Russell Leggett Cc: es-discuss at mozilla.org Subject: Re: String.prototype.split fixed fields extension
I think being able to chop a string into fixed-length segments would be useful and general purpose enough to consider if there wasn’t already a simple way to do so. E.g., do this for length 4:
str.match(/[\s\S]{1,4}/g)
Note that the quantifier is greedy so it favors length 4, but allowing 1+ picks up the slack at the end.
As for being able to specify any number of lengths and segments to skip, I think Russell Leggett was correct in saying that it’s too special purpose to justify.
-- Steven Levithan
Yes. In at least IE8 and earlier, and in older versions of Safari, [^] is an unclosed character class (and therefore a syntax error), since those engines allow an unescaped ] as the first character in a character class. I.e., there [^]] is equivalent to [^\]]. This followed longstanding regex tradition, which ECMAScript either forgot to include or intentionally broke from. Acid3 included an explicit test that [] is an empty but complete character class, and since then all modern browsers have adopted this ES3 rule.
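The difference can be checked in a current engine with a short sketch like the one below. It assumes an engine that follows the ES3 rule for [^] and that supports the `s` (dotAll) flag from ES2018; neither holds for the older browsers mentioned above, where [\s\S] remains the portable spelling.

```javascript
const text = "ab\ncd";

// Without the `s` flag, dot skips line terminators.
const dot = text.match(/./g);
// The portable "any character" class.
const portable = text.match(/[\s\S]/g);
// Negated empty class: any character, under the ES3 rule.
const negatedEmpty = text.match(/[^]/g);
// ES2018 dotAll flag makes dot match line terminators too.
const dotAll = text.match(/./gs);

// Under the ES3 rule, [^]] is "any character" followed by a literal "]",
// whereas the traditional reading would be "any character except ]".
const es3Reading = "a]b".match(/[^]]/g);
const traditionalReading = "a]b".match(/[^\]]/g);

console.log(dot.length, portable.length, negatedEmpty.length, dotAll.length);
console.log(es3Reading, traditionalReading);
```

In a conforming engine the first line prints 4 5 5 5, and the last two results are ["a]"] and ["a", "b"] respectively.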
-- Steven Levithan
From: Roger Andrews Sent: Monday, March 26, 2012 7:13 PM To: Steven Levithan ; Rick Waldron ; Russell Leggett Cc: es-discuss at mozilla.org Subject: Re: String.prototype.split fixed fields extension
Thanks for that. I had neglected to think of the power of RegExps to capture chunks of a string arbitrarily.
BTW Is there a reason for using [\s\S] instead of [^] to mean: any char including CR & LF? ([^] being the inverse of the empty set, and 'dot' not matching CR & LF, of course.)