"Approx-equal" operator
On 17.12.2011 20:19, Lasse Reichstein wrote:
On Sat, Dec 17, 2011 at 12:12 PM, Dmitry Soshnikov <dmitry.soshnikov at gmail.com> wrote:
Hi,
Just recently was working with Ruby's code. And found useful again its (actually from Perl) "approximately equal" operator: =~
The operator is just a sugar for `test' method of RegExp.
if (/ecma/.test("ecmascript")) { ...
if ("ecmascript" ~= /ecma/) {
So you save three characters (one, if you had paren-free invocation). I personally don't find it more readable.
Yep! (and the argument about three characters isn't essential) ;)
It seems obvious and goes without saying that ~= is better than .test(). Perhaps it's just IMO though, I can't insist. I just found it very convenient in other languages.
And the other thing is "RegExp-substringing" with using bracket notation: string[RegExp, startIndex].
"ecmascript"[/ecma/, 0]; // "ecma"
That's already valid syntax (stupid code, but valid). The result is "e".
Oh, my bad. Yes, it's already valid. Well, then we may consider other options. Have to think.
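A small sketch of why the bracket form is already valid today: the comma inside the brackets is the ordinary comma operator, so the regexp is evaluated and thrown away and the expression is a plain index lookup. The variable names here are only for illustration.

```javascript
// Inside the brackets, `/ecma/, 0` is a comma expression: the regexp
// literal is evaluated and discarded, and the index expression is 0.
var viaComma = "ecmascript"[/ecma/, 0];
var viaIndex = "ecmascript"[0];

console.log(viaComma); // "e"
console.log(viaIndex); // "e"
```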
This is actually the sugar for:
"ecmascript".match(/ecma/)[0]; // "ecma"
You would want to handle the case where match returns null.
Add:
String.prototype.get = function(re, n) {
  var res = re.exec(this);
  return res ? res[n] : null;
};
and you have:
"ecmascript".get(/ecma/, 0) == "ecma"
(feel free to make it non-enumerable).
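One possible way to make such a helper non-enumerable, assuming an ES5 environment with Object.defineProperty. The name `get` is taken from the message above; the descriptor flags are one reasonable choice and not part of any proposal.

```javascript
// Hypothetical helper from the message above, installed as a
// non-enumerable property so it does not show up in for-in loops.
Object.defineProperty(String.prototype, "get", {
  value: function (re, n) {
    var res = re.exec(String(this));
    return res ? res[n] : null;   // null when the regexp does not match
  },
  enumerable: false,
  writable: true,
  configurable: true
});

console.log("ecmascript".get(/ecma/, 0)); // "ecma"
console.log("ecmascript".get(/xyz/, 0));  // null
```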
My fault, I didn't describe it clearly. In string[regexp, startIndex], the second value is exactly the start index: the position from which to start the search in the string. It's not related to the index of the `match' result. Anyway, this syntax is already taken.
E.g. a simple lexer:
var code = "var a = 10;";
var cursor = 0;
while (cursor < code.length) {
  var chunk = code[cursor .. -1]; // sugar for slice: code.slice(cursor, code.length);
  if (identifier = chunk[/\A([a-z]\w*)/, 1]) {
    // handle identifier token
  } else if (number = chunk[/\A([0-9]+)/, 1]) {
    // handle numbers
  }
...
Thoughts?
I don't think the advantage of slightly shorter code is worth the extra syntactic complexity from adding two new constructions.
I love these arguments ;) But in fact, of course it's worth it. Especially if the shortness makes it easier and more convenient.
By the way, is there any syntactic complexity in the "~=" operator?
Especially since they only work with RegExps. If it was more generic, in some way, it might be more reasonable to make operators for it.
And it's not even more readable (IMO) than:
var chunk = code.substring(cursor);
if (identifier = getMatch(chunk, /\A([a-z]\w*)/, 1)) {
  // handle identifier token
} else if (number = getMatch(chunk, /\A([0-9]+)/, 1)) {
  // handle numbers
}
Of course, since you're already used to it. If people already had such operators, nobody would write these function calls.
and for efficiency, I'd avoid the substring, and use single invocations of global regexps.
That's a separate topic; you may still cache the regexps while using the proposed operators.
This seems like something that can easily be abstracted into a helper function, and come out looking even better.
var code; // some string.
var cursor; // a position.
var idMatch = /[a-z]\w*/ig;
var numMatch = /[0-9]+/g;
// ...
function check(re, n) {
  n = n || 0;
  re.lastIndex = cursor;
  var res = re.exec(code);
  if (res) {
    cursor = re.lastIndex;
    return res[n];
  }
  return null;
}
// and inside some loop:
...
if (identifier = check(idMatch, 0)) {
  // handle identifier
} else if (numeral = check(numMatch, 0)) {
  // handle numeral
}
But this might come from me preferring to hide regexps away inside abstractions. Using a RegExp is an implementation detail - it's just one way to find something in a string, and there might be other, and you might want to change implementation over time. Hard-coding regexps into an interface gives them too much exposure, and making extra operators just for regexps also puts too much focus on them.
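For readers who want to run the `check` helper from the message above, here is a minimal sketch. The wrapper function `tokenize`, the token objects and the loop's exit condition are assumptions added for illustration; only `check`, `idMatch` and `numMatch` come from the thread.

```javascript
// A runnable sketch of the `check` helper driving a tiny tokenizer.
// `tokenize` and the token shape are assumptions made for illustration.
function tokenize(code) {
  var cursor = 0;
  var idMatch = /[a-z]\w*/ig;
  var numMatch = /[0-9]+/g;
  var tokens = [];

  function check(re, n) {
    n = n || 0;
    re.lastIndex = cursor;       // start searching at the cursor
    var res = re.exec(code);
    if (res) {
      cursor = re.lastIndex;     // advance past the match
      return res[n];
    }
    return null;
  }

  while (cursor < code.length) {
    var identifier, numeral;
    if ((identifier = check(idMatch, 0)) !== null) {
      tokens.push({ type: "identifier", value: identifier });
    } else if ((numeral = check(numMatch, 0)) !== null) {
      tokens.push({ type: "number", value: numeral });
    } else {
      break;                     // nothing left that either regexp matches
    }
  }
  return tokens;
}

console.log(tokenize("var a = 10;"));
```

One caveat with this approach: a global regexp searches forward from lastIndex instead of matching exactly at that position, so it silently skips characters it does not recognise, and an identifier further along can be picked up before an earlier number. For an input such as "10 a" the sketch returns only the identifier.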
Yes, this is also true, usually in such cases it's better to abstract things and provide some getters for this. But it was just an example to show the proposal, it's not the talk about lexer implementation.
If the language is built as a text processor, like Perl being heavily influenced by AWK, it makes sense to have RegExps as a primary and preferred feature. In ECMAScript, which has a more general-purpose design, I don't think they should be given preferred treatment. A class with methods is perfectly fine for what they do.
Perhaps, but I don't see why we can't have strong and powerful regexp constructions too.
If ECMAScript had raw strings, the RegExp literal wouldn't even be necessary.
How so? Can you explain?
Dmitry.
On 17.12.2011 21:22, Sam Ruby wrote:
On Sat, Dec 17, 2011 at 6:12 AM, Dmitry Soshnikov <dmitry.soshnikov at gmail.com> wrote:
And the other thing is "RegExp-substringing" with using bracket notation: string[RegExp, startIndex].
"ecmascript"[/ecma/, 0]; // "ecma"
This is actually the sugar for:
"ecmascript".match(/ecma/)[0]; // "ecma"
In Ruby it is more than just sugar. The results can be used on the left hand side of an assignment statement.
string='The quick brown fox jumped'
string[/\s\w(\w)\w\s/,1] = 'O'
puts string
produces "The quick brown fOx jumped"
Yes, this is because strings in Ruby are mutable. But not in ES.
Dmitry.
2011/12/17 Dmitry Soshnikov <dmitry.soshnikov at gmail.com>:
Hi,
Just recently was working with Ruby's code. And found useful again its (actually from Perl) "approximately equal" operator: =~
Perl's =~ operator is more comparable to String.prototype.match than RegExp.prototype.test.
Perl operators can be used in either scalar or list contexts unlike ecmascript (see perldoc for wantarray) and it's true that =~ produces a boolean when used in a scalar context, but since match returns null on zero matches, it can still be used in conditions. When used in a non-scalar context, =~ produces a list of the matches:
perl -e 'my $s = "foo"; print join ",", ($s =~ /o/g)'
prints
o,o
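For comparison on the ECMAScript side, a short sketch of the String.prototype.match behaviour referred to above: with the g flag it returns every match, and it returns null when nothing matches, which is why it still works in a condition. The variable names are illustrative only.

```javascript
var all = "foo".match(/o/g);     // every match, like Perl's list context
var none = "foo".match(/x/);     // null when nothing matches

console.log(all.join(","));      // "o,o"
console.log(none);               // null

if ("foo".match(/o/)) {
  // A match array is truthy and null is falsy, so match works in conditions.
  console.log("matched");
}
```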
Even ignoring order of evaluation, desugaring a ~= b to b.test(a) would cause the seemingly straightforward
var myRegex = /foo/g; // Used for a global replace in other code.
if ("foo" ~= myRegex) { alert(1); }
if ("foo" ~= myRegex) { alert(2); }
if ("foo" ~= myRegex) { alert(3); }
to alert 1 and 3 only.
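The same effect can be reproduced with today's syntax by calling test() directly, which is what the naive desugaring would amount to. This is only a demonstration of the existing lastIndex behaviour of global regexps; the `results` array is an illustrative name.

```javascript
var myRegex = /foo/g;
var results = [];

for (var i = 0; i < 3; i++) {
  // With the g flag, test() starts at myRegex.lastIndex and updates it:
  // 3 after a successful match, back to 0 after a failed one.
  results.push(myRegex.test("foo"));
}

console.log(results); // [ true, false, true ]
```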
On 18.12.2011 23:18, Mike Samuel wrote:
2011/12/17 Dmitry Soshnikov<dmitry.soshnikov at gmail.com>:
Hi,
Just recently was working with Ruby's code. And found useful again its (actually from Perl) "approximately equal" operator: =~
Perl's =~ operator is more comparable to String.prototype.match than RegExp.prototype.test.
Perl operators can be used in either scalar or list contexts unlike ecmascript (see perldoc for wantarray) and it's true that =~ produces a boolean when used in a scalar context, but since match returns null on zero matches, it can still be used in conditions. When used in a non-scalar context, =~ produces a list of the matches:
perl -e 'my $s = "foo"; print join ",", ($s =~ /o/g)'
prints
o,o
Perhaps, I don't know Perl enough. The talk isn't about Perl though. We may adjust the semantics for ES as we wish, not Perl.
Even ignoring order of evaluation, desugaring a ~= b to b.test(a) would cause the seemingly straightforward
var myRegex = /foo/g; // Used for a global replace in other code.
if ("foo" ~= myRegex) { alert(1); }
if ("foo" ~= myRegex) { alert(2); }
if ("foo" ~= myRegex) { alert(3); }
to alert 1 and 3 only.
Because of lastIndex, I understand. Though nobody says we have to directly desugar it to .test(...) method. Obviously, it should handle the case correctly.
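One way an operator could "handle the case correctly" is to ignore and preserve the regexp's lastIndex. The following is only a sketch of that idea as an ordinary function; `approxEqual` is a hypothetical name standing in for `str ~= re`, and nothing in the thread specifies these semantics.

```javascript
// `approxEqual` is a hypothetical stand-in for `str ~= re`.
function approxEqual(str, re) {
  var saved = re.lastIndex;
  re.lastIndex = 0;            // always search from the start
  var matched = re.test(str);
  re.lastIndex = saved;        // leave the caller's regexp state untouched
  return matched;
}

var myRegex = /foo/g;
console.log(approxEqual("foo", myRegex)); // true
console.log(approxEqual("foo", myRegex)); // true
console.log(approxEqual("foo", myRegex)); // true
console.log(myRegex.lastIndex);           // 0
```

Restoring lastIndex means code elsewhere that relies on the global regexp's position is not disturbed by the check.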
Dmitry.
2011/12/18 Dmitry Soshnikov <dmitry.soshnikov at gmail.com>:
On 18.12.2011 23:18, Mike Samuel wrote:
2011/12/17 Dmitry Soshnikov<dmitry.soshnikov at gmail.com>:
Hi,
Just recently was working with Ruby's code. And found useful again its (actually from Perl) "approximately equal" operator: =~
Perl's =~ operator is more comparable to String.prototype.match than RegExp.prototype.test.
Perl operators can be used in either scalar or list contexts unlike ecmascript (see perldoc for wantarray) and it's true that =~ produces a boolean when used in a scalar context, but since match returns null on zero matches, it can still be used in conditions. When used in a non-scalar context, =~ produces a list of the matches:
perl -e 'my $s = "foo"; print join ",", ($s =~ /o/g)'
prints
o,o
Perhaps, I don't know Perl enough. The talk isn't about Perl though. We may adjust the semantics for ES as we wish, not Perl.
I mistakenly assumed that you were claiming that a feature of your proposal was developer familiarity with similar syntax in other languages and that therefore it was worth understanding the semantics of those features in those other languages to avoid developer confusion.
Even ignoring order of evaluation, desugaring a ~= b to b.test(a) would cause the seemingly straightforward
var myRegex = /foo/g; // Used for a global replace in other code.
if ("foo" ~= myRegex) { alert(1); }
if ("foo" ~= myRegex) { alert(2); }
if ("foo" ~= myRegex) { alert(3); }
to alert 1 and 3 only.
Because of lastIndex, I understand. Though nobody says we have to directly desugar it to .test(...) method. Obviously, it should handle the case correctly.
I know that you're just testing the waters before writing a strawman but I can't contribute except by weighing options. In my opinion, as written, a desugaring based on .test is not the way to go.
<bikeshedding>I don't find the syntactic sugar that sweet. ~= requires me to use the shift-key, so with paren-less calls, match/test/exec is clearer and not much longer.</bikeshedding>
On 19.12.2011 2:26, Mike Samuel wrote:
2011/12/18 Dmitry Soshnikov<dmitry.soshnikov at gmail.com>:
On 18.12.2011 23:18, Mike Samuel wrote:
2011/12/17 Dmitry Soshnikov<dmitry.soshnikov at gmail.com>:
Hi,
Just recently was working with Ruby's code. And found useful again its (actually from Perl) "approximately equal" operator: =~
Perl's =~ operator is more comparable to String.prototype.match than RegExp.prototype.test.
Perl operators can be used in either scalar or list contexts unlike ecmascript (see perldoc for wantarray) and it's true that =~ produces a boolean when used in a scalar context, but since match returns null on zero matches, it can still be used in conditions. When used in a non-scalar context, =~ produces a list of the matches:
perl -e 'my $s = "foo"; print join ",", ($s =~ /o/g)'
prints
o,o
Perhaps, I don't know Perl enough. The talk isn't about Perl though. We may adjust the semantics for ES as we wish, not Perl.
I mistakenly assumed that you were claiming that a feature of your proposal was developer familiarity with similar syntax in other languages and that therefore it was worth understanding the semantics of those features in those other languages to avoid developer confusion.
No, no, it's my fault for not explaining it clearly, sorry.
Even ignoring order of evaluation, desugaring a ~= b to b.test(a) would cause the seemingly straightforward
var myRegex = /foo/g; // Used for a global replace in other code.
if ("foo" ~= myRegex) { alert(1); }
if ("foo" ~= myRegex) { alert(2); }
if ("foo" ~= myRegex) { alert(3); }
to alert 1 and 3 only.
Because of lastIndex, I understand. Though nobody says we have to directly desugar it to .test(...) method. Obviously, it should handle the case correctly.
I know that you're just testing the waters before writing a strawman but I can't contribute except by weighing options. In my opinion, as written, a desugaring based on .test is not the way to go.
I see, and that's fair, since I did mention `test'. I think it's also OK to "test the waters" in such a case, since the feature is new for JS.
<bikeshedding>I don't find the syntactic sugar that sweet. ~= requires me to use the shift-key, so with paren-less calls, match/test/exec is clearer and not much longer.</bikeshedding>
Have we already planned paren-free calls? Seems I missed approved strawman. If so, then yes, we have to think about whether we need ~=. After all, yes, it requires pressing Shift, and the keys are on opposite sides of the keyboard.
Dmitry.
On 19/12/2011, at 10:10, Dmitry Soshnikov wrote:
Have we already planned paren-free calls? Seems I missed approved strawman.
Only for block lambdas, if I'm not mistaken:
<quote>
- Should paren free calls be introduced?
I'm not proposing this in general, and I do not believe anyone else on TC39 will.
/be
</quote>
There's this too: esdiscuss/2011-May/014587
<quote> "You're ignoring the goal of providing paren-free block-argument call syntax for control abstractions that look like built-in control-flow statements." </quote>
The thread was "block lambda revival": esdiscuss/2011-May/thread.html#14563
Thanks Jorge, yes, that's how I remembered the conclusion too.
Dmitry.
So, so? Just to be clear and to close the topic -- do we need it, or does no answer mean "no"? Don't tell me later (in the future, in a year) that I didn't propose this idea before ;D
Dmitry.
Just recently was working with Ruby's code. And found useful again its (actually from Perl) "approximately equal" operator: =~
The operator is just a sugar for `test' method of RegExp.
if (/ecma/.test("ecmascript")) { console.log("ECMAScript"); }
is sugared into:
if ("ecmascript" =~ /ecma/) { console.log("ECMAScript"); }
Unfortunately, we can't use the same operator "=~", since this pair of characters is already taken in ES3. But we may swap them and use "~=", which reads even more like "approximately equal to":
if ("ecmascript" ~= /ecma/) { console.log("ECMAScript"); }
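A short sketch of why the Ruby/Perl spelling `=~` is unavailable while the swapped `~=` is free: the first already parses as an assignment of a bitwise NOT, and the second is currently a syntax error. The use of `new Function` to probe the parser and the variable names are just illustrative choices.

```javascript
var x;
x =~ 5;          // parsed as x = (~5): assignment of a bitwise NOT
console.log(x);  // -6

// `a ~= b` has no meaning in the current grammar, so compiling it throws.
var swappedFormParses = true;
try {
  new Function("a", "b", "return a ~= b;");
} catch (e) {
  swappedFormParses = false;
}
console.log(swappedFormParses); // false
```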
And the other thing is "RegExp-substringing" with using bracket notation: string[RegExp, startIndex].
"ecmascript"[/ecma/, 0]; // "ecma"
This is actually the sugar for:
"ecmascript".match(/ecma/)[0]; // "ecma"
E.g. a simple lexer:
var code = "var a = 10;";
var cursor = 0;
while (cursor < code.length) {
  var chunk = code[cursor .. -1]; // sugar for slice: code.slice(cursor, code.length);
  ...
}
Thoughts?