Octal escape sequences in string and regexp literals

# Claude Pache (10 years ago)

Current web browsers implement octal escape sequences of the form \52, representing the character with code 0o52, in string literals in sloppy mode only, and in regexps (provided there are fewer than 52 capturing groups) in both sloppy and strict mode.

(To avoid confusion: I am not concerned with legacy octal integer literals of the form 052, representing the number 0o52.)
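The behaviour described above can be checked with a minimal sketch. The helper `evalInMode` is hypothetical (it is not from this thread); it relies on the fact that functions built with the `Function` constructor are sloppy unless their body starts with a `"use strict"` directive. It assumes an engine that implements the legacy (Annex B) regexp syntax, as web browsers and Node.js do.

```javascript
// Hypothetical helper: evaluates an expression, given as source text, inside a
// fresh function body that is either sloppy or strict.
function evalInMode(expressionSource, strict) {
  const prologue = strict ? '"use strict"; ' : '';
  try {
    return { value: new Function(prologue + 'return ' + expressionSource + ';')() };
  } catch (err) {
    return { error: err.name };
  }
}

// Source text "\52": an octal escape inside a string literal.
const stringSloppy = evalInMode('"\\52"', false); // { value: '*' }
const stringStrict = evalInMode('"\\52"', true);  // { error: 'SyntaxError' }

// Source text /^\52$/: an octal escape inside a regexp with no capturing groups.
const regexpSloppy = evalInMode('/^\\52$/.test("*")', false); // { value: true }
const regexpStrict = evalInMode('/^\\52$/.test("*")', true);  // { value: true }

// With enough capturing groups, the same syntax is a backreference instead.
const backreference = evalInMode('/^(a)\\1$/.test("aa")', false);   // \1 refers to group 1
const octalInRegexp = evalInMode('/^\\1$/.test("\\x01")', false);   // no group: \1 is U+0001

console.log(stringSloppy, stringStrict, regexpSloppy, regexpStrict);
```

Only the string literal is sensitive to strict mode here; the regexp literal is accepted in both modes.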

As far as I can infer from the es-discuss archives, these escape sequences were an undesired feature that was not standardised (in ES3), but that everyone implemented and that was needed for web compatibility. So it was decided to exclude them from ES5 strict mode, and therefore from Harmony, which was then expected to be built on strict mode, since 1JS had not been invented yet.

Now times have changed and, for the sake of 1JS, new features are implemented in both sloppy and strict mode; in other words, the difference between the two modes is kept as small as possible.

From that new perspective, is there still a strong enough reason to exclude these escape sequences from string literals in strict mode, one that would justify the discrepancy between strict and sloppy modes? And if so, what should be done about regexps?

# Caitlin Potter (10 years ago)

I think there are a few reasons why you wouldn't want these.

First and foremost, octal escapes (\nnn) are just an alternative to hex escapes (\xnn). Most software developers spend a lot more time dealing with hex when it comes to byte values, and very little time with octal literals outside of things like Unix file permissions. The most useful octal literal would be \0, and this is already explicitly permitted in strict mode. So I don't think there's any really compelling use case for the alternative representation of byte values.

To summarize: supporting these in strict mode would add another way to accomplish the same task (which grows the language for no real reason and with no benefit), does not make string literals easier to read and understand, and does not enable software developers to perform any compelling task that is not more easily accomplished using hex literals. Finally, the most common use case for this feature is already supported in strict mode.
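As a rough illustration of that point, the sketch below evaluates the hex spelling and \0 in strict mode, and compares the octal and hex spellings in sloppy mode. The helper `readStrict` is hypothetical; it uses the `Function` constructor so that the snippet behaves the same whether the surrounding code is strict or not.

```javascript
// Hypothetical helper: evaluates an expression, given as source text, in strict mode.
function readStrict(expressionSource) {
  return new Function('"use strict"; return ' + expressionSource + ';')();
}

// The hex escape \x2a is accepted in strict mode (0x2a = 0o52 = 42, i.e. "*").
const hexStar = readStrict('"\\x2a"');

// \0 (not followed by a digit) is accepted in strict mode and denotes U+0000.
const nul = readStrict('"\\0"');

// In sloppy mode, the octal and hex spellings name the same character.
const sameCharacter = new Function('return "\\52" === "\\x2a";')();

console.log(hexStar, nul.length, sameCharacter);
```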

More important, octal escape sequences are a bit liberal, in that they can be of several lengths, with a pretty wide range of delimiters. This, I think, results in many cases where octal escape sequences are used by accident, rather than intentionally. It's a footgun, and ideally that footgun should not be there.

I feel like the "refactoring pain" argument is not very compelling, because I am not convinced beginners are likely to use octal literals on purpose (or even by accident).

# Claude Pache (10 years ago)

On 2 Jan 2015, at 22:08, Caitlin Potter <caitpotter88 at gmail.com> wrote:

(...)

> More important, octal escape sequences are a bit liberal, in that they can be of several lengths, with a pretty wide range of delimiters. This, I think, results in many cases where octal escape sequences are used by accident, rather than intentionally. It's a footgun, and ideally that footgun should not be there.

Concretely, the danger is that someone could write "\07" when they mean "\0" + "7". This is a good point. (Were you thinking of other cases when you wrote "many cases"?)
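A small sketch of that hazard, and of the variable lengths mentioned in the quoted text, evaluated in sloppy mode. The helper `readSloppy` is hypothetical; it goes through the `Function` constructor because these literals are syntax errors in strict code.

```javascript
// Hypothetical helper: evaluates an expression, given as source text, in sloppy mode.
function readSloppy(expressionSource) {
  return new Function('return ' + expressionSource + ';')();
}

// "\07" is ONE character (U+0007), not U+0000 followed by "7".
const bell = readSloppy('"\\07"');
const nulThenSeven = readSloppy('"\\0" + "7"');

// An octal escape takes one to three digits, and at most two when the first
// digit is 4 to 7, so where the escape ends depends on what follows it.
const threeDigits = readSloppy('"\\1234"'); // "\123" is "S" (0o123 = 83), then "4"
const twoDigits = readSloppy('"\\400"');    // "\40" is a space (0o40 = 32), then "0"
const stopsAtEight = readSloppy('"\\18"');  // "\1" is U+0001, then "8"

console.log(bell.length, nulThenSeven.length, threeDigits, JSON.stringify(twoDigits));
```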

> I feel like the "refactoring pain" argument is not very compelling, because I am not convinced beginners are likely to use octal literals on purpose (or even by accident).

I agree on that point, and therefore I didn't make any refactoring argument.

# Caitlin Potter (10 years ago)

> I agree on that point, and therefore I didn't make any refactoring argument.

I was referring specifically to C12 (and one other, IIRC)

# Brendan Eich (10 years ago)

Nevertheless, ecmascript#3477#c12 makes a good point. IMHO!

# Claude Pache (10 years ago)

On 4 Jan 2015, at 00:44, Caitlin Potter <caitpotter88 at gmail.com> wrote:

> I was referring specifically to C12 (and one other, IIRC) on bug 3477

(For reference, bug 3477 C12 is here: ecmascript#3477#c12)

In that case, I disagree on that point, but I didn't make any refactoring argument. I mean: the two questions I asked at the beginning of this thread are unrelated to the particular refactoring hazard mentioned in bug 3477 comment 12.