Why should e.g. '\u2xao' throw an error?
On Sat, Mar 17, 2012 at 11:16 AM, Mathias Bynens <mathias at qiwi.be> wrote:
Why should e.g. '\u2xao' throw an error? I can’t find this in the spec, but Test262 actually has a test for this behavior, so I must be missing something obvious.

I know UnicodeEscapeSequence is defined as follows:

    UnicodeEscapeSequence :: u HexDigit HexDigit HexDigit HexDigit

But since x is not a HexDigit, I’d expect '\u2xao' to equal 'u2xao', i.e. \u is an escape for u and the rest of the string is nothing special.

Thanks in advance, Mathias
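For concreteness, here’s how I’d check that reading (a rough sketch; the lenient behavior is engine-specific, so results may vary):

    // In a lenient engine this logs true; in a strictly conforming
    // engine the eval'd program fails to parse and this throws a SyntaxError.
    console.log(eval("'\\u2xao'") === 'u2xao');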
The spec doesn't say that "\u" is an escape for 'u'. That's just implementations trying to be lenient rather than tell the user that his program can't be parsed. I.e., it's a language extension.
In strings (not RegExps), the 'g' in "\g" is matched by the production

    NonEscapeCharacter :: SourceCharacter but not EscapeCharacter or LineTerminator

Since 'u' is an EscapeCharacter, "\u" is not a valid production using NonEscapeCharacter. The only part of the lexical grammar for strings that allows "\u" requires it to be followed by four hex digits.
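To make the rule concrete, here is a minimal sketch of the check a tokenizer would perform (isValidUnicodeEscape is a hypothetical helper, not anything from the spec):

    // src[i] is the character immediately after the backslash.
    // Per UnicodeEscapeSequence :: u HexDigit HexDigit HexDigit HexDigit,
    // the 'u' must be followed by exactly four hex digits.
    function isValidUnicodeEscape(src, i) {
      return src[i] === 'u' && /^[0-9A-Fa-f]{4}$/.test(src.slice(i + 1, i + 5));
    }

    isValidUnicodeEscape('u2xao', 0); // false ('x' is not a HexDigit)
    isValidUnicodeEscape('u00a0', 0); // true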
I.e., the string "\u2xao" can't be parsed by the lexical grammar at all, so you have your syntax error.
On the other hand, I agree that I can't point to a place in the spec where it says that you must throw a SyntaxError if you are unable to parse global code. If it's eval code, it throws in 15.1.2.1 step 2, and if it's an argument to the Function constructor, it throws in 15.3.2.1 step 8 (i.e., where a String is parsed at runtime). In all other places, the spec assumes that both lexical and syntactic parsing succeeded, and describes what to do with the result. If it doesn't parse, the input isn't even ECMAScript, so presumably it's the surrounding system that must report an error.
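Both of those runtime cases are easy to observe, since the SyntaxError is thrown from ordinary code and can be caught (a sketch; the step numbers are the ones cited above):

    try {
      eval("'\\u2xao'");                 // parsing fails: 15.1.2.1 step 2
    } catch (e) {
      console.log(e instanceof SyntaxError); // true
    }

    try {
      new Function("return '\\u2xao';"); // parsing fails: 15.3.2.1 step 8
    } catch (e) {
      console.log(e instanceof SyntaxError); // true
    }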
On Mar 17, 2012, at 4:50 AM, Lasse Reichstein wrote:
... On the other hand, I agree that I can't point to a place in the spec where it says that you must throw a SyntaxError if you are unable to parse global code. ...
See section 16. Any syntax error is an "early error". The process used to report early errors for an ECMAScript Program production is implementation-defined (see the note in section 14), except for the case you mention for eval.
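One observable consequence of "early" here: the error is reported before any part of the Program runs (a sketch, using eval so the error is catchable):

    try {
      eval("var ok = 1; '\\u2xao';"); // the whole program fails to parse
    } catch (e) {
      console.log(typeof ok); // "undefined": nothing was declared or executed
    }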