ES4 draft: Triply quoted string literals
Looks good to me.
Geoff
I'm not sure what the intent is, but as this is written:
"""abc""""def"""
will evaluate to the same string as 'abc""""def'.
Furthermore,
"""\"""
turns into:
\
to which we're then supposed to apply escape processing, but that is not possible because there is no character following the backslash. What happens?
What does the following evaluate to?
"""\n\\"\t"""
Is it the same as "\n\\\"\t"?
If so, then triple quoting seems like an extraneous feature, as we still need to go through and double every backslash located inside the string.
Waldemar
On 04/03/2008, Waldemar Horwat <waldemar at google.com> wrote:
I'm not sure what the intent is, but as this is written:
"""abc""""def"""
will evaluate to the same string as 'abc""""def'.
That's not how I read the spec. As I read it, it will evaluate to the same string as 'abc"', followed by a nonsensical def""" which should trigger a syntax error.
Furthermore,
"""\"""
turns into:
\
to which we're then supposed to apply escape processing, but that is not possible because there is no character following the backslash. What happens?
No it doesn't. That is three quotes followed by an escaped quote followed by two quotes (which do not end the triply quoted string), as I read the spec.
What does the following evaluate to?
"""\n\\"\t"""
Is it the same as "\n\\\"\t"?
Looks to be so, yeah.
If so, then triple quoting seems like an extraneous feature, as we still need to go through and double every backslash located inside the string.
Seems so to me.
-----Original Message-----
From: Waldemar Horwat [mailto:waldemar at google.com]
Sent: 4 March 2008 02:25
To: Lars Hansen
Cc: es4-discuss Discuss
Subject: Re: ES4 draft: Triply quoted string literals
I'm not sure what the intent is, but as this is written:
"""abc""""def"""
will evaluate to the same string as 'abc""""def'.
It will not. The text of the spec is "The literal is terminated by the earliest sequence of three unescaped instances of the same quote character that is not followed by a fourth quote character of the same kind." So the string is the four-letter sequence abc". Perhaps the sentence would be even clearer if the word "immediately" were to precede the word "followed".
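The reading Lars describes can be sketched as a small scanner. This is only an illustration in Python, not text from the spec: the name scan_triple_quoted is made up here, and escape processing of the body is deliberately left out. It assumes a backslash escapes the next character and stops at the first run of three quotes that is not immediately followed by a fourth.

```python
def scan_triple_quoted(src, start=0):
    """Return (raw body, index just past the terminator) for the literal at src[start:].

    Hypothetical helper for illustration; the body is returned without escape processing.
    """
    quote = src[start]
    if quote not in "\"'" or src[start:start + 3] != quote * 3:
        raise SyntaxError("not a triple-quoted literal")
    i = start + 3
    while i < len(src):
        if src[i] == "\\":
            i += 2  # a backslash escapes the next character, quotes included
            continue
        # Terminator: three quotes NOT immediately followed by a fourth.
        if src.startswith(quote * 3, i) and not src.startswith(quote, i + 3):
            return src[start + 3:i], i + 3
        i += 1
    raise SyntaxError("unterminated triple-quoted literal")


source = '"""abc""""def"""'
body, end = scan_triple_quoted(source)
print(repr(body))          # 'abc"'
print(repr(source[end:]))  # 'def"""' is left over for the rest of the lexer
```

Under this reading the first example yields the four characters abc" and leaves def""" behind as trailing input, while three quotes, a backslash, and three more quotes never reaches a terminator and is reported as unterminated.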
Furthermore,
"""\"""
turns into:
\
It does not, this is an unterminated triple-quoted string according to the rule above.
What does the following evaluate to?
"""\n\\"\t"""
The four-character string U+000A U+005C U+0022 U+0009.
Is it the same as "\n\\\"\t"?
It is.
It is also the same as """<LF>\\"<TAB>""" where <LF> denotes a literal U+000A (or U+000D, or a U+000D U+000A pair; cf. the proposal on line terminator normalization). Note the unescaped quote character.
If so, then triple quoting seems like an extraneous feature, as we still need to go through and double every backslash located inside the string.
The primary purpose of triple quoting is not to avoid escaping backslashes, but to avoid escaping quotes and to avoid having to convert literal line breaks to \n characters. There is a small amount of discussion on the original proposal page (reference [1] from the spec page).
(I too like the r"..." syntax of Python, but that is not what we have here.)
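For comparison only, here is how Python's own triple quoting behaves on the points raised above. This shows Python semantics, not the ES4 draft; whether the draft matches Python in every detail is exactly what the thread is asking. The variable names are arbitrary.

```python
# Quotes and line breaks need no escaping inside triple quotes.
plain = "She said \"hi\"\nand left."
triple = """She said "hi"
and left."""
print(plain == triple)

# Backslash escapes are still processed inside triple quotes...
tab_len = len("""a\tb""")          # a, TAB, b
backslash_len = len("""\\""")      # one backslash

# ...unless the literal carries Python's raw prefix.
raw_backslash_len = len(r"""\\""")  # two backslashes
print(tab_len, backslash_len, raw_backslash_len)
```

So in Python, too, triple quoting alone does not remove the need to double backslashes; that is the job of the separate r"..." form.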
Lars Hansen wrote:
I'm not sure what the intent is, but as this is written:
"""abc""""def"""
will evaluate to the same string as 'abc""""def'.
It will not. The text of the spec is "The literal is terminated by the earliest sequence of three unescaped instances of the same quote character that is not followed by a fourth quote character of the same kind." So the string is the four-letter sequence abc". Perhaps the sentence would be even clearer if the word "immediately" were to precede the word "followed".
From the discussions on the list I knew what you meant, but as written this is ambiguous. I read it as a string of four or more quote characters not being eligible to be a terminator, so you skip it and look for the next sequence.
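The alternative reading described here can be sketched the same way. Again this is only an illustration in Python with a made-up name (scan_skipping_long_runs), assuming that a run of quotes terminates the literal only when it is exactly three long and that any other run is kept as part of the body.

```python
def scan_skipping_long_runs(src, start=0):
    """Alternative reading: only a run of exactly three quotes terminates.

    Hypothetical helper for illustration; returns (raw body, index past terminator).
    """
    quote = src[start]
    if quote not in "\"'" or src[start:start + 3] != quote * 3:
        raise SyntaxError("not a triple-quoted literal")
    i = start + 3
    while i < len(src):
        if src[i] == "\\":
            i += 2  # skip the escaped character
            continue
        if src[i] == quote:
            run_end = i
            while run_end < len(src) and src[run_end] == quote:
                run_end += 1
            if run_end - i == 3:
                return src[start + 3:i], run_end
            i = run_end  # runs of 1, 2, or 4+ quotes stay in the body
            continue
        i += 1
    raise SyntaxError("unterminated triple-quoted literal")


body, end = scan_skipping_long_runs('"""abc""""def"""')
print(repr(body))  # 'abc""""def'
```

With this rule the example consumes the whole input and produces abc""""def, which is the result given in the original question.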
Waldemar
Waldemar Horwat wrote:
I'm not sure what the intent is, but as this is written:
"""abc""""def"""
will evaluate to the same string as 'abc""""def'.
It will not. The text of the spec is "The literal is terminated by the earliest sequence of three unescaped instances of the same quote character that is not followed by a fourth quote character of the same kind." So the string is the four-letter sequence abc". Perhaps the sentence would be even clearer if the word "immediately" were to precede the word "followed".
From the discussions on the list I knew what you meant, but as written this is ambiguous. I read it as a string of four or more quote characters not being eligible to be a terminator, so you skip it and look for the next sequence.
By the same argument, you could say that the next triple-quote sequence is nothing but a double quote followed by a single quote. Writing a lexer that recognizes a triple-quoted string like yours above in the way you described could quickly become very complicated, not to mention the human reader, who would also struggle to spot the proper string termination in such source code.
I strongly favor Lars' point of view, under which the above string should be considered a syntax error.
Michael
Given the apparent lack of enthusiasm for this proposal, and the call for a new design voiced in last night's phone conference, I propose that we simply reject the triply-quoted string proposal.
IMO a minor adjustment to the triply-quoted string proposal to address its shortcomings would be reasonable, but I do not think that it is appropriate to propose a new, competing design at this time.
On Mar 5, 2008, at 10:02 AM, Lars Hansen wrote:
Given the apparent lack of enthusiasm for this proposal, and the call for a new design voiced in last night's phone conference, I propose that we simply reject the triply-quoted string proposal.
IMO a minor adjustment to the triply-quoted string proposal to address its shortcomings would be reasonable, but I do not think that it is appropriate to propose a new, competing design at this time.
I agree, and I certainly wasn't going back to the drawing board. The only issues AFAIK were:
- Do we need this in light of the line continuation support? But continuations are not enough, and ugly to boot: backslash-escaped newlines in strings are elided, so there's no newline left as a character in the string. Triple quoting does give verbatim multiline string literals where continuations do not.
- Do we want any backslash-escaping other than the quote character in the string? I said I would check the Python design docs and code.
These were more on the order of doubting the utility of triple-quoting, and asking whether it matches Python. Not requests for a brand-new and different spec.
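The difference between a line continuation and a verbatim multiline literal can be seen in Python, which has both. This is an analogy under Python's rules rather than a statement about the ES4 draft, and the variable names are arbitrary.

```python
# Line continuation: the backslash and the newline are both elided,
# so the resulting string contains no line break at all.
continued = "first line \
second line"

# Triple quoting: the literal line break survives as a newline character.
verbatim = """first line
second line"""

print(repr(continued))
print(repr(verbatim))
```

The continued string is a single line, while the triple-quoted one keeps its newline, which is the point made in the first item above.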
Thanks for the clarification. Tickets/clarifications on these issues are IMO in scope.
Please comment.