Consistency in the Negative Result Values Through Expansion of null's Role

# Erik Reppen (13 years ago)

This topic has probably been beaten to death years before I was even aware of es-discuss but it continues to get mentioned occasionally as a point of pain so I thought I'd see if I couldn't attempt to hatch a conversation and maybe understand the design concerns better than I likely do now.

Consistent Type Return for Pass and Fail?

The principle of consistent type-return has occasionally skewered me as somebody who came to non-amateur levels of understanding code primarily through JavaScript. I can see the value in maintaining consistent types for positive results but not so much for indicators that you didn't get anything useful. For instance:

  • [0,1].indexOf('wombat'); //returns an index on success or -1 to indicate failure. -1, passed on to a lot of other array methods, of course indicates the last element (see the sketch after this list). If you'd asked me the day I made that mistake I could have told you indexOf probably returns -1 on failure to find something, but it didn't occur to me in the moment.

  • 'wombat'.charAt(20); //returns an empty string, but that's a concrete value whereas 'wombat'[20] returns undefined
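
A small sketch of the off-by-one style mistake described in the first bullet, with made-up data:

var tags = ['a', 'b', 'c'];
var i = tags.indexOf('wombat'); // -1: 'wombat' is not in the array
tags.splice(i, 1);              // no error: splice reads -1 as "the last element" and removes 'c'
console.log(tags);              // ['a', 'b']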

Is consistent type return a heuristic carried over from more strictly-typed paradigms or would it murder performance of the native methods to do the logic required to return something like null in these cases? In a dynamic language, why not focus on more consistent return types across the board for an indicator that you won't be getting particularly handy results?

Generic Fail Values

I suspect I'm in the minority but I actually like the variety in the more generic negative-result/failure values like undefined, null and NaN, since they can help you understand the nature of a problem when they show up in unexpected places, but more consistency of implementation and clarity about what they mean would definitely be valuable.

Here are my assumptions about the intent of the following values. Please correct me if I'm wrong:

  • undefined - Makes sense to me as-typically implemented (possibly 100% consistently as I can't think of exceptions). You tried to access something that wasn't there. Only happens when a function actually returns a reference to something holding that value or doesn't define something to return in the first place, or via any property access attempt that doesn't resolve for the indicated property name/label.

  • NaN - Something is expected to evaluate as a number but that's not really possible due to the rules of arithmetic or a type clash. In some cases it seems as if the idea is to return NaN any time a number return was expected but for some reason couldn't be achieved, which as a heuristic doesn't seem like such a hot idea to me.

  • null - Indicates an absence of value. There were no regEx matches in a string, for instance.

How I'd prefer to see them:

  • undefined - as is. It seems like the most consistently implemented of the lot and when I spot an undefined somewhere unexpected it only takes 1-2 guesses to sort out what's going wrong typically.

  • NaN - It can tell you a lot about what kind of thing went wrong but given its not-equal-to-itself nature it can be a nasty return value when unexpected (see the sketch after this list). For instance, 'wombat'.charCodeAt(20) returns NaN. How does this make sense in the context of JavaScript? Yes, I'm trying to get a number but from what I would assume (in complete ignorance of unicode evaluation at the lower level) is some sort of look-up table. I'm not trying to divide 'a' by 2, parseInt('a') or get the square root of a negative number. It's as counter-intuitive as indexOf returning NaN on a failure to find a matching element value. A highly specific return value like NaN only seems ideal to me when the user-placed value responsible is an operand or a single argument to a simpler method that is one step away from evaluating the arg as a number or failing to do so.

  • null - As typically implemented but more universally and broadly. I'd like to see null in core methods acting more as a catch-all when dealing with something like a NaN that resulted from operations that don't directly hit a single obvious argument. Essentially a message from core methods telling you, "There's no error but I can't do anything useful with these arguments." Examples: There is no index for a value that can't be found in an array. No matches were possible with that regEx. A more complicated method that could be attempting to access something in its instance that's not there or have trouble with a number of args runs into trouble and returns null on the principle that it's better to be general than to misdirect.
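
A short illustration of why an unexpected NaN is awkward to branch on (the sketch mentioned in the NaN bullet above; the charCodeAt call is the same out-of-range example):

var code = 'wombat'.charCodeAt(20); // NaN: index 20 is out of range

code === NaN;   // false -- NaN is never equal to anything, itself included
code !== code;  // true  -- the classic self-inequality test
isNaN(code);    // true, but isNaN('wombat') is also true because isNaN coerces its argument
code === null;  // false -- so there is no single "did I get nothing?" comparison that covers both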

An overly explicitly named method to make my point:

someImaginaryCoreMethodThatGetsAnArrayValueViaSomeArrayKeyAndDividesByTwo(someArrayKey)

So basically when the method takes that array key, gets an undefined value with it, tries to divide undefined by 2, and gets NaN, what's the most helpful return value from a less experienced user's perspective? Is the array key undefined or not a number? Is the array element undefined or not a number?

Or would it be easier to branch your logic consistently when you can only expect to worry about specifics like NaN at an operator context or for very concise/straightforward casting/conversion basic operation methods like parseInt or someMethodThatDividesArgByTwo(arg)? And then typically expect null when there are multiple args or multiple steps to a process, where a number of things could result in a confusing NaN or some other more specific return value that only indicates the last thing in an often unobvious cascade of things to go wrong?

With a broader complexity-catch-all-null-on-fail-return policy, branching for these sorts of values becomes simpler and more predictable. NaN is more likely to be something happening in one of my own functions where I can actually see the operation or the very straightforward convert-arg-style core-method call responsible, and undefined is almost always the result of a property typo, a failure to declare a property, or an array index for an element that doesn't exist.

And null results would mean I'm likely dealing with less simple core methods that have more than one argument or multiple-step processes. Since in such cases I'm most likely going to need to check the args I'm passing and the state of the object a method is acting on anyway, why should I worry about whether I should be handling empty strings, NaN, -1, undefined or null when all but null could easily suggest a specific problem that isn't actually the first link in the chain in the first place?

So for the sake of consistency/sanity in future methods, at least, how about establishing the following guidelines somewhere on the usage of these values?

  • More specific negative-result values are reserved for simple statements and very simple one-arg methods that operate directly on the value of your argument
  • Just about anything else returns null for non-positive-result scenarios where more specific returns don't necessarily clarify and could confuse things.
  • Ditch consistent typing approaches if that's not a lower-level perf thing.

# Rick Waldron (13 years ago)

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen <erik.reppen at gmail.com> wrote:

This topic has probably been beaten to death years before I was even aware of es-discuss but it continues to get mentioned occasionally as a point of pain so I thought I'd see if I couldn't attempt to hatch a conversation and maybe understand the design concerns better than I likely do now.

Consistent Type Return for Pass and Fail?

The principle of consistent type-return has occasionally skewered me as somebody who came to non-amateur levels of understanding code primarily through JavaScript. I can see the value in maintaining consistent types for positive results but not so much for indicators that you didn't get anything useful. For instance:

  • [0,1].indexOf('wombat'); //returns an index on success or -1 to indicate failure. -1 passed on to a lot of other array methods of course, indicates the last element. If you'd asked me the day I made that mistake I could have told you indexOf probably returns -1 on failure to find something but it didn't occur to me in the moment.

It would be far worse to have a different type of value as a return, right?

  • 'wombat'.charAt(20); //returns an empty string, but that's a concrete value whereas 'wombat'[20] returns undefined

For the same reason indexOf always returns a number, charAt always returns a string.

"wombat"[20] will dereference the string at an index that doesn't exist, which means it's undefined.

Is consistent type return a heuristic carried over from more strictly-typed paradigms or would it murder performance of the native methods to do the logic required to return something like null in these cases? In a dynamic language, why not focus on more consistent return types across the board for an indicator that you won't be getting particularly handy results?

It would break the web.

So for the sake of consistency/sanity in future methods, at least, how about establishing the following guidelines somewhere on the usage of these values?

  • More specific negative-result values are reserved for simple statements and very simple one-arg methods that operate directly on the value of your argument
  • Just about anything else returns null for non-positive-result scenarios where more specific returns don't necessarily clarify and could confuse things.
  • Ditch consistent typing approaches if that's not a lower-level perf thing.

I'm confident that abandoning existing ES precedents will only create unnecessary confusion.

# David Bruant (13 years ago)

On 16/08/2012 00:35, Rick Waldron wrote:

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen <erik.reppen at gmail.com> wrote:

Is consistent type return a heuristic carried over from more strictly-typed paradigms or would it murder performance of the native methods to do the logic required to return something like null in these cases? In a dynamic language, why not focus on more consistent return types across the board for an indicator that you won't be getting particularly handy results?

It would break the web.

I agree and would like to encourage you (Erik) to read the foreword of my "ECMAScript regrets" project DavidBruant/ECMAScript-regrets#foreword

The only way forward to "fix" broken parts of JavaScript (like making it more consistent) is to give up on JavaScript and create a new language that compiles down to JavaScript. That's my opinion at least.

# Brendan Eich (13 years ago)

David Bruant wrote:

On 16/08/2012 00:35, Rick Waldron wrote:

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen <erik.reppen at gmail.com> wrote:

Is consistent type return a heuristic carried over from more strictly-typed paradigms or would it murder performance of the native methods to do the logic required to return something like null in these cases? In a dynamic language, why not focus on more consistent return types across the board for an indicator that you won't be getting particularly handy results?

It would break the web.

I agree and would like to encourage you (Erik) to read the foreword of my "ECMAScript regrets" project DavidBruant/ECMAScript-regrets#foreword

The only way forward to "fix" broken parts of JavaScript (like making it more consistent) is to give up on JavaScript and create a new language that compiles down to JavaScript. That's my opinion at least.

There's another way: extend JS with fixed forms and hope the broken ones die eventually.

Both compile-to-JS and evolve-JS-to-be-better are happening, and the former informs the latter directly (=> and for-of from CoffeeScript are in ES6, e.g.). Both should, since JS is at this point a huge public good, like it or not.

# David Bruant (13 years ago)

On 16/08/2012 01:31, Brendan Eich wrote:

There's another way: extend JS with fixed forms and hope the broken ones die eventually.

I agree, but "hope" hasn't given the web a lot so far unfortunately.

Also, it should work fine for syntax, but built-in libraries (like what Erik wants to see fixed) seem to be a different story. For instance, Number.isNaN is a fixed form, but I'm not sure it'll be enough to see isNaN die.

Both compile-to-JS and evolve-JS-to-be-better are happening, and the former informs the latter directly (=> and for-of from CoffeeScript are in ES6, e.g.). Both should, since JS is at this point a huge public good, like it or not.

I could not agree more.

# Erik Reppen (13 years ago)

Well I went a bit long as I tend to when geeking out on stuff but I think the topic of how these elements of the language could be used better moving forward if we're stuck with 'em is interesting. I really do find it's helpful in debug to have different ways of saying 'fail.'

With these new JIT interpreters moving as fast as they are, I wonder if it wouldn't be completely ridiculous to eventually attempt to handle multiple versions of JS.

It seems like handling multiple HTML doctypes became completely unworkable for a lot of reasons I've only started to understand, however, so I'm not sure how naive of an idea that is.

What I love about JS is that anything you don't like, you can bend, fold, mutilate, and warp into a shape that you do. That's why I'm a bit puzzled by the popularity of the compile-down approaches.

# Norbert Lindenberg (13 years ago)

On Aug 15, 2012, at 15:35, Rick Waldron wrote:

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen <erik.reppen at gmail.com> wrote:

  • 'wombat'.charAt(20); //returns an empty string, but that's a concrete value whereas 'wombat'[20] returns undefined

For the same reason indexOf always returns a number, charAt always returns a string.

"wombat"[20] will dereference the string at an index that doesn't exist, which means it's undefined.

I think undefined would have been a fine return value for 'wombat'.charAt(20) - you asked for something that's not there. The expected return value in the "it's there" case is a one-code-unit string, so an empty string doesn't meet expectations anyway.

  • NaN - It can tell you a lot about what kind of thing went wrong but given its not-equal-to-itself nature it can be a nasty return value when unexpected. For instance, 'wombat'.charCodeAt(20) returns NaN. How does this make sense in the context of JavaScript? Yes, I'm trying to get a number but from what I would assume (in complete ignorance of unicode evaluation at the lower level) is some sort of look-up table. I'm not trying to divide 'a' by 2, parseInt('a') or get the square root of a negative number.

In this case NaN is clearly wrong, because what the caller expects is a code unit, an integer between 0 and 0xFFFF. You have to check for NaN just like you have to check for undefined, but undefined would have been the normal JavaScript result for "nothing there".

It's too late to fix charCodeAt, but for the new codePointAt I'm proposing undefined as the "nothing there" result. norbertlindenberg.com/2012/05/ecmascript-supplementary-characters/index.html#String
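
For concreteness, the out-of-range behaviors being contrasted here; the codePointAt line follows Norbert's proposal (which is also how it eventually shipped):

'wombat'.charAt(20);      // ''        -- a concrete, if empty, string
'wombat'[20];             // undefined -- ordinary property access on a missing index
'wombat'.charCodeAt(20);  // NaN       -- a number-typed "nothing there"
'wombat'.codePointAt(20); // undefined -- the proposed "nothing there" result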

Norbert

# Andreas Rossberg (13 years ago)

On 16 August 2012 00:35, Rick Waldron <waldron.rick at gmail.com> wrote:

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen <erik.reppen at gmail.com> wrote:

  • [0,1].indexOf('wombat'); //returns an index on success or -1 to indicate failure. -1 passed on to a lot of other array methods of course, indicates the last element. If you'd asked me the day I made that mistake I could have told you indexOf probably returns -1 on failure to find something but it didn't occur to me in the moment.

It would be far worse to have a different type of value as a return, right?

Actually, no. It is far better to have something that produces failure as immediately and as reliably as possible when used improperly. Sentinel values are a prominent anti-pattern.

However, I agree that there is no chance of fixing that for existing libraries.

# David Bruant (13 years ago)

On 16/08/2012 03:11, Erik Reppen wrote:

What I love about JS is that anything you don't like, you can bend, fold, mutilate, and warp into a shape that you do. That's why I'm a bit puzzled by the popularity of the compile-down approaches.

You technically can change the shape of the JS environment mostly as you wish, but most of the time, you "socially" can't. It has become the social norm to not touch the environment so that scripts from different sources can "safely" coexist within the same webpage. The only socially accepted exception is polyfills, because they implement a standard feature and so are indistinguishable from a native environment.

At the end of the day, very few people actually change the JS environment to make it as they wish, mostly because most people use other people's code, which has expectations about the JS environment (I think it's probably just as true for node.js because of modules).

Compile-down allows you to start fresh with social conventions.

# Rick Waldron (13 years ago)

On Thursday, August 16, 2012 at 5:24 AM, Andreas Rossberg wrote:

On 16 August 2012 00:35, Rick Waldron <waldron.rick at gmail.com> wrote:

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen <erik.reppen at gmail.com> wrote:

  • [0,1].indexOf('wombat'); //returns an index on success or -1 to indicate failure. -1 passed on to a lot of other array methods of course, indicates the last element. If you'd asked me the day I made that mistake I could have told you indexOf probably returns -1 on failure to find something but it didn't occur to me in the moment.

It would be far worse to have a different type of value as a return, right?

Actually, no. It is far better to have something that produces failure as immediately and as reliably as possible when used improperly. Sentinel values are a prominent anti-pattern.

Rereading my initial response now, it's not clear that I meant that changing existing APIs to return a different type of value would be worse. So yes, I agree with your correction and I apologize for being unclear.

# Erik Reppen (13 years ago)

@Rick Bah, mailing-lists. Sorry about the dupe e-mail. I forgot to reply to all.

I absolutely agree with not permanently altering existing core library methods if you can avoid it. I guess I was thinking more of jQuery's adapter/decorator approach vs the somewhat more extreme (IMO) step of down-compiling to a functional language that lets you alter property context at will. Although of course for stuff like arrow funcs today rather than tomorrow, that's the only option you have and being something of a perpetually recovering curmudgeon/purist I haven't given coffee a fair shake yet.

You could also take a pocket dimension approach where you swap out prototypes in a re-usable function that sets things more to your liking and then cleans up after itself.

myPocketD(function () {
  // altered proto methods swapped before this func arg is fired and then swapped back after it closes
  [0,1].join(''); // alerts 'Somebody did something awful to the Array join method!'
  // you could pretty much write your entire app in spaces like this and then
  // plug in third party stuff without fear of blowing anything up.
});

[0,1].join(''); //returns '01' as expected
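
A minimal sketch of what such a pocket-dimension helper could look like; myPocketD and the replacement join behavior are hypothetical, taken from the example above (console.log stands in for the alert):

function myPocketD(fn) {
  var originalJoin = Array.prototype.join;  // remember the real method
  Array.prototype.join = function () {      // swap in the altered version
    return 'Somebody did something awful to the Array join method!';
  };
  try {
    fn();                                   // run the wrapped code against the altered prototype
  } finally {
    Array.prototype.join = originalJoin;    // always swap the original back, even if fn throws
  }
}

myPocketD(function () {
  console.log([0, 1].join('')); // 'Somebody did something awful to the Array join method!'
});

console.log([0, 1].join('')); // '01' as expected outside the pocket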

# Erik Reppen (13 years ago)

Yeah, I could've been more clear. I didn't expect anybody to rewrite JS in two weeks and the vendors to implement a dual-interpreter that would patch things up with a new 'use erik.preferences'; statement (although if all relevant parties were to offer... I have a birthday next year, that's all I'm saying), but I do wonder if there's some value to these various fail values that could be beneficial if implemented more consistently with new methods moving forward. Or at the very least make null the JS international word for 'nope.'

But why do people consider sentinel values anti-patterns? I've seen some disagreement over this here and there (mostly on stackOverflow) and cursory googling before work hasn't been enough to find good articles on the subject. Wouldn't that be more of a language-specific concern than a general anti-pattern? I haven't spent much time writing anything in an explicit return-type/param-type-declared language but I can see how it would be obnoxious in any paradigm closer to that.

I would think you would either always want to send the same fail signal consistent with the return-context and type of positive return values in a strictly typed language (e.g. anything returning integers representing array indexes always gives you -1 for fail) or have a universal signal for failure in a more dynamic language like JS. I understand how handing off a variety of types for success could get ugly or be seen as unpredictable even in a more dynamic language, but being able to more predictably detect a negative result strikes me as useful for not having to research and write different checks on a per-method basis for somebody who is new to the language or a bit dizzy occasionally like myself.

It's enough of a concern to me that I'd probably consider wrapping every type in my own factories to write adapting methods with more predictable returns if I were working on a fairly extensive library or framework project.

# Peter van der Zee (13 years ago)

On Thu, Aug 16, 2012 at 12:02 AM, Erik Reppen <erik.reppen at gmail.com> wrote:

So for the sake of consistency/sanity in future methods, at least, how about establishing the following guidelines somewhere on the usage of these values?

  • More specific negative-result values are reserved for simple statements and very simple one-arg methods that operate directly on the value of your argument
  • Just about anything else returns null for non-positive-result scenarios where more specific returns don't necessarily clarify and could confuse things.
  • Ditch consistent typing approaches if that's not a lower-level perf thing.

Could introduce a "fail" primitive type whose primitive value is the (possibly empty) message explaining what went wrong. The value would always behave as false except for toString() cases and strict comparison. Could even return false for typeof and use a .isFail() to detect. Probably some more semantics to hash out, but I think you get the gist of it.

I'm sure nobody wants to add another type to the language though ;) It's just an idea.
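
A userland approximation of part of that idea (a tagged failure value that carries a message and supports an isFail() check); the falsiness and typeof behavior Peter describes would need engine support and is not emulated here:

function Fail(message) {
  if (!(this instanceof Fail)) return new Fail(message); // allow calling without new
  this.message = message || '';
}
Fail.prototype.isFail = function () { return true; };
Fail.prototype.toString = function () { return this.message; };

function isFail(value) { return value instanceof Fail; }

var result = Fail('no match for that pattern');
isFail(result);  // true
String(result);  // 'no match for that pattern'
isFail(-1);      // false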

# Brendan Eich (13 years ago)

Norbert Lindenberg wrote:

On Aug 15, 2012, at 15:35, Rick Waldron wrote:

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen<erik.reppen at gmail.com> wrote:

  • 'wombat'.charAt(20); //returns an empty string, but that's a concrete value whereas 'wombat'[20] returns undefined

For the same reason indexOf always returns a number, charAt always returns a string.

"wombat"[20] will dereference the string at an index that doesn't exist, which means it's undefined.

I think undefined would have been a fine return value for 'wombat'.charAt(20) - you asked for something that's not there. The expected return value in the "it's there" case is a one-code-unit string, so an empty string doesn't meet expectations anyway.

But you are requiring all clients who take the in-band return value to be a string and concatenate it, e.g., to check and avoid cat'ing "undefined".

Having a monotyped return value is common in scripting languages and it avoids requiring callers to write a refutable match. Irrefutable FTW again!
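
Spelled out, the concatenation hazard being avoided:

'Duck' + 'Pig'.charAt(20) + 'Cow'; // 'DuckCow'          -- the empty string concatenates harmlessly
'Duck' + 'Pig'[20] + 'Cow';        // 'DuckundefinedCow' -- undefined gets stringified into the result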

  • NaN - It can tell you a lot about what kind of thing went wrong but given its not-equal-to-itself nature it can be a nasty return value when unexpected. For instance, 'wombat'.charCodeAt(20) returns NaN. How does this make sense in the context of JavaScript? Yes, I'm trying to get a number but from what I would assume (in complete ignorance of unicode evaluation at the lower level) is some sort of look-up table. I'm not trying to divide 'a' by 2, parseInt('a') or get the square root of a negative number.

In this case NaN is clearly wrong, because what the caller expects is a code unit, an integer between 0 and 0xFFFF. You have to check for NaN just like you have to check for undefined, but undefined would have been the normal JavaScript result for "nothing there".

You don't always have to check for NaN, again. Code that is testing for a certain code unit using === or even == (because the monotyped return value ensures number) will get guaranteed mismatch with NaN.

It's true that undefined converts ToNumber resulting in NaN, so this case is a bit better than the first one -- no string concatenation issue.
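
The earlier point about === needing no separate NaN check, as a tiny example:

var code = 'wombat'.charCodeAt(20); // NaN for the out-of-range index
if (code === 0x20) {
  // never reached: NaN mismatches every value, so no extra "was it NaN?" branch
  // is needed when the code only cares about specific code units
}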

It's too late to fix charCodeAt, but for the new codePointAt I'm proposing undefined as the "nothing there" result. norbertlindenberg.com/2012/05/ecmascript-supplementary-characters/index.html#String

Why is undefined better than NaN? Aesthetics about "nothing there" are not really relevant. You should make a case based on common code patterns and what happens if there is no extra out-of-band return value check (which there need not be in many cases, e.g. === or ==).

# Brendan Eich (13 years ago)

Andreas Rossberg wrote:

On 16 August 2012 00:35, Rick Waldron<waldron.rick at gmail.com> wrote:

On Wed, Aug 15, 2012 at 6:02 PM, Erik Reppen<erik.reppen at gmail.com> wrote:

  • [0,1].indexOf('wombat'); //returns an index on success or -1 to indicate failure. -1 passed on to a lot of other array methods of course, indicates the last element. If you'd asked me the day I made that mistake I could have told you indexOf probably returns -1 on failure to find something but it didn't occur to me in the moment.

It would be far worse to have a different type of value as a return, right?

Actually, no. It is far better to have something that produces failure as immediately and as reliably as possible when used improperly. Sentinel values are a prominent anti-pattern.

Spoken like a true ML'er :-P. Also, you mention failure and it's true that lack of try/catch in original-JS and ES1&2 meant APIs had to overload return values with out-of-band failure codes.

However, JS hackers do not write refutable match expressions cracking return value using typeof or better. That is the anti-pattern here.

-1 as an out-of-band (for non-negative indexes) return code is actually easier to test and works well with certain code patterns, compared to a differently typed code. JS is not alone here, tons of precedent even in other dynamic languages that do have try/catch and even matching.

The issue is "failure". Sometimes not finding the wanted character index is not failure but just a condition to test that leads to an alternative stratagem. Then the conciseness and clarity of the test matters, and typeof rv != "number" is the wrong tool compared to rv < 0.
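
The shape of the test Brendan has in mind, as a minimal sketch (the string is made up):

var s = 'user@example.com';
var i = s.indexOf('@');
if (i < 0) {
  // alternative strategy: no '@' present
} else {
  var domain = s.slice(i + 1); // 'example.com'
}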

However, I agree that there is no chance of fixing that for existing libraries.

That's another thing. Deviating from existing patterns for new APIs that have similar names and motivations is going to be a pain for some users. It's not obvious to me that we have a real "failure must be prompt" problem here that justifies breaking with convention.

# Erik Reppen (13 years ago)

Well, I was thinking !== null for most tests I guess but I could see potential for typeof in the simpler methods that return other stuff.

So by this standard would:

'squirrel'.match(/wombat/);

be better if it returned an empty array rather than null? If that's the case, then I guess I wanted to clear the forest for the sake of a few missing trees.

In what cases does a null-returning method make sense from a language-design-perspective? Or is null also a side-effect of not having try/catch from the start?

# Andreas Rossberg (13 years ago)

On 16 August 2012 23:47, Brendan Eich <brendan at mozilla.org> wrote:

Andreas Rossberg wrote:

On 16 August 2012 00:35, Rick Waldron<waldron.rick at gmail.com> wrote:

It would be far worse to have a different type of value as a return, right?

Actually, no. It is far better to have something that produces failure as immediately and as reliably as possible when used improperly. Sentinel values are a prominent anti-pattern.

Spoken like a true ML'er :-P.

Well, in the pre-ML world I've been brought up in, it was still considered good software engineering practice, independent of language, static or dynamic (and ML has its share of violations, too ;) ).

-1 as an out-of-band (for non-negative indexes) return code is actually easier to test

Why is it easier to compare against constants -1, NaN, or "" than to, say, null or undefined? And if so, wouldn't that be something that might be worth correcting? Question mark operator, anyone?

let index = s.find(...)
if (?index) { // equivalent to index !== undefined
  ...
} else {
  // err...
}

and works well with certain code patterns, compared to a differently typed code. JS is not alone here, tons of precedent even in other dynamic languages that do have try/catch and even matching.

There is a lot of precedence in other languages and systems, too (esp of C heritage). At the same time, it also seems to be folklore that it is not the best way to approach API design. And all things equal, wouldn't we want JS to be better than other languages? :)

Ultimately, it boils down to careful API design. For some operations there is a useful neutral value to return in edge cases. For others, there isn't.

The issue is "failure". Sometimes not finding the wanted character index is not failure but just a condition to test that leads to an alternative stratagem.

Yes, but it better be failure to forget the test! Returning undefined ensures that relatively reliably, returning a sentinel index doesn't.

Then the conciseness and clarity of the test matters, and typeof rv != "number" is the wrong tool compared to rv < 0.

I agree that typeof is the wrong tool, but rv !== undefined seems reasonable to me.

# Brendan Eich (13 years ago)

Erik Reppen wrote:

Well, I was thinking !== null for most tests I guess but I could see potential for typeof in the simpler methods that return other stuff.

So by this standard would:

'squirrel'.match(/wombat/);

be better if it returned an empty array rather than null?

Probably not. Again, the main consideration is usability. If you want to catch mismatch -- and regexp matchers do, often -- then with an empty array on mismatch, you'd have to write

var m = r.match(s);
if (m.length == 0) { /* mismatch handling here */ }

or trickier equivalents, but not

if (!m) { /* mismatch handling here */ }

While empty array does convert to empty string, the match array in general is not used as a concatenation part. Any captured groups would add extra elements to the array, and a /g flag would make all the matches pop out. The "blind concatenation" use-case isn't there, in my experience.

Experience is what drives good API design, namely user experience and usability. Most match users want to know about mismatch and handle it. Whether mismatch is "failure" or "fallback", it's a fork in the program's control flow, most of the time.

This is much less so the case with charAt.

If that's the case, then I guess I wanted to clear the forest for the sake of a few missing trees.

Systematizing a-priori based on a theory of failure or a one-true OOB return value is probably a mistake. Sometimes -1 works, other times "", yet others null or even a thrown exception. The design depends on the particulars. Channel Aristotle, not Plato.

In what cases does a null-returning method make sense from a language-design-perspective? Or is null also a side-effect of not having try/catch from the start?

Null has its uses. If you think of undefined as "no value" and null as "no object" and use undefined for any-value-returning methods, and null for maybe-object returning methods, you won't go too wrong.
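
A couple of existing methods already follow that split:

'squirrel'.match(/wombat/); // null      -- a maybe-object API: no match object to give you
/wombat/.exec('squirrel');  // null      -- same maybe-object convention
({ duck: 1 }).pig;          // undefined -- a maybe-any-value access: no value there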

# Brendan Eich (13 years ago)

Erik Reppen wrote:

myPocketD(function () {
  // altered proto methods swapped before this func arg is fired and then swapped back after it closes
  [0,1].join(''); // alerts 'Somebody did something awful to the Array join method!'
  // you could pretty much write your entire app in spaces like this and then
  // plug in third party stuff without fear of blowing anything up.
});

Is join('') awful? It has its uses, along with an identity (when you know a string has no non-BMP chars encoded as pairs):

s.split('').join('') === s

Now and then, split('') followed by array processing followed by join('') works best.

Also, building an array and then doing a.join('') to get a string used to be a perf win over +-based string concatenation in IE, at least.

# Erik Reppen (13 years ago)

I think at this time I would definitely favor consistent negative returns in a higher-level user-defined library or framework where the nuts-and-bolts type stuff is dealt with for you, but I think I can see the point of not going that route for the core language API. In any case, I'm confident I can't really make an informed argument/decision until I fill in some of these hedge-mage developer knowledge gaps around the likely usage-scenarios/patterns/formal-comp-sci-training-"duh"-algorithms being considered for the return values that puzzle me.

Thanks all for the thoughtful replies. It definitely gave me some food for thought and new targets to hit on the self-improvement list.

# Erik Reppen (13 years ago)

Sorry: mailing-list noob and I forgot to CC es-discuss.

Nothing wrong with join(''). If anything I'm a bit overly fond of split/join approaches to problems. My only point was that I could do something awful to the join method (like changing it to alert that stupid message regardless of args) only in the context of that function and then swap the original method back in before it interfered with other people's code.

# Brendan Eich (13 years ago)

Andreas Rossberg wrote:

On 16 August 2012 23:47, Brendan Eich<brendan at mozilla.org> wrote:

Andreas Rossberg wrote:

On 16 August 2012 00:35, Rick Waldron<waldron.rick at gmail.com> wrote:

It would be far worse to have a different type of value as a return, right?

Actually, no. It is far better to have something that produces failure as immediately and as reliably as possible when used improperly. Sentinel values are a prominent anti-pattern.

Spoken like a true ML'er :-P.

Well, in the pre-ML world I've been brought up in, it was still considered good software engineering practice, independent of language, static or dynamic (and ML has its share of violations, too ;) ).

When "failure" means failure, I agree.

-1 as an out-of-band (for non-negative indexes) return code is actually easier to test

Why is it easier to compare against constants -1, NaN, or "" than to, say, null or undefined? And if so, wouldn't that be something that might be worth correcting? Question mark operator, anyone?

let index = s.find(...)
if (?index) { // equivalent to index !== undefined
  ...
} else {
  // err...
}

We've toyed with that operator, but my point was that you don't need an if of that kind if you can write a combined index test, e.g. in a cheezy split-like example:

js> var s = "a b c "
js> var a = []
js> var i, j = -1
js> while ((i = s.indexOf(' ', j+1)) >= 0) { a.push(s.slice(j+1, i)); j = i; }
5
js> a
["a", "b", "c"]

In this case, undefined as indexOf sentinel value would work too because it converts to NaN -- but null would not work (it converts to 0). Using an "in-band OOB" (for number type) sentinel wins.
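
The coercions behind that observation:

undefined >= 0; // false -- undefined converts to NaN, and every comparison involving NaN is false
null >= 0;      // true  -- null converts to +0, so a null sentinel would slip past the index test
-1 >= 0;        // false -- the in-band sentinel also ends the loop, with no extra type check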

Ultimately, it boils down to careful API design. For some operations there is a useful neutral value to return in edge cases. For others, there isn't.

I hope it's clear from my examples and particular arguments that I agree!

Then the conciseness and clarity of the test matters, and typeof rv != "number" is the wrong tool compared to rv < 0.

I agree that typeof is the wrong tool, but rv !== undefined seems reasonable to me.

See above. Relying on undefined -> NaN could work for numeric contexts, but (the charAt example) preferring undefined to "" is likelier to make "DuckPigundefinedCow" than "DuckPigCow", ceteris paribus :-P.

# Brandon Benvie (13 years ago)

In light of this thread and the recent discussion about some kind of .? operator, I thought it'd be cool to actually make this and try it out. By having a bit of fun with the V8 API's *MarkAsUndetectable* function and building on top of that, I was able to make a node module as an experiment in seeing it in action. Sadly I couldn't reproduce the behavior in Spidermonkey (the API is unimplemented in v8monkey and I couldn't find a way to do it).

I created an object (called nil) that is marked as undetectable (typeof is 'undefined', in null equality class, falsey) that returns itself for all property access (except for toString), returns itself when called as a function, has null as a prototype, and returns the empty string when coerced to string.

I also implemented a rough proxy wrapper that allows you to wrap an object such that in all places it normally would return undefined instead returns nil. I also created a recursive membrane version that wraps all non-primitive return values with the same kind of handler.

Benvie/nil, npmjs.org/package/nil
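
A rough, present-day sketch of the wrapper idea (hypothetical, not the module's actual code; it uses a Proxy instead of V8's MarkAsUndetectable, so the typeof/== null undetectability parts are not reproduced):

// nil swallows property access and calls, and stringifies to ''.
// An arrow function target keeps the proxy callable without any
// non-configurable own properties that could trip proxy invariants.
const nil = new Proxy(() => {}, {
  get(target, key) {
    if (key === Symbol.toPrimitive || key === 'toString') return () => '';
    return nil;                     // nil.anything is nil again
  },
  apply() { return nil; },          // nil(...) is nil
  getPrototypeOf() { return null; } // "has null as a prototype"
});

console.log(String(nil.foo().bar.baz)); // ''
console.log(typeof nil);                // 'function' -- the part that needs engine support to be 'undefined'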

# Andreas Rossberg (13 years ago)

On 17 August 2012 20:54, Brendan Eich <brendan at mozilla.org> wrote:

Andreas Rossberg wrote:

On 16 August 2012 23:47, Brendan Eich<brendan at mozilla.org> wrote:

Andreas Rossberg wrote:

-1 as an out-of-band (for non-negative indexes) return code is actually easier to test

Why is it easier to compare against constants -1, NaN, or "" than to, say, null or undefined? And if so, wouldn't that be something that might be worth correcting? Question mark operator, anyone?

let index = s.find(...)
if (?index) { // equivalent to index !== undefined
  ...
} else {
  // err...
}

We've toyed with that operator, but my point was that you don't need an if of that kind if you can write a combined index test, e.g. in a cheezy split-like example:

js> var s = "a b c "
js> var a = []
js> var i, j = -1
js> while ((i = s.indexOf(' ', j+1)) >= 0) { a.push(s.slice(j+1, i)); j = i; }
5
js> a
["a", "b", "c"]

In this case, undefined as indexOf sentinel value would work too because it converts to NaN -- but null would not work (it converts to 0). Using an "in-band OOB" (for number type) sentinel wins.

Hm, I don't see how this example relies on an in-band sentinel. The loop condition would work just as well with a comparison to undefined. Everything else is regular argument values.

FWIW, the example is clearer when written as

var i, j = 0
while ((i = s.indexOf(' ', j)) >= 0) { a.push(s.slice(j, i)); j = i+1; }

But that's equivalent, and your use of "off-by-one" initialization is not actually related to sentinels.

# Brendan Eich (13 years ago)

Andreas Rossberg wrote:

Hm, I don't see how this example relies on an in-band sentinel. The loop condition would work just as well with a comparison to undefined. Everything else is regular argument values.

This loop could indeed test

while ((i = s.indexOf(' ', j)) !== undefined) ...

That is unsightly, overlong, and contrived, but never mind.

What I was trying to get to -- and utterly failed to get to last time

# Brendan Eich (13 years ago)

Brendan Eich wrote:

Andreas Rossberg wrote:

Hm, I don't see how this example relies on an in-band sentinel. The loop condition would work just as well with a comparison to undefined. Everything else is regular argument values.

This loop could indeed test

while ((i = s.indexOf(' ', j)) !== undefined) ...

That is unsightly, overlong, and contrived, but never mind.

What I was trying to get to -- and utterly failed to get to last time -- was a use of the -1 return value. Here's what I should have written, a "split on single-char string" that does not rely on trailing separator (i.e. termination):

var s = "a bbb cccc ddddd"
var i, j = 0;
var a = [];
do {
  i = s.indexOf(' ', j);
  a.push(s.slice(j, i)); // j lags i and slice(j, -1) returns tail
} while ((j = i + 1) != 0);
print(a);

This prints

a,bbb,cccc,dddd

Sorry, missed my off by one at the end (morning coffee not strong enough!).

Try this:

var s = "a bbb cccc ddddd\n"
var i, j = 0;
var a = [];
do {
  i = s.indexOf(' ', j);
  print(j, i);
} while ((j = i + 1) != 0);
print(a + '!');

The trailing \n in s is supposed to be skipped, so this prints

a,bbb,cccc,ddddd

I will take the charge of "contrived" but still maintain that -1 rather than undefined can be useful (as in used in a further index computation), while undefined either needs a test to special-case (and avoid hard-to-see implicit conversion), or else a conversion to NaN that happens to work (a "good" implicit conversion? Not in your book, I'm sure!).

# Brendan Eich (13 years ago)

Brendan Eich wrote:

I will take the charge of "contrived" but still maintain that -1 rather than undefined can be useful (as in used in a further index computation), while undefined either needs a test to special-case (and avoid hard-to-see implicit conversion), or else a conversion to NaN that happens to work (a "good" implicit conversion? Not in your book, I'm sure!).

Also, the monotyped approach is easier to optimize, all else equal. See qfox.nl/weblog/265 for testimony about this (find "JIT mixed messages").

# Andreas Rossberg (13 years ago)

On 20 August 2012 18:41, Brendan Eich <brendan at mozilla.org> wrote:

This loop could indeed test

while ((i = s.indexOf(' ', j)) !== undefined) ...

That is unsightly, overlong, and contrived, but never mind.

I suppose we have to agree to disagree then. Index = undefined naturally reads as "not present" to me, whereas the purpose of "index >= 0" isn't particularly obvious. So, familiarity bias aside, I actually find the latter far more contrived.

Sorry, missed my off by one at the end (morning coffee not strong enough!).

Try this:

var s = "a bbb cccc ddddd\n"
var i, j = 0;
var a = [];
do {
  i = s.indexOf(' ', j);
  print(j, i);
} while ((j = i + 1) != 0);
print(a + '!');

Hm, something seems to be missing in the correction you pasted.

Anyway, you sort of made my point :). Because in my experience, one problem with "useful" sentinels is that they encourage "clever", i.e., hard to read code, and by extension, buggy code. The gain in conciseness, OTOH, usually is both rare and minor.

Also, the monotyped approach is easier to optimize, all else equal. See qfox.nl/weblog/265 for testimony about this (find "JIT mixed messages").

That's a fair point, although one that shouldn't prematurely affect language design.

Anyway, all this isn't particularly relevant for ES right now, so I'm happy to let it rest.