Internationalization API: spec comments.

# Eric Albright (13 years ago)

Comments on behalf of Suresh and me:

Section 6.2.4 DefaultLocale

o We would like to be able to return a priority list here and not just a single item. While many implementations may have just a single item, those that are built on top of Windows 8 will have available a list of languages that the user has declared to understand instead of just a default user locale that we have had up until now. Since this is internal, I believe changing the name to DefaultLocales and returning an array will allow implementations to do either:

The DefaultLocales abstract operation returns an array of string values representing a priority list of the structurally valid (6.2.2) and canonicalized (6.2.3) BCP 47 language tags for the host environment in descending order of priority. This may return a single item representing the current locale.
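One way to picture the proposed operation, as a minimal sketch only: `hostLanguages` is a hypothetical stand-in for whatever list the host environment reports, and `Intl.getCanonicalLocales` comes from later editions of the API rather than from the draft under discussion.

```javascript
// Sketch of a DefaultLocales-style operation (not spec text).
// `hostLanguages` is a hypothetical input standing in for the host's language list.
function defaultLocales(hostLanguages) {
  // Checks structural validity, canonicalizes each tag and drops duplicates,
  // keeping the original priority order. Throws RangeError on an invalid tag.
  const canonical = Intl.getCanonicalLocales(hostLanguages);
  if (canonical.length > 0) {
    return canonical;
  }
  // Single-item case: fall back to the implementation's current default locale.
  return [new Intl.NumberFormat().resolvedOptions().locale];
}

console.log(defaultLocales(["EN-us", "de-de", "en-US"])); // [ 'en-US', 'de-DE' ]
```

An implementation with only one locale to report would take the second branch and still return an array.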

Section 12.1.1 InitializeNumberFormat

o Step 10: When initializing NumberFormat, should the internal property [[locale]] point to the [[dataLocale]] or the [[locale]] returned by the ResolveLocale abstract operation? For example, ResolveLocale returns the following object for the supplied language tag ta-IN-u-nu-tamldec:

{ _dataLocale : "en-US", _nu : "tamldec", _locale : "en-US-u-nu-tamldec" }

Should the [[locale]] of NumberFormat point to _dataLocale or _locale ? According to the spec it is _locale. I think it should point to _dataLocale.

o Step 31: The default value for UseGrouping is set to true. Shouldn't it be an implementation-specific default?

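Section 13.3.2 Intl.DateTimeFormat.prototype.format

o Step 5 & 6: NumberFormat is constructed without any options. Since this will construct the NF object with UseGrouping ON by default, the formatted dates will have grouping shown for years, which looks odd. For example, we would see years like 2,001 instead of 2001. This should specify UseGrouping OFF.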

# Norbert Lindenberg (13 years ago)

Eric and Suresh,

Thanks for the comments! Some replies below, but obviously we should discuss in more detail on Monday.

Norbert

On May 18, 2012, at 17:00 , Eric Albright wrote:

Comments on behalf of Suresh and me:

Section 6.2.4 DefaultLocale

o We would like to be able to return a priority list here and not just a single item. While many implementations may have just a single item, those that are built on top of Windows 8 will have available a list of languages that the user has declared to understand instead of just a default user locale that we have had up until now. Since this is internal, I believe changing the name to DefaultLocales and returning an array will allow implementations to do either:

The DefaultLocales abstract operation returns an array of string values representing a priority list of the structurally valid (6.2.2) and canonicalized (6.2.3) BCP 47 language tags for the host environment in descending order of priority. This may return a single item representing the current locale.

To me the default locale of an implementation and the user's preferred languages are two different things. The default locale must be supported and a single locale so that we can fall back to it when none of the requested locales are supported. The user's preferred languages can be several, and they may include languages that an implementation doesn't support.
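The fallback role described here can be sketched roughly as follows. The function name `pickLocale` and its parameters are illustrative assumptions; `supportedLocalesOf` is the method from the API as later standardized.

```javascript
// Sketch: use the first supported entry of the user's preferred languages,
// otherwise fall back to the single default locale, which must be supported.
// `pickLocale`, `preferred` and `defaultLocale` are hypothetical names.
function pickLocale(preferred, defaultLocale) {
  // Returns the subset of `preferred` that the implementation supports,
  // in the same priority order.
  const supported = Intl.NumberFormat.supportedLocalesOf(preferred);
  return supported.length > 0 ? supported[0] : defaultLocale;
}

console.log(pickLocale([], "en-US")); // en-US (nothing requested, so the default is used)
```

The preferred list may be empty or contain only unsupported languages, which is why the default has to be a single locale that is known to work.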

Also, while on the client side information about the user's preferred languages may be obtained from the operating system (or from browser preferences), that doesn't work on the server side. Instead, there they come as part of a request (query parameters, subdomain names, or Accept-Language) or are obtained from databases or web services.
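On the server side, a priority list typically has to be derived from the request itself. The following is a simplified sketch of turning an Accept-Language header value into such a list; `parseAcceptLanguage` is a hypothetical helper, and it ignores edge cases that a production parser would need to handle.

```javascript
// Sketch: derive a language priority list from an Accept-Language header value.
// Hypothetical helper, not part of any specification discussed in this thread.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(";");
      // Look for a quality value such as "q=0.8"; the default weight is 1.
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
      const q = qParam ? Number(qParam.slice(2)) : 1;
      return { tag: tag.trim(), q: Number.isNaN(q) ? 0 : q, index };
    })
    // Drop empty entries, the wildcard, and anything explicitly weighted 0.
    .filter((entry) => entry.tag !== "" && entry.tag !== "*" && entry.q > 0)
    // Highest weight first; ties keep their order of appearance.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map((entry) => entry.tag);
}

console.log(parseAcceptLanguage("de-CH, fr;q=0.8, en;q=0.9")); // [ 'de-CH', 'en', 'fr' ]
```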

Section 12.1.1 InitializeNumberFormat

o Step 10: When initializing NumberFormat, should the internal property [[locale]] point to the [[dataLocale]] or the [[locale]] returned by the ResolveLocale abstract operation? For example, ResolveLocale returns the following object for the supplied language tag ta-IN-u-nu-tamldec:

{ _dataLocale : "en-US", _nu : "tamldec", _locale : "en-US-u-nu-tamldec" }

Should the [[locale]] of NumberFormat point to _dataLocale or _locale ? According to the spec it is _locale. I think it should point to _dataLocale.

Do you mean the current spec doesn't reflect what was discussed before, or that we made the wrong choice?

At this point, [[locale]] is specifically constructed to give the application information about the parts of its requested language tag that are supported and are going to be used. This was first proposed by Rich Gillam in the 2011-07-26 meeting, and documented in section 6.1.2 of my first spec draft on 2011-08-10 (norbertlindenberg.com/ecmascript/internationalization-api.html). It then got discussed in the meeting on 2011-08-17, where the minutes have the hint "u-co-kn=no => should show up in ROpt and RLT" and a colorful table just above:

docs.google.com/document/pub?id=1-NytPBbsO7dLvt0C2psJkF1Wtt3QdKF3NQAHksF5fyc
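As a rough illustration of that intent, here is how an application can inspect the outcome through `resolvedOptions()` in the API as later standardized in ECMA-402. The values actually returned depend on the locale data the implementation ships, so no output is shown.

```javascript
// Sketch: ask for Tamil with Tamil decimal digits and inspect what was resolved.
const requested = "ta-IN-u-nu-tamldec";
const resolved = new Intl.NumberFormat(requested).resolvedOptions();

// resolved.locale is the language tag reflecting the supported parts of the request.
// resolved.numberingSystem is the same information broken out as an option,
// so the application does not have to parse the tag to check it.
const numberingSystemHonoured = resolved.numberingSystem === "tamldec";

console.log(resolved.locale, resolved.numberingSystem, numberingSystemHonoured);
```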

o Step 31: The default value for UseGrouping is set to true. Shouldn’t it be an implementation-specific default?

Not according to what I proposed and what was accepted on 2011-08-17: norbertlindenberg.com/ecmascript/internationalization-formats.html

Is there any reason why it should be?

Section 13.3.2 Intl.DateTimeFormat.prototype.format

o Step 5 & 6: NumberFormat is constructed without any options. Since this will construct the NF object with UseGrouping ON by default, the formatted dates will have grouping shown for years, which looks odd. For example, we would see years like 2,001 instead of 2001. This should specify UseGrouping OFF.

That's indeed a bug. Thanks for finding it!
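The effect is easy to reproduce with the API as later standardized. This sketch assumes an implementation with en-US locale data; the option name `useGrouping` is the one from the final API.

```javascript
// Sketch: why a NumberFormat created without options is unsuitable for years.
const grouped = new Intl.NumberFormat("en-US").format(2001);
const ungrouped = new Intl.NumberFormat("en-US", { useGrouping: false }).format(2001);

console.log(grouped);   // 2,001
console.log(ungrouped); // 2001

// A conforming DateTimeFormat therefore formats the year without grouping.
const year = new Intl.DateTimeFormat("en-US", { year: "numeric", timeZone: "UTC" })
  .format(new Date(Date.UTC(2001, 5, 15)));
console.log(year); // 2001
```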

# Eric Albright (13 years ago)

Perhaps we need to differentiate between the ultimate fallback use of DefaultLocale (which I agree with you should be a single entry) and the creation of a LocaleList with no parameters (which currently calls DefaultLocale, but where I believe implementations should have the option of providing an actual list).

On the question of resolved locale, I guess it is both: I thought it didn't reflect what we discussed (though it seems my memory was fuzzy here). I believe our intent was that people would not have to parse the language tag, which is why we specifically break out the options. If I want to know if my request was successful, I have to parse the language tag. Maybe we should expose dataLocale publicly, or maybe it isn't a big deal for v1?

On the grouping rule, I was thinking that we had some languages that did not group by default. I checked over our data again and we don't have any, and there isn't a way for us to do that (except to remove the grouping rule altogether, which will make it not group no matter what the grouping setting is), so we're good there. Leave it as is.

# Phillips, Addison (13 years ago)

A reply to Norbert's reply and a fresh comment.

The fresh comment first:

It would be useful to add addLikelySubtags/removeLikelySubtags from UTS #35 C.10 (www.unicode.org/reports/tr35/#Likely_Subtags), which is helpful when preparing locale lists for use with legacy language tags (e.g. zh-CN vs. zh-Hans-CN).
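For reference, the two operations behave as sketched below. The sketch uses `Intl.Locale`, which was standardized well after this discussion and is not part of the draft; its `maximize()` and `minimize()` methods correspond to the UTS #35 add and remove likely subtags algorithms.

```javascript
// Sketch: likely-subtag expansion and removal via Intl.Locale.
const legacy = new Intl.Locale("zh-CN");

// Add likely subtags: the script subtag is filled in.
const expanded = legacy.maximize().toString();
console.log(expanded); // zh-Hans-CN

// Remove likely subtags: everything that can be inferred is dropped.
const reduced = new Intl.Locale("zh-Hans-CN").minimize().toString();
console.log(reduced); // zh
```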

Section 6.2.4 DefaultLocale

o We would like to be able to return a priority list here and not just a single item. While many implementations may have just a single item, those that are built on top of Windows 8 will have available a list of languages that the user has declared to understand instead of just a default user locale that we have had up until now. Since this is internal, I believe changing the name to DefaultLocales and returning an array will allow implementations to do either:

The DefaultLocales abstract operation returns an array of string values representing a priority list of the structurally valid (6.2.2) and canonicalized (6.2.3) BCP 47 language tags for the host environment in descending order of priority. This may return a single item representing the current locale.

To me the default locale of an implementation and the user's preferred languages are two different things. The default locale must be supported and a single locale so that we can fall back to it when none of the requested locales are supported. The user's preferred languages can be several, and they may include languages that an implementation doesn't support.

Often the scripter needs to know what the user-agent is requesting in Accept-Language (et al). So I can see providing DefaultLocales as requested, as long as the actual host system default can be identified.

Addison

Addison Phillips Globalization Architect (Lab126) Chair (W3C I18N WG)

Internationalization is not a feature. It is an architecture.

# Norbert Lindenberg (13 years ago)

I fully agree that an API to obtain the locale list that a browser would send to the server directly from the browser would be very useful. But since that API would be browser specific, I don't think it belongs in an ECMAScript specification. Something like window.navigator.acceptLanguage might be the right API...

Are you proposing addLikelySubtags/removeLikelySubtags as new API, or as part of canonicalization?

Norbert

# Phillips, Addison (13 years ago)

I fully agree that an API to obtain the locale list that a browser would send to the server directly from the browser would be very useful. But since that API would be browser specific, I don't think it belongs in an ECMAScript specification. Something like window.navigator.acceptLanguage might be the right API...

Why would it be browser specific? Language priority lists are well-understood and the other APIs (such as the matching API) are in terms of a LocaleList. It would be permissible for an implementation to return a list with exactly one item in it, of course.

I guess the question here is "what does 'default locale' mean in the ES context?" My first reaction was similar to what you have in the spec: it's a well-defined, specific locale. But thinking about it made it seem less clear to me, especially once I started to think about the interplay between client-side and server-side processing, noting again that implementations can return a list-of-one-item. It is a good thing for Accept-Language and DefaultLocale to be somewhat holistic. Of course a browser might not return the same thing as Accept-Language as the default locale: it would not have to be required. In Eric's case, for example, it might be the local system language priority list.

Are you proposing addLikelySubtags/removeLikelySubtags as new API, or as part of canonicalization?

Putting it into canonicalization might be too strong. In our implementation, the use of these methods is actually hidden inside the matcher (so that "zh-Hans-CN" always finds "zh-CN"---or vice versa---depending on which is available), but the tags that you put in are the tags that come out.

For example, if I do:

LookupMatcher m = new LookupMatcher("zh-CN,de,fr"); 
value = m.match("zh-Hans-CN"); // value is "zh-CN" because that's what's in the list
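The same behaviour can be approximated in ECMAScript as a rough sketch. `lookupMatch` is a hypothetical function written for illustration, not the LookupMatcher implementation mentioned above, and it relies on `Intl.Locale` from later editions of the API.

```javascript
// Sketch: compare tags by their likely-subtag expansion, but return the tag
// exactly as it appears in the list of available tags.
function lookupMatch(available, requested) {
  const expand = (tag) => new Intl.Locale(tag).maximize().toString();
  const wanted = expand(requested);
  // Returns undefined when nothing in the list matches.
  return available.find((tag) => expand(tag) === wanted);
}

console.log(lookupMatch(["zh-CN", "de", "fr"], "zh-Hans-CN")); // zh-CN
console.log(lookupMatch(["zh-Hans-CN", "de"], "zh-CN"));       // zh-Hans-CN
```

This only covers exact matches after expansion; it does not implement the truncation fallback of a full lookup algorithm.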

Addison

# Norbert Lindenberg (13 years ago)

How would Node.js determine the language priority list to return (a) while processing an HTTP request, (b) while not? Note that so far ECMAScript knows nothing about HTTP...

For likely subtags, there's a note in 10.1 trying to address this situation.

Norbert

# Phillips, Addison (13 years ago)

How would Node.js determine the language priority list to return (a) while processing an HTTP request, (b) while not? Note that so far ECMAScript knows nothing about HTTP...

ES shouldn't care about HTTP. I'm just saying that implementations (which know something about both) might wish to harmonize the values they emit for each (it would make sense). Implementations still have to decide if they should follow the system configuration (usual) or local user preferences (more rarely).

Although I have generally favored single tags over lists in the past, I'm not sure that a single tag makes sense here, simply because many of the APIs that might use the default take a LocaleList.

Addison