LocaleInfo implementation details and best match algorithm
Richard, see my comments inline.
I finally had a chance to look at your proposal, and I'm having some trouble following it. It sort of winds up sounding like there are four different values-- localeID, regionID, options.localeID, and options.regionID, and they all mean different things. It seems like this might make sense if the "options" stuff were output parameters, but the text makes it all sound like they're input parameters. This seems extremely confusing.
I agree. The input parameter to the LocaleInfo constructor is called options, and LocaleInfo also contains an options key that holds the resolved input values (like the actual localeID and the canonicalized regionID). I will rename the input parameter to settings in the strawman to make the distinction clear.
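To make the input/output distinction concrete, here is a minimal sketch of that shape. The helper canonicalizeRegion, the "root" placeholder, and the constructor body are assumptions for illustration only; they are not taken from the strawman, and locale resolution itself is left out.

```javascript
// Hypothetical helper: upper-case a two-letter region code, otherwise undefined.
function canonicalizeRegion(regionID) {
  return /^[A-Za-z]{2}$/.test(regionID || "") ? regionID.toUpperCase() : undefined;
}

// Sketch only: `settings` is what the caller passes in,
// `options` holds the resolved values on the constructed object.
function LocaleInfo(settings) {
  settings = settings || {};
  this.options = {
    // Placeholder: a real implementation would run best-match resolution here.
    localeID: settings.localeID || "root",
    regionID: canonicalizeRegion(settings.regionID)
  };
}

var info = new LocaleInfo({ localeID: "fr", regionID: "be" });
// info.options.regionID is now the canonical form "BE".
```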
The other thing I find interesting is that the caller can supply an ARRAY of language tags-- a language preference list, essentially. I work in an environment where there's always one locale, so I might be biased, but I'm wondering whether we really need all this functionality now, or if we can have everything take one language tag now and work up to taking multiple tags in a future version.
I think once you manage to match one localeID from the input against the locales available on the system, extending the problem to matching a priority list is fairly simple. I feel a single localeID is going to be the prevalent way of specifying which locale you want, but I think the vision here was to enable something like Accept-Language in HTTP, where the user can have a say in how fallback should work. I'll let Shawn comment more here.
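A rough sketch of why the list case is a small step from the single-tag case. The matcher truncates subtags from the right, loosely in the spirit of RFC 4647 lookup but much simplified; the AVAILABLE list and both function names are made up for this example, and tags are assumed to be in canonical case already.

```javascript
// Assumed set of locales the platform has data for.
var AVAILABLE = ["en-US", "fr", "fr-CA", "pt-PT", "es"];

// Match a single tag: drop trailing subtags until something is available.
function matchOne(localeID, available) {
  var candidate = localeID;
  while (candidate) {
    if (available.indexOf(candidate) !== -1) return candidate;
    var cut = candidate.lastIndexOf("-");
    candidate = cut === -1 ? "" : candidate.slice(0, cut);
  }
  return undefined;
}

// Match a priority list: the first entry that yields any match wins.
function matchList(localeIDs, available) {
  var list = Array.isArray(localeIDs) ? localeIDs : [localeIDs];
  for (var i = 0; i < list.length; i++) {
    var hit = matchOne(list[i], available);
    if (hit) return hit;
  }
  return undefined; // caller falls back to its default locale
}
```

With this list, matchList(["de-AT", "fr-BE"], AVAILABLE) finds nothing for de-AT or de and then settles on "fr" for the second entry.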
I've also never been a fan of specifying the country separately from the country in the language. I understand "de-AT" just means "The Austrian dialect of German," not "Austrian German, and the country is Austria," but I'm wondering again if we need the extra flexibility of specifying a region code for the language and a SEPARATE country code just for things like currency formatting, especially with a bunch of extra machinery for guessing the country code if the user doesn't supply one. This seems like way more machinery than the vast majority of users will ever need, and I'm wondering if we can blow it off or defer it to some future iteration.
There was a long discussion about this item and we decided to go with the stricter definition (specifying both localeID and regionID). Users are of course free to use only localeID, but some results may be off if the localeID doesn't match the user's expected region (e.g. currency for now, maybe more in the future). Without regionID there is no way to specify that the preferred region for an en-CA user may be the UK...
I agree that its current usefulness is pretty low, since it only determines currency, but I can't predict all future uses.
For that matter, if the only thing that's driven by the country code is the currency unit, why not just specify the currency unit? What other stuff do we expect to be driven off the country code in the future?
I was hoping to move currencyCode/Symbol to the NumberFormat constructor.
The region code could influence measurement units, default paper size or something else in the future.
Region info could also help with things outside of the API, say the default search engine domain (is it google.com or google.rs...) or bookstore (Amazon for western people, something else for others).
For whatever it's worth...
Please, keep it coming :).
I’ve started implementing the LocaleInfo class in Chrome and I would like to clarify what the actual parameters are and how we construct the object given those parameters.
Differences from the current proposal are (for the sake of simplification, but without any loss in clarity or functionality):
The LocaleInfo constructor takes an options parameter with two fields as input.
specification with addition of invalid, undefined and reserved region codes).
I propose the algorithm at pastebin.mozilla.org/1198734 for resolving the best-match locale ID and region ID from the inputs. We should discuss whether the actual distance computation for best match should be left to implementers or whether we should standardize it (the data it relies on may be different).
As the product of the algorithm, the LocaleInfo object will have:
Examples - actual values may vary among implementations because of data variation. Implementers can also decide to pick a different most likely locale (say en-GB instead of en-US) based on their target region...

localeID                | regionID | options.localeID                                            | options.regionID
-                       | -        | default                                                     | default's region
fr                      | -        | fr                                                          | FR
fr-CA                   | -        | fr-CA                                                       | CA
fr                      | BE       | fr-BE                                                       | BE
fr                      | RU       | fr (fr-RU is not available)                                 | RU
['es', 'es-419']        | -        | es                                                          | ES
['pt', 'pt-BR']         | PT       | pt-PT                                                       | PT
sr                      | ZZ       | sr (didn't match sr-ZZ, best match sr)                      | ZZ
de-Latn-DE-u-co-phonebk | AT       | de-DE-u-co-phonebk (best match de-DE, and added extension)  | AT
sr-MN                   | BA       | sr-RS (didn't match sr-MN, best match was sr-RS)            | BA
** - Implementations are free to pick the best default value for their platform. One possible default could be the root locale.
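For a few of the simpler rows above (fr with no region, fr-CA, fr with BE, fr with RU, sr with ZZ), the resolution could be sketched roughly as follows. This is not the pastebin algorithm: it does no distance computation, ignores priority lists and -u- extensions, and AVAILABLE, LIKELY_REGION, regionOf and resolve are all hypothetical names with made-up data.

```javascript
// Assumed platform data, for illustration only.
var AVAILABLE = ["fr", "fr-BE", "fr-CA", "sr"];
var LIKELY_REGION = { fr: "FR", sr: "RS" };

// Extract an upper-case two-letter region subtag from a locale ID, if present.
function regionOf(localeID) {
  var m = /-([A-Z]{2})(?:-|$)/.exec(localeID);
  return m ? m[1] : undefined;
}

function resolve(settings, available) {
  var localeID = settings.localeID;
  var regionID = settings.regionID;
  var language = localeID.split("-")[0];

  // Candidates in priority order: language + requested region (only when the
  // locale ID carries no region of its own), the locale ID as given, then the
  // bare language.
  var candidates = [];
  if (regionID && !regionOf(localeID)) candidates.push(language + "-" + regionID);
  candidates.push(localeID, language);

  var resolved;
  for (var i = 0; i < candidates.length; i++) {
    if (available.indexOf(candidates[i]) !== -1) { resolved = candidates[i]; break; }
  }

  return {
    localeID: resolved, // undefined means "use the platform default"
    // Explicit region wins, then the resolved locale's region, then a guess.
    regionID: regionID || (resolved && regionOf(resolved)) || LIKELY_REGION[language]
  };
}
```

Under these assumptions, resolve({ localeID: "fr", regionID: "BE" }, AVAILABLE) yields fr-BE with region BE, while the same call with regionID "RU" stays on fr and keeps RU as the region.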