Regex has not been part of the scope of the Globalization API work. I wanted
to find out whether any improvements from an internationalization point of
view are being planned separately.
Some of the problems include:
- Regexes fail on supplementary characters (above U+FFFF). Most of these
are rather low frequency, but they include a large number of Chinese
characters, some used in people's names or place names.
- This also affects the result of validation in HTML5, such as in
http://dev.w3.org/html5/spec/Overview.html#the-pattern-attribute
- The Unicode support is otherwise extremely limited, especially for
properties. See http://98.245.80.27/tcpc/OSCON2011/gbu.html for a
comparison to other programming languages. The downside of this is that it
promotes hard-coded lists because people "think" they know what characters
occur in words, etc., but get it wrong.
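
To make the two problems above concrete, here is a small sketch of how they show up in practice. The sample strings and variable names are my own picks for illustration (U+20BB7 is a supplementary Han character that, as far as I know, appears in some Japanese names); the patterns use only plain regex syntax with no flags.

```javascript
// U+20BB7 is above U+FFFF, so a JavaScript string stores it as
// two UTF-16 code units (a surrogate pair).
const supplementary = "\uD842\uDFB7";

// "." matches a single code unit, so a "one character" pattern rejects it.
const matchesOneChar = /^.$/.test(supplementary);

// A character class that lists it is really a class of two lone
// surrogates, so it accepts half of the character on its own.
const matchesHalf = /^[\uD842\uDFB7]$/.test("\uD842");

// A hard-coded "letters" list misses letters outside ASCII,
// here e with diaeresis (U+00EB) in a short name.
const asciiWord = /^[A-Za-z]+$/;
const rejectsName = !asciiWord.test("Zo\u00EB");

console.log(supplementary.length, matchesOneChar, matchesHalf, rejectsName);
// prints: 2 false true true
```

The same patterns would behave this way wherever the regex is evaluated against UTF-16 code units, which is why the HTML5 pattern attribute is affected as well.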