July 30, 2005
As of today, I’m no longer at Sun. After eight years of working on Java internationalization I feel it’s time to take a break and look at the world from a different angle.
Working here has been a long and interesting ride. When I joined Sun, the Java programming language already based all text handling on Unicode (back then thought of as a 16-bit character encoding), and JDK 1.1 had shipped with new internationalization APIs that promised to make it easier than on any other platform to develop multilingual applications. However, not everything worked as advertised, so it was a time of frantic bug fixing. At the same time, we had to find a solution for the famous RFE 4040458 and provide input method support for lightweight components such as those of the emerging Swing toolkit – not easy because for the 1.1.x line we had to provide a solution without adding any API. The full-fledged input method client API showed up in J2SE 1.2, and the input method engine SPI finally in J2SE 1.3.
This SPI started the process of opening up the Java internationalization architecture
so that third parties can extend the locale support that Sun provides. The
charset SPI followed in J2SE 1.4, and the locale sensitive services SPI and
ResourceBundle.Control class in Java SE 6. As I wrote
earlier, the missing pieces are font rendering and user interface localization.
The set of supported writing systems expanded from European and East Asian
writing systems in JDK 1.1 via Arabic and Hebrew in J2SE 1.3 to Thai and Devanagari
in J2SE 1.4. The
Currency class, which would have been really
useful for the earlier transition to the euro, was added in J2SE 1.4. In J2SE
5.0, we had to implement supplementary
character support because Unicode had really moved beyond the limits of
a 16-bit character encoding. In the same release, multilingual
font rendering enabled multilingual applications on the client side.
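To make the supplementary character point concrete, here is a minimal sketch of why a 16-bit `char` is no longer enough. It uses the code point methods that arrived with J2SE 5.0. The class name `SupplementaryDemo`, its helper methods, and the choice of U+1D538 as the sample character are illustrative assumptions, not anything taken from the JDK sources.

```java
final class SupplementaryDemo {

    // U+1D538 MATHEMATICAL DOUBLE-STRUCK CAPITAL A lies outside the
    // Basic Multilingual Plane, so it needs two 16-bit char values.
    static final int SAMPLE_CODE_POINT = 0x1D538;

    // Builds a string holding exactly one code point (hypothetical helper).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Counts characters as a reader would, not as UTF-16 code units.
    static int countCodePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String text = fromCodePoint(SAMPLE_CODE_POINT);

        // length() still counts 16-bit code units.
        System.out.println("char count:       " + text.length());
        // codePointCount() counts whole characters.
        System.out.println("code point count: " + countCodePoints(text));
        // The first char is only half of the character.
        System.out.println("high surrogate:   "
                + Character.isHighSurrogate(text.charAt(0)));
        // codePointAt() reassembles the surrogate pair.
        System.out.println("code point:       U+"
                + Integer.toHexString(text.codePointAt(0)).toUpperCase());
    }
}
```

Code that walks a string with `charAt` and `length` alone sees this character as two unrelated values, which is why text-processing loops had to be revisited for that release.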
Localization made a big jump forward in J2SE 1.3.1: Before that, different sets of languages were localized for the different platforms, and for some platforms separate localized versions were produced months after the English release. Sometimes we couldn’t ship the localized versions of release n at all because they were finished only after the English version of release n+1 had shipped. I thought that’s nuts, so starting with J2SE 1.3.1 there’s been a single download bundle for each platform that includes all available localizations, which also means they all ship simultaneously.
Outside of Java SE, I worked with the Java EE web tier team to correct problems in the servlet and JSP APIs, which, like so many new technologies, had initially been developed without consideration for internationalization and then retrofitted over several releases. In the Servlet 2.4 and JSP 2.0 specifications we finally made it possible to reliably specify character encodings for all text input and output, as is necessary for multilingual web applications. I also worked with a few other internationalization experts at Sun on generic internationalization requirements.
What still hurts after all these years are mistakes that we can’t correct
because of compatibility risks or where the fix would mean complete replacement.
Counting months from 0 was a silly idea; merging two different representations
of a point in time and the algorithms to map between them into a single
class was a serious design mistake. The use of apostrophes to quote uninterpreted sections
of a MessageFormat pattern string still complicates localization
processes, especially for French and Italian. And using ISO 8859-1 as the only
supported encoding of properties files and as the default encoding in servlets
and JSP pages needlessly complicates application development – UTF-8 would
have been the right choice. At least these specifications have the excuse that
they were written when UTF-8 was not widely supported yet. There’s no such
excuse for Sun’s new bug tracking
system, released in 2004.
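Three of these mistakes are easy to reproduce in a few lines. The following is a rough sketch, not an excerpt from any real application: the class `I18nPitfalls`, its method names, and the `greeting` property key are all made up for illustration. It shows the zero-based month in `GregorianCalendar`, a French apostrophe swallowing a `MessageFormat` argument, and a UTF-8 properties file being misread by `Properties.load(InputStream)`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.text.MessageFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Properties;

final class I18nPitfalls {

    // Months count from 0, so a caller who passes 7 gets August.
    static int monthField(int year, int month, int day) {
        Calendar calendar = new GregorianCalendar(year, month, day);
        return calendar.get(Calendar.MONTH);
    }

    // A single apostrophe opens a quoted section in the pattern;
    // everything after it, including {0}, is taken literally.
    static String format(String pattern, Object... arguments) {
        return MessageFormat.format(pattern, arguments);
    }

    // load(InputStream) always decodes the bytes as ISO 8859-1,
    // whatever encoding the file was actually saved in.
    static String loadGreeting(byte[] fileBytes) {
        Properties properties = new Properties();
        try {
            properties.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties.getProperty("greeting");
    }

    public static void main(String[] args) {
        // 7 looks like July but is Calendar.AUGUST.
        System.out.println(monthField(2005, 7, 30) == Calendar.AUGUST); // true

        // Natural French spelling: the argument is never substituted.
        System.out.println(format("L'utilisateur {0}", "Anne"));  // Lutilisateur {0}
        // The translator has to double the apostrophe instead.
        System.out.println(format("L''utilisateur {0}", "Anne")); // L'utilisateur Anne

        // A properties file saved as UTF-8 comes back garbled.
        byte[] utf8File = "greeting=caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println("caf\u00e9".equals(loadGreeting(utf8File))); // false
    }
}
```

The usual workarounds are to write `Calendar.JULY` rather than a number, to teach translators to double every apostrophe, and to escape non-Latin-1 characters in properties files as `\uXXXX`. None of these removes the underlying trap.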
Over the years, my team has had ten engineers and seven managers. The number of managers is comparatively high because they don’t last as long as engineers and because for a while HR rules required us to have separate managers for our sites in the US and in Japan. Two engineers also served as managers, and two engineers moved from Japan to the US. John joined the team twice and recently rejoined Sun as a technical writer. Masayoshi started before me and stays after me.
But Java internationalization has always been a collaborative effort: The
original internationalization APIs in JDK 1.1 were
largely designed by the internationalization team at Taligent, and that team
(now part of IBM) has contributed code to every major release since then; the
input method framework was designed in collaboration with Omron, Justsystem,
and Apple before there was a Java Community Process; the support for supplementary
characters was designed by the JSR 204 expert group; the charset API was part
of JSR 51 and implemented by the Libraries team; many other Java SE and Java
EE engineering teams do the necessary work on their components; the quality
assurance teams help find the bugs; and Sun’s central Globalization team helped
with early implementation efforts. Geographically, we’ve had contributors in
Silicon Valley, San Francisco, Tokyo, Yokohama, Kyoto, Tokushima, Beijing,
Bangkok, Bangalore, Jerusalem, Cairo, Prague, Dublin, New York City, Ventura,
and some whose physical location I never found out. Thank you to all of you.