Norbert’s Corner


July 30, 2005

As of today, I’m no longer at Sun. After eight years of working on Java internationalization I feel it’s time to take a break and look at the world from a different angle.

Working here has been a long and interesting ride. When I joined Sun, the Java programming language already based all text handling on Unicode (back then thought of as a 16-bit character encoding), and JDK 1.1 had shipped with new internationalization APIs that promised to make it easier than on any other platform to develop multilingual applications. However, not everything worked as advertised, so it was a time of frantic bug fixing. At the same time, we had to find a solution for the famous RFE 4040458 and provide input method support for lightweight components such as those of the emerging Swing toolkit – not easy because for the 1.1.x line we had to provide a solution without adding any API. The full-fledged input method client API showed up in J2SE 1.2, and the input method engine SPI finally in J2SE 1.3.

This SPI started the process of opening up the Java internationalization architecture so that third parties can extend the locale support that Sun provides. The charset SPI followed in J2SE 1.4, and the locale sensitive services SPI and the ResourceBundle.Control class in Java SE 6. As I wrote earlier, the missing pieces are font rendering and user interface localization.

The set of supported writing systems expanded from European and East Asian writing systems in JDK 1.1 via Arabic and Hebrew in J2SE 1.3 to Thai and Devanagari in J2SE 1.4. The Currency class, which would have been really useful for the earlier transition to the euro, was added in J2SE 1.4. In J2SE 5.0, we had to implement supplementary character support because Unicode had really moved beyond the limits of a 16-bit character encoding. In the same release, multilingual font rendering enabled multilingual applications on the client side.
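To make the supplementary character point concrete, here is a minimal sketch using the code point methods that J2SE 5.0 added to String and Character. The class and helper names are made up for illustration; only the JDK calls themselves are real API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name is not from any Sun code.
class SupplementaryCharacters {

    // Counts Unicode code points rather than 16-bit char units.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Walks a string one code point at a time, stepping over surrogate pairs.
    static List<Integer> codePoints(String s) {
        List<Integer> result = new ArrayList<Integer>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result.add(cp);
            i += Character.charCount(cp); // 1 for BMP characters, 2 for supplementary ones
        }
        return result;
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range,
        // so it is stored as a surrogate pair of two char values.
        String clef = new String(Character.toChars(0x1D11E));
        String text = "A" + clef;

        System.out.println(text.length());         // 3 char units
        System.out.println(codePointLength(text)); // 2 code points
    }
}
```

Code that indexes strings by char still works for the Basic Multilingual Plane, but anything that counts or iterates over user-visible characters needs the code point variants.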

Localization made a big jump forward in J2SE 1.3.1: Before that, different sets of languages were localized for the different platforms, and for some platforms separate localized versions were produced months after the English release. Sometimes we couldn’t ship the localized versions of release n at all because they were finished only after the English version of release n+1 had shipped. I thought that was nuts, so starting with J2SE 1.3.1 there’s been a single download bundle for each platform that includes all available localizations, which also means they all ship simultaneously.

Outside of Java SE, I worked with the Java EE web tier team to correct problems in the servlet and JSP APIs, which, like so many new technologies, had initially been developed without consideration for internationalization and then retrofitted over several releases. In the Servlet 2.4 and JSP 2.0 specifications we finally made it possible to reliably specify character encodings for all text input and output, as is necessary for multilingual web applications. I also worked with a few other internationalization experts at Sun on generic internationalization requirements.

What still hurts after all these years are mistakes that we can’t correct because of compatibility risks or where the fix would mean complete replacement. Counting months from 0 was a silly idea; merging two different representations of a point in time and the algorithms to map between them into a single Calendar class was a serious design mistake. The use of apostrophes to quote uninterpreted sections of a MessageFormat pattern string still complicates localization processes, especially for French and Italian. And using ISO 8859-1 as the only supported encoding of properties files and as the default encoding in servlets and JSP pages needlessly complicates application development – UTF-8 would have been the right choice. At least these specifications have the excuse that they were written when UTF-8 was not yet widely supported. There’s no such excuse for Sun’s new bug tracking system, released in 2004.
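Two of these pitfalls are easy to show in a few lines. The following is only a sketch: the class name, the helper methods, and the French sample message are invented for illustration, while Calendar, GregorianCalendar, and MessageFormat behave as documented in the JDK.

```java
import java.text.MessageFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical demo class; the name and helpers are illustrative only.
class LegacyPitfalls {

    // Converts Calendar's zero-based month to the number people expect (1 = January).
    static int humanMonth(Calendar cal) {
        return cal.get(Calendar.MONTH) + 1;
    }

    // Formats a MessageFormat pattern with the given arguments.
    static String format(String pattern, Object... args) {
        return new MessageFormat(pattern, Locale.FRENCH).format(args);
    }

    public static void main(String[] args) {
        // Months count from 0, so July is 6; days of the month count from 1.
        Calendar cal = new GregorianCalendar(2005, Calendar.JULY, 30);
        System.out.println(cal.get(Calendar.MONTH)); // 6
        System.out.println(humanMonth(cal));         // 7

        // A single apostrophe opens a quoted section, so the apostrophe
        // disappears and {0} is emitted literally instead of being replaced.
        System.out.println(format("L'utilisateur {0} est absent", "Anne"));
        // Lutilisateur {0} est absent

        // Doubling the apostrophe yields a literal one and keeps {0} live.
        System.out.println(format("L''utilisateur {0} est absent", "Anne"));
        // L'utilisateur Anne est absent
    }
}
```

The second case is the one that bites translators: a French or Italian string that looks perfectly natural silently disables every placeholder after the first apostrophe.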

Over the years, my team has had ten engineers and seven managers. The number of managers is comparatively high because they don’t last as long as engineers and because for a while HR rules required us to have separate managers for our sites in the US and in Japan. Two engineers also served as managers, and two engineers moved from Japan to the US. John joined the team twice and recently rejoined Sun as a technical writer. Masayoshi started before me and stays after me.

But Java internationalization has always been a collaborative effort: The original internationalization APIs in java.text and java.util were largely designed by the internationalization team at Taligent, and that team (now part of IBM) has contributed code to every major release since then; the input method framework was designed in collaboration with Omron, Justsystem, and Apple before there was a Java Community Process; the support for supplementary characters was designed by the JSR 204 expert group; the charset API was part of JSR 51 and implemented by the Libraries team; many other Java SE and Java EE engineering teams do the necessary work on their components; the quality assurance teams help find the bugs; and Sun’s central Globalization team helped with early implementation efforts. Geographically, we’ve had contributors in Silicon Valley, San Francisco, Tokyo, Yokohama, Kyoto, Tokushima, Beijing, Bangkok, Bangalore, Jerusalem, Cairo, Prague, Dublin, New York City, Ventura, and some whose physical location I never found out. Thank you to all of you.