Chapter 12. Conquer the world with l10n and i18n
In this chapter:
- Supporting multiple languages with your web application
- Is that the second of May or the fifth of February?
As part of the previous chapter, you learned how to restrict access and optionally render or hide components depending on the user’s session. In this chapter, we’ll look at how you can vary what is displayed to users depending on their locale. A locale represents a geographical, political, and/or cultural region. In computing, it usually groups a set of parameters that represent the user’s language and country. In Java, this is supported through the Locale object.
Localization refers to the adaptation of your application for one or more specific locales. A related term is internationalization, which encompasses all techniques that enable applications to be localized: being able to conveniently maintain different languages, handling different date and number formats, using the proper encoding type, and so forth. For the sake of simplicity, we’ll talk only about localization in this chapter, even if it sometimes would be more precise to talk about internationalization.
Note
Localizing components and applications can involve a large range of items. Typically, the most important are as follows:
- Alphabets and scripts—The ASCII character set is fine when you work with the English language; but if you need to work with Chinese, Russian, or Thai, ASCII won’t cut it. Unicode is a widely supported encoding scheme that enables you to deal with a large variety of alphabets and scripts. Java standardized support for it, and we take full advantage of these built-in capabilities.
- Formats—Different locales typically use different formats for dates/times and numbers. For instance, the first of February is written 2/1 in the US, but in The Netherlands it’s written 1-2.
In addition, you need to consider a number of issues from locale to locale, such as the patterns of bank-account numbers; government-assigned numbers like Social Security numbers and postal codes; and things like calendars (Gregorian or Buddhist), weights, measures, currencies, and so on. And we haven’t even scratched the surface, when you consider cultural differences in the meaning of colors and numbers and other locally sensitive considerations and customs.
Wicket’s support for localization can be summed up in the following points:
- Locale-aware support for conversions from things like numbers and dates in Java to text, and back again. You can configure converters globally or per component.
- Locale-aware markup loading. By following a simple naming pattern, Wicket automatically uses the correct locale-specific markup files. This is extended beyond locales to let you implement variations within locales.
- Extended resource-bundle support. On top of what Java supports through resource bundles, including the new XML format for property bundles, Wicket has a powerful lookup system for messages that, among other things, takes the class hierarchy and runtime component hierarchy into account. It also supports easy-to-use parameter substitutions.
- Special tags for localizing text on your pages without the need to explicitly mirror them with Wicket Java components.
- A range of components, models, and utility classes—such as the Localizer class—to make creating localized web applications a breeze.
- A message-replacement mechanism that fails fast when your application runs in development mode but is lenient when it runs in production mode. You’ll find bugs quickly when you’re developing; but if you miss any, your clients won’t see error pages.
We’ll touch on most of these items when we apply them in practice with our ongoing example.
We’ll start with how Wicket supports developing for multiple languages.
Probably the most important capability you need when localizing applications is the ability to display pages in different languages. In this section, we’ll develop English, Dutch, and Thai versions of the discounts list we developed in the previous chapter. As a teaser, here are screenshots of the Dutch and Thai versions—you saw the English version in chapter 11. Figure 12.1 shows the Dutch version.
And figure 12.2 shows the Thai version.
At the right of these screens is a drop-down menu that displays the current locale. It lists the available languages (English, Dutch, and Thai) in the language of the currently selected locale. Notice that this is the component we developed in chapter 8.
When a user selects a locale from that drop-down menu, that locale is set as the current one for the session. Wicket recognizes this and automatically loads the proper markup and messages.
Instead of putting all the text directly in HTML files, as we’ve done so far, we’ll put the locale-dependent text in separate files. It’s much easier to maintain that way (you have all the locale-dependent text together instead of scattered throughout your markup), and this approach enables you to let a third party do the translations.
We’ll take the user panel as an example. Right now, the markup of the user panel is as shown in listing 12.1.
Listing 12.1. UserPanel.html without localization
<wicket:panel>
  Signed in as
  <i><span wicket:id="fullname">[name]</span></i>
  <select wicket:id="localeSelect" />
  <a wicket:id="signout">signout</a>
</wicket:panel>
The first step in localizing pages and components is to identify the locale-dependent parts. In this case, we can recognize two variable parts, which are marked in figure 12.3.
The marked parts, Signed in as and signout, haven’t yet been localized. We don’t have to localize the user’s name, and we left out the language-selection drop-down because it already includes localization. You can see how in the following code fragment:

The custom ChoiceRenderer used by LocaleDropDown uses the getDisplayName method of the Locale class to display options in the currently selected locale. The getLocale method is inherited from Component, which by default calls the getLocale method of the Session object.
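To make the role of getDisplayName concrete, here is a minimal sketch in plain Java, without any Wicket classes. The class name LocaleNames and the SUPPORTED list are our own assumptions for illustration; only Locale.getDisplayName(Locale) comes from the JDK. It shows how the same three locales can be named in the language of whoever is looking at the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: names the supported locales in the viewer's language. */
final class LocaleNames {

    // The three languages used in this chapter's example.
    static final List<Locale> SUPPORTED = List.of(
            Locale.ENGLISH,
            Locale.forLanguageTag("nl"),
            Locale.forLanguageTag("th"));

    /** Returns one display name per supported locale, written in the viewer's language. */
    static List<String> displayNames(Locale viewer) {
        List<String> names = new ArrayList<>();
        for (Locale locale : SUPPORTED) {
            // getDisplayName(Locale) localizes the name for the given display locale.
            names.add(locale.getDisplayName(viewer));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(displayNames(Locale.ENGLISH));
        System.out.println(displayNames(Locale.forLanguageTag("nl")));
    }
}
```

A choice renderer would do the equivalent of the loop body for each option, passing the session's current locale as the viewer.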
We can conveniently localize the user panel by replacing the marked parts with <wicket:message> tags.
Wicket’s message tags are somewhat of an exception to how Wicket usually works. Typically with Wicket, you explicitly instantiate Java components and add them to the component tree matching them to the markup tags. In this case, Java components are created implicitly when <wicket:message> tags are encountered, so all you need to do is define these tags in your markup.
In the code fragment in listing 12.2, we replace the locale-dependent parts of the user panel with <wicket:message> tags.
The <wicket:message> tags trigger Wicket to insert label components on the fly. These labels use the key attribute to look up values in resource bundles, which they then use to replace the body of the <wicket:message> tag pair.
Note
Automatically inserted components are called auto-components throughout the framework. It’s unlikely you’ll ever have to deal with them directly, unless you create custom tag handlers.
The resource-bundle mechanism employed by Wicket resembles Java’s resource-bundle mechanism, but it’s more flexible in how it can be configured and it has a more extensive search path. Resource bundles are basically a way to provide access to a collection of key/value pairs. In UserPanel.html, we have two such keys (signed_in_as and signout), and we have as many values for each as we have languages we want to support.
Resource bundles in Java applications are typically implemented as Properties objects, often loaded from key/value pairs stored in text files (which typically use the .properties extension). These text files are ISO 8859-1 encoded (a popular eight-bit encoded character set also known as Latin 1) and consist of lines of key=value pairs (key:value and key value are supported as alternatives). If you have to write your application for one of the roughly 25 languages and dialects that can properly be encoded, properties files work well. But because most of the world’s population communicates in languages that aren’t supported by this encoding, chances are you’ll end up using escaped Unicode, resulting in files full of strings like \u8A9E\u8A00.
Fortunately, since version 5, Java supports XML files for properties. The XML format supports any encoding that Java supports, including XML’s default UTF-8, at the cost of a more verbose notation. Instead of writing
you can write
which makes a lot more sense if you can read traditional Chinese. Wicket supports both formats, so you can choose what works best for you.
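The difference between the two formats can be tried out with the JDK alone. The following is a minimal sketch, not Wicket code: the class name BundleFormats and the key language are made up for the illustration, while Properties.load and Properties.loadFromXML are standard JDK methods. Both samples carry the same two traditional Chinese characters, once as escapes and once as readable UTF-8 text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical demo: one message, stored in both bundle formats. */
final class BundleFormats {

    // Classic format: characters outside Latin 1 are written as backslash-u escapes.
    static final String ESCAPED = "language=\\u8A9E\\u8A00\n";

    // XML format: the same value as readable text, declared as UTF-8.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
            + "<properties>\n"
            + "  <entry key=\"language\">語言</entry>\n"
            + "</properties>\n";

    /** Parses key=value text, resolving the escapes. */
    static Properties loadEscaped(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Parses the XML properties format from UTF-8 bytes. */
    static Properties loadXml(String xml) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String fromEscaped = loadEscaped(ESCAPED).getProperty("language");
        String fromXml = loadXml(XML).getProperty("language");
        System.out.println(fromEscaped.equals(fromXml));
    }
}
```

The sketch assumes the source file itself is compiled as UTF-8; the escaped variant has no such requirement, which is exactly the trade-off between the two formats.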
Note
You can use XML property files with Wicket 1.3 even if you use Java 1.4.
For listing 12.2, we put the messages in the file UserPanel.properties next to the class and HTML files, so that part of our source tree then looks like this:
UserPanel.properties then has the following contents:
When Wicket looks for messages, as it does in the user panel triggered by the message tag, it starts by trying to locate a properties file next to the closest component it can find. In this case, the closest component that has messages associated is the user panel; the message tags are nested in that panel, and none of the other parent components (note that the link with the identifier signout is a parent of a <wicket:message> tag) have messages.
The UserPanel.properties file is used when no better matches are found. In contrast, the bundles for Dutch and Thai are used only when that specific locale is the current one. Including the bundles for the Dutch and Thai locales, the package looks like this:
You can see that—as with Java’s property-based resource bundles—the locale information is part of the filename. The (partial) pattern is as follows:
In our case, the base filename is UserPanel (which is the name of the matching component class). Wicket tries to match the locale as specifically as possible. For instance, the Dutch locale for someone in The Netherlands is nl_NL, but the Dutch locale for someone in Belgium is nl_BE. Neither of these is found here, so Wicket tries to match on the language next, which is UserPanel_nl.
Wicket tries both the properties and xml extensions (xml first); it does so for all the language/variant/country combinations.
Let’s look at the contents of the Dutch and Thai message files, where the Dutch version is maintained in a regular properties file and the Thai version in the new XML format. Here’s the Dutch version:
And here’s the Thai version:
Many text editors nowadays are able to recognize XML files, and most of them will switch to the appropriate encoding for editing. They do this by interpreting the declaration (the first line of the XML file, which should always contain <?xml). In the previous listing, the encoding is declared to be UTF-8 (Unicode).
Although labels are great for displaying information such as the name of the current user or the result of a calculation, here we need only a lookup that follows a fixed algorithm. An advantage of <wicket:message> tags over plain labels like those in the previous snippet is that you don’t need to synchronize the Java and markup hierarchy. Having to synchronize the two is usually a minor nuisance, but with text it can become a major headache; moving pieces of text from one area on the page to another is something you’ll probably do more often than, for instance, moving forms or tables.
We already hinted that Wicket’s resource-bundle mechanism is similar to the one that Java provides out of the box, but more powerful. This is due to the way Wicket locates the resource bundles, which is the topic of the next section.
The path Wicket uses to look up message bundles (.properties or .xml files) can be defined as follows:
First, Wicket uses the entire path. If no matches are found, Wicket traverses the path from specific to generic, ending with the shortest path, which has no style and no locale. For example, with the style mystyle, the language nl, the country NL, and no variant, the lookup goes like this:
- name_mystyle_nl_NL.xml
- name_mystyle_nl_NL.properties
- name_nl_NL.xml
- name_nl_NL.properties
- name_nl.xml
- name_nl.properties
- name.xml
- name.properties
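The order in the list above can be reproduced with a few lines of plain Java. This is only a sketch that regenerates the names shown here; the class BundleCandidates and its method are our own invention, and Wicket's real locator handles more cases (such as variants) than this does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists bundle file names from most to least specific. */
final class BundleCandidates {

    /** Builds the candidate names for a base name, an optional style, a language, and a country. */
    static List<String> candidates(String name, String style, String language, String country) {
        List<String> stems = new ArrayList<>();
        if (style != null) {
            // Most specific: style plus full locale.
            stems.add(name + "_" + style + "_" + language + "_" + country);
        }
        stems.add(name + "_" + language + "_" + country);
        stems.add(name + "_" + language);
        stems.add(name); // Least specific: no style, no locale.

        List<String> files = new ArrayList<>();
        for (String stem : stems) {
            // For every stem, the xml extension is tried before properties.
            files.add(stem + ".xml");
            files.add(stem + ".properties");
        }
        return files;
    }

    public static void main(String[] args) {
        candidates("name", "mystyle", "nl", "NL").forEach(System.out::println);
    }
}
```

Calling candidates("name", "mystyle", "nl", "NL") yields the eight names in the same order as the list; the first file that exists wins.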
The name component is variable and—as we defined—equals the name of the component that currently serves as the search input. The algorithm for trying the components works as follows:
- Wicket determines the component that is the subject for the message. How this is determined depends on the component, model, or other variables. Typically it’s the component that uses the resource model. The subjects of the Wicket message tags are the auto-components that are inserted at runtime.
- Wicket determines the hierarchy the component resides in and creates a search stack for it. This equals the subject component plus all its parents up to the page level, but in reverse order. For the Wicket message tags used in the user panel, the search stack is as shown in figure 12.4.
- When the search stack is determined, Wicket works from the top of the stack down to the subject component until it finds a match. For each component in the stack, Wicket performs the variation/style/locale matching described at the start of this section.
- For the components between the page and the subject, Wicket takes the component identifiers into account as well. Declarations with the identifier of the immediate parent preceding the actual key have precedence over the plain keys.
Currently, the resources are defined in UserPanel.properties (and the language variants UserPanel_nl.properties and UserPanel_th.xml). If we add DiscountsPage.properties with the key signed_in_as, that declaration will take precedence over the ones defined on the panel. If we also add userPanel.signed_in_as to that file (in the form id.key), that declaration will in turn take precedence over the plain key.
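The precedence rules can be modeled in a few lines. The sketch below is a deliberately simplified model, not Wicket's implementation: the class MessageSearch is hypothetical, bundles are plain maps, locale and style matching are left out, and the id-qualified key is tried at every level. It walks from the page down to the panel and, at each level, prefers the id.key form over the plain key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical, simplified model of the page-to-component message search. */
final class MessageSearch {

    /**
     * Looks up a message.
     *
     * @param bundles  messages per component class, ordered from the page down
     * @param parentId identifier of the subject's immediate parent, for example userPanel
     * @param key      the message key, for example signed_in_as
     */
    static Optional<String> find(Map<String, Map<String, String>> bundles,
                                 String parentId, String key) {
        for (Map<String, String> messages : bundles.values()) {
            // The id-qualified declaration wins over the plain key at the same level.
            String qualified = messages.get(parentId + "." + key);
            if (qualified != null) {
                return Optional.of(qualified);
            }
            String plain = messages.get(key);
            if (plain != null) {
                return Optional.of(plain);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> bundles = new LinkedHashMap<>();
        bundles.put("DiscountsPage", Map.of("signed_in_as", "Logged in as"));
        bundles.put("UserPanel", Map.of("signed_in_as", "Signed in as"));
        // The page is searched first, so its declaration is the one that is found.
        System.out.println(find(bundles, "userPanel", "signed_in_as").orElse("(missing)"));
    }
}
```

The insertion order of the map stands in for the search stack: whatever is registered first is consulted first.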
Using message bundles like this is easy and flexible. But Wicket’s support for multiple languages doesn’t end here. In the next section, you’ll see that the magical naming trick applies to markup files as well.
The trick you just saw for resource bundles works the same for markup templates. As an alternative to separate resource bundles, you can have different markup files for each locale.
Let’s change the way we implemented UserPanel as an example. The new structure looks like this:
UserPanel.html is the English version and serves as the default. If your locale is Chinese (a locale we don’t support in this example), the English version is shown.
If we didn’t separate the locale-dependent parts from the rest of the markup, but instead relied on the localized loading of the templates, UserPanel’s markup would be as shown in listing 12.3.
Listing 12.3. UserPanel.html without <wicket:message> tags
<wicket:panel>
  Signed in as
  <i><span wicket:id="fullname">[name]</span></i>
  <select wicket:id="localeSelect" />
  <a wicket:id="signout">
    signout
  </a>
</wicket:panel>
The Dutch UserPanel would look like listing 12.4.
Listing 12.4. UserPanel_nl.html
<wicket:panel>
  Aangemeld als
  <i><span wicket:id="fullname">[name]</span></i>
  <select wicket:id="localeSelect" />
  <a wicket:id="signout">
    afmelden
  </a>
</wicket:panel>
As you can see, we don’t need the <wicket:message> tags; we use the text for the proper language directly.
But the Thai version, shown in listing 12.5, has a catch.
Listing 12.5. UserPanel_th.html
<?xml version="1.0" encoding="UTF-8"?>
<wicket:panel>
  <i><span wicket:id="fullname">[name]</span></i>
  <select wicket:id="localeSelect" />
  <a wicket:id="signout">
  </a>
</wicket:panel>
Note that the first line is an XML declaration. The Thai language consists of characters that can’t be expressed as ASCII characters. One way to properly encode the Thai characters is to write the template in UTF-8 encoding. If you start your markup files with such a declaration, Wicket will recognize that the file should be read in as a UTF-8 stream. The declaration is optional but recommended. Because it’s outside the <wicket:panel> tags, it’s ignored for the rest of the processing, so the declaration doesn’t show up in your rendered pages.
Tip
In the last two sections, we looked at Wicket’s locale matching for resource bundles and markup files. The powerful pattern that Wicket employs is used for everything that goes through Wicket’s resource-lookup mechanism, like packaged CSS and JavaScript files, but also packaged images. For instance, if we wanted to display the flag of the current locale, we could include an image in the markup like this:
In our package, we’d have flag.gif, flag_nl.gif, and flag_th.gif. Wicket would automatically load the appropriate flag for the current locale.
Note
Instead of adding an image component in Java and an img tag with a wicket:id attribute in our markup, we embedded the tag in <wicket:link> tags. You can use <wicket:link> tags for normal links, images, JavaScript, and stylesheet declarations.
Working with separate markup files per locale/style gives you maximum flexibility, but using resource bundles with one markup file is the better choice for most people. Message bundles can be maintained separately, and they’re also more flexible in how they’re located; for example, you can include the messages at the page or application level, whereas markup must always directly match the components.
You can even mix the approaches. Both have one thing in common: they use the same mechanism to load the resources, whether they’re properties files or markup files (HTML). In the next section, we’ll leave localization for a bit and investigate how you can customize the way Wicket searches for resources.
A common question on the Wicket user list is how to deviate from Wicket’s pattern of placing a component’s markup files next to the component’s class files (this typically means you’ll put them next to your Java files, relying on the build system to place copies of the markup files into the directory the class files are written to).
Note
Here, when we talk about resources, we mean markup and resource bundles, not the request-handling resources we discussed in chapter 10. Wicket locates resources using resource-stream locators, which are abstracted in the IResourceStreamLocator interface; this interface has the default implementation ResourceStreamLocator.
The code fragment shown in listing 12.6 is an example of a custom resource-stream locator.
This class takes a directory as a constructor argument and uses it as the base for looking up resources. A typical locate request has a class argument like myapp.MyComponent and a path argument like myapp/MyComponent_en.html.
If your base directory is /home/me, then the example request resolves to /home/me/myapp/MyComponent_en.html. In the example, we override ResourceStreamLocator’s locate method with two arguments. Note that this method is called by ResourceStreamLocator, which among other things tries to match with the most specific locale first. If the locate invocation returns null, it’s an indication that the locator should try other combinations (for instance, myapp/MyComponent.html) before giving up.
Tip
It’s highly recommended that you extend ResourceStreamLocator rather than implement the IResourceStreamLocator interface directly, and that your implementation call the superclass’s locate method when it can’t find a resource. Components you reuse may rely on the resources being packaged with the classes. ResourceStreamLocator will fall back to loading resources relative to classes when custom loading fails.
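The contract described here (resolve a path against a base directory, and return null so that the caller can try a less specific name) can be illustrated without Wicket. The following is a plain-Java sketch under that assumption: BaseDirLookup and both of its methods are hypothetical names, and the fallback to class-relative loading that ResourceStreamLocator provides is not modeled.

```java
import java.io.File;
import java.util.List;

/** Hypothetical illustration of base-directory lookup with a "null means keep trying" contract. */
final class BaseDirLookup {

    private final File baseDir;

    BaseDirLookup(File baseDir) {
        this.baseDir = baseDir;
    }

    /** Returns the file for a relative path, or null if it doesn't exist. */
    File locate(String path) {
        File candidate = new File(baseDir, path);
        return candidate.isFile() ? candidate : null;
    }

    /** Tries the candidates in order, from most to least specific. */
    File locateFirst(List<String> paths) {
        for (String path : paths) {
            File found = locate(path);
            if (found != null) {
                return found;
            }
        }
        return null; // Nothing matched; a real locator would now fall back to the class path.
    }

    public static void main(String[] args) {
        BaseDirLookup lookup = new BaseDirLookup(new File("/home/me"));
        File markup = lookup.locateFirst(List.of(
                "myapp/MyComponent_en.html",
                "myapp/MyComponent.html"));
        System.out.println(markup != null ? markup.getPath() : "not found under base directory");
    }
}
```

Returning null rather than throwing keeps the decision about what to try next with the caller, which is what lets the locale matching and the class-path fallback stay in one place.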
You register the custom resource locator in your application object’s init method, as shown in listing 12.7.
Listing 12.7. Registering the custom resource-stream locator
public class MyApplication extends WebApplication {
    public MyApplication() {
    }

    public Class getHomePage() {
        return Home.class;
    }

    protected void init() {
        File baseDir = new File("/home/me");
        IResourceStreamLocator locator =
            new MyResourceStreamLocator(baseDir);
        getResourceSettings().setResourceStreamLocator(locator);
    }
}
Wicket has some convenient implementations. Alternatively, we could implement the previous example like this:
This uses a Path object, which in turn is an implementation of IResourceFinder, which is a delegation interface that is used by ResourceStreamLocator.
Now that you know the lookup mechanism can be customized, please heed the following warning. Wicket’s default way of locating resources enables you to quickly switch between the Java files and markup files during development because they’re right next to each other. Also, with this algorithm, your packaged components are immediately reusable without users having to configure where the templates are loaded from; if the components’ classes can be found in the class path, so can their resources. It’s a powerful default, and you may want to think twice before you implement something custom.
So far, we’ve primarily been looking at localized text output. In the last section of this chapter, we’ll discuss localized model conversions, which you use to localize values that are stored in models.
Wicket has a mechanism for handling objects that have different string representations depending on the locale. Examples of such objects are numbers and dates. The string 100,125 is interpreted as a different number depending on the locale. Americans interpret it as one hundred thousand, one hundred and twenty-five; Dutch people interpret it as one hundred and one eighth. In the same fashion, the string 10/12 in the context of dates represents the twelfth of October for Americans and the tenth of December for Dutch people. If your application is supposed to serve different nationalities in their own ways, you must format numbers, dates, and possibly other objects differently according to the user’s locale.
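You can verify the number example with the JDK's NumberFormat. The class LocaleNumbers below is a small helper of our own making for this illustration; it parses and formats with an explicit locale instead of the session's.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

/** Hypothetical helper: parses and formats numbers for an explicit locale. */
final class LocaleNumbers {

    /** Parses text using the conventions of the given locale. */
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                    "'" + text + "' is not a number for locale " + locale, e);
        }
    }

    /** Formats a value using the conventions of the given locale. */
    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        Locale dutch = Locale.forLanguageTag("nl-NL");
        // In the US the comma groups thousands; in the Netherlands it marks the decimals.
        System.out.println(parse("100,125", Locale.US)); // 100125.0
        System.out.println(parse("100,125", dutch));     // 100.125
    }
}
```

The reverse direction shows the same ambiguity: format(100125, Locale.US) and format(100.125, dutch) both produce the text 100,125.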
The objects responsible for such conversions in Wicket are called converters.
Even if you aren’t interested in formatting numbers and dates for specific locales, you still need a mechanism to switch between strings (HTML/HTTP) and Java objects and back again. You can build conversions into your components or models. Listing 12.8 shows an example where a model takes care of the locale-dependent formatting.
Listing 12.8. Utility model that formats values of the nested model
import java.text.NumberFormat;
import java.text.ParseException;

import org.apache.wicket.Session;
import org.apache.wicket.model.IModel;

public class NumberFormatModel implements IModel {
    private final IModel wrapped;

    public NumberFormatModel(IModel numberModel) {
        this.wrapped = numberModel;
    }

    // Formats the nested number for the session's locale.
    public Object getObject() {
        Number nbr = (Number) wrapped.getObject();
        return nbr != null ? getFormat().format(nbr) : null;
    }

    // Parses the locale-formatted string back into a number.
    public void setObject(Object object) {
        try {
            if (object != null) {
                wrapped.setObject(getFormat().parse((String) object));
            } else {
                wrapped.setObject(null);
            }
        } catch (ParseException e) {
            throw new RuntimeException(e);
        }
    }

    private NumberFormat getFormat() {
        return NumberFormat.getNumberInstance(Session.get().getLocale());
    }

    // Pass detachment on to the nested model.
    public void detach() {
        wrapped.detach();
    }
}
The disadvantage of using models for this purpose is that you must always be aware of this wrapping: forget it, and you’ll get type errors. In addition, there is no way to be sure conversions are executed across the board.
This is why Wicket has a separate mechanism for conversions. The main interface of this mechanism is IConverter (see figure 12.5).
To illustrate how this works, let’s look at how the Label component renders its body and the process of triggering the use of a converter.
While rendering pages, Wicket asks components to render themselves to the output stream. That process is broken into a couple of steps, and onComponentTagBody is one of the methods a component uses to delegate a specific piece of work. Containers (components that can contain other components) delegate rendering to the components nested in them. But some components, like Label (which isn’t a container), provide their own implementation of this method. Here is that implementation for the Label component:
The interesting part is the call to getModelObjectAsString, which is a method of the component base class. You can see its implementation in listing 12.9 (comments are stripped).
Listing 12.9. getModelObjectAsString method of Component
public final String getModelObjectAsString() {
    final Object modelObject = getModelObject();
    if (modelObject != null) {
        IConverter converter = getConverter(modelObject.getClass());
        final String modelString =
            converter.convertToString(modelObject, getLocale());
        if (modelString != null) {
            if (getFlag(FLAG_ESCAPE_MODEL_STRINGS)) {
                return Strings.escapeMarkup(modelString, false, true)
                    .toString();
            }
            return modelString;
        }
    }
    return "";
}
This method gets a converter instance, which it uses to convert the model object to a string using the convertToString method. The implementation of that method uses a NumberFormat in the same fashion as the custom model we looked at earlier.
Because converters are always used for the appropriate types, we can rewrite the previous code fragment as follows:
This code fragment works for numbers, dates, and anything else for which converters are registered. Wicket’s default configuration is probably good for 95% of use cases.
But the default configuration may be insufficient at times. In the next section, we’ll look at how you can provide custom converters.
In this section, we’ll look at how you can customize conversions for individual components or an entire application. The first step a component executes when locating a converter is to call its getConverter method—and there we have the first opportunity for customization. We discussed this customization in chapter 9, when we implemented a percentage field. Let’s look at it again in a bit more detail.
You may want to use this customization when, for example, you want to deviate from the application-wide registered converter. For instance, you may want to display a date formatted with the months fully spelled out, but the converter installed on the application displays months as numbers.
Another good use case is when the conversion is an integral part of your component. An example of this is a URL text field. If you want to write a URL text field that works in all projects, you can pin down the converter (override getConverter and make it final) and return your URL converter there. That way, you guarantee that the appropriate conversion is performed, no matter how the application is configured.
The URL text field is implemented in listing 12.10.
This text field overrides any globally defined converter and provides its own. When it renders, convertToString is called, and the URL is returned as a string; and when values are set on the text field (typically through a user providing input), convertToObject converts the string (which comes from the HTTP request) to a proper URL object again.
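The two directions of such a conversion can be sketched in plain Java. The interface TextConverter below is a hypothetical stand-in that borrows only the method names convertToString and convertToObject from the text; it is not Wicket's IConverter, and UrlTextConverter is our own illustration of what a URL converter might do.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Locale;

/** Hypothetical stand-in for a two-way converter between strings and objects. */
interface TextConverter {
    Object convertToObject(String value, Locale locale);

    String convertToString(Object value, Locale locale);
}

/** Converts between request strings and URL objects; the locale plays no role for URLs. */
final class UrlTextConverter implements TextConverter {

    public Object convertToObject(String value, Locale locale) {
        if (value == null || value.isEmpty()) {
            return null; // No input means no URL.
        }
        try {
            return new URI(value).toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException("'" + value + "' is not a valid URL", e);
        }
    }

    public String convertToString(Object value, Locale locale) {
        return value != null ? value.toString() : null;
    }

    public static void main(String[] args) {
        TextConverter converter = new UrlTextConverter();
        Object url = converter.convertToObject("http://example.com/a", Locale.ENGLISH);
        System.out.println(converter.convertToString(url, Locale.ENGLISH));
    }
}
```

A component that pins this converter down gets the same behavior in every application: a round trip from string to URL and back leaves valid input unchanged, and invalid input is rejected at the conversion step rather than somewhere deeper in the application.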
There is a thin line between where it’s appropriate to use a custom model and where it’s best to use a custom converter. Not everyone on the Wicket team agrees, but we like the layering that custom converters enable. Converting from and to URLs is an obvious case for a converter, but formatting a mask is debatable. For instance, look at the code fragment in listing 12.11.
In this case, because you aren’t merely converting between types, but altering the user’s input and the output that is rendered, you may as well use an explicit model. It’s largely a matter of taste which approach you choose.
What if you want to install custom conversions for the entire application? You do so with converter locators.
A converter locator is an object that knows where to get converter instances for the appropriate types. An application has one instance, and this instance is created by the implementation of newConverterLocator, which is an overridable method of Application that is called when the application starts up.
If you want to provide a custom converter locator or configure the existing one, you can override newConverterLocator in your application. Listing 12.12 is an example that installs a URL converter for the entire application.
or—because property models introspect the target type—this should suffice:
Note that in this example, we instantiate the default ConverterLocator rather than implement the IConverterLocator interface from scratch. Doing the latter is possible, but the default converter locator is designed to be easily extended.
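Conceptually, a converter locator is a registry keyed by type. The sketch below shows that idea in plain Java under our own assumptions: ValueConverter and ConverterRegistry are hypothetical names, only the string-rendering direction is modeled, and Wicket's ConverterLocator does considerably more (for example, it ships with converters for the common types).

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical one-way converter used only for this illustration. */
interface ValueConverter {
    String convertToString(Object value, Locale locale);
}

/** Hypothetical registry: hands out the converter registered for a type. */
final class ConverterRegistry {

    private final Map<Class<?>, ValueConverter> converters = new HashMap<>();

    // Used when nothing is registered for the requested type.
    private final ValueConverter fallback = (value, locale) -> String.valueOf(value);

    /** Registers (or replaces) the converter for a type. */
    void set(Class<?> type, ValueConverter converter) {
        converters.put(type, converter);
    }

    /** Returns the converter for a type, or the fallback if none is registered. */
    ValueConverter getConverter(Class<?> type) {
        return converters.getOrDefault(type, fallback);
    }

    public static void main(String[] args) {
        ConverterRegistry registry = new ConverterRegistry();
        registry.set(Double.class,
                (value, locale) -> NumberFormat.getNumberInstance(locale).format(value));
        Object amount = 1234.5;
        System.out.println(
                registry.getConverter(amount.getClass()).convertToString(amount, Locale.US));
    }
}
```

Because components ask the registry by the runtime type of their model object, registering one converter changes the rendering of that type everywhere, which is the point of configuring conversions at the application level.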
We used this chapter to look at different aspects of localizing your web applications with Wicket. The two main things we discussed were how to support multiple languages with <wicket:message> tags and localized markup (with a detour to explain how you can customize the way markup is loaded), and how converters work and can be customized.
We’ve used example data on several occasions in this book, but we haven’t paid much attention to where this data comes from. In the next chapter, we’ll examine how you can use Wicket to build database applications.