evan_tech

06:47 am, 21 Feb 05

i18n plurals

I was just reading through a mail I linked to earlier and noticed this:

> +  if (n_uris > 1) {
> +   prompt = _("Download the links?");
> +   detail = _("You can download the links or create bookmarks.");
> +  } else {
> +   prompt = _("Download the link?");
> +   detail = _("You can download the link or create a bookmark.");
> +  }


For those unfamiliar with it: the _() macro marks a string for translation. Branching on n_uris > 1 like this is the best people can do right now, but it's not right: it bakes the English singular/plural split into the code, and a translator can't move where that split falls. But help is on the way: glib 2.6 now requires ngettext. (Previous versions didn't because it wasn't commonly available yet -- a quick test here indicates it's still not in FreeBSD's libc, but FreeBSD's i18n is reportedly poor in general.)

ngettext takes three arguments: a singular string, a plural string, and a number, and gives you back the appropriate translated string. This at least handles French, where zero takes the singular, unlike in English: you write "2 foos, 1 foo, 0 foo". It's more general than that, too: each translation catalog carries a language-specific rule that says which string to use for which number.
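As a rough sketch, here is how the patch quoted above might look with ngettext instead of the if/else. The helper names download_prompt and download_detail are made up for illustration, and this assumes a libc that ships ngettext in libintl.h (glibc does). With no translation catalog loaded, ngettext simply falls back to the English strings: the first one when n is 1, the second otherwise.

```c
#include <libintl.h>   /* ngettext(); part of libc on glibc systems */

/* Hypothetical helper: prompt for n_uris links, translated when a
 * catalog is loaded, plain English otherwise. */
static const char *download_prompt(unsigned long n_uris)
{
    return ngettext("Download the link?", "Download the links?", n_uris);
}

/* Hypothetical helper: the matching detail text. */
static const char *download_detail(unsigned long n_uris)
{
    return ngettext("You can download the link or create a bookmark.",
                    "You can download the links or create bookmarks.",
                    n_uris);
}
```

The caller would then write prompt = download_prompt(n_uris); and never test the count itself, which leaves the choice of form to the translation.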

At first glance it looks like this wouldn't handle languages like Arabic or (apparently) Gaeilge, where they have singular, dual, and plural, but it actually does. Passing multiple strings to a function like ngettext is just a programmer convenience that allows you to write the code and the English version of the strings all in the same place; the underlying library only needs one string to index into the translation database, and then it uses the number for the rest.
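To make that concrete, here is a toy version of the lookup. It is not how gettext is actually implemented: the struct, the one-entry catalog, and toy_ngettext are all invented for illustration, and the "translations" are placeholders rather than real Irish. The rule n==1 ? 0 : n==2 ? 1 : 2 is the one the gettext manual lists for Irish. The point is only that the singular string is the key and the number picks among however many forms the language has.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical catalog entry: one lookup key, several translated forms. */
struct entry {
    const char *msgid;      /* English singular, used only as the key */
    const char *forms[3];   /* placeholder translations, indexed by the rule */
};

/* Rule for a language with singular, dual, and plural. */
static unsigned plural_index(unsigned long n)
{
    return n == 1 ? 0 : n == 2 ? 1 : 2;
}

static const struct entry catalog[] = {
    { "Download the link?",
      { "<singular form>", "<dual form>", "<plural form>" } },
};

/* Toy ngettext: msgid2 matters only when the key is not in the catalog. */
static const char *toy_ngettext(const char *msgid1, const char *msgid2,
                                unsigned long n)
{
    for (size_t i = 0; i < sizeof catalog / sizeof catalog[0]; i++) {
        if (strcmp(catalog[i].msgid, msgid1) == 0)
            return catalog[i].forms[plural_index(n)];
    }
    return n == 1 ? msgid1 : msgid2;   /* untranslated: English rule */
}
```

Note that the English plural string never takes part in the lookup, which is why two arguments in the source code are enough even for a language with three forms.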

The ngettext manual goes into more detail [though Arabic is notably absent!] and includes some of the crazier rules. For example, Slovak:
Plural-Forms: nplurals=3; plural=(n==1) ? 1 : (n>=2 && n<=4) ? 2 : 0;
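
The plural= expression is written in C syntax, so the Slovak rule above can be dropped straight into a function to see what it does. The function name is mine; the expression is copied unchanged. It returns the index of the translated string to use, out of nplurals=3.

```c
/* The Slovak Plural-Forms expression quoted above, as a C function.
 * Returns which of the three translated strings to use for n. */
static int slovak_plural_index(unsigned long n)
{
    return (n == 1) ? 1 : (n >= 2 && n <= 4) ? 2 : 0;
}
```

So 1 selects string 1, anything from 2 to 4 selects string 2, and everything else (including 0, 5, and 22) selects string 0.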

It's also worth noting that Japanese and Chinese need only the trivial rule (nplurals=1; plural=0) because they're far more sensible about these sorts of things -- no articles, no plurals, and no cases (though that last one also applies to English nouns). But that's another post.

And an obligatory LJ-note: I think LiveJournal's translation system handles this properly, though awareness of this is poor. Most Russian journals I can find are in English, but after some clicking around I found kwa_kwa, where you can observe the Russian rules in action: "51 комментарий" vs "11 комментариев". I think "LiveJournal handles [x] but awareness of it is poor" is a pretty good summary of LiveJournal in general.
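
For the curious, here is a sketch of the Russian rule using the Plural-Forms expression that the gettext manual gives for Russian. I have no idea how LiveJournal actually implements it, so treat this as gettext's version of the rule; the function names are made up.

```c
/* Russian rule as listed in the gettext manual's Plural-Forms examples. */
static int russian_plural_index(unsigned long n)
{
    return (n % 10 == 1 && n % 100 != 11) ? 0
         : (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) ? 1
         : 2;
}

/* Hypothetical helper: the form of "comment" that goes with the count n. */
static const char *comment_word(unsigned long n)
{
    static const char *const forms[3] = {
        "комментарий",    /* 1, 21, 51, ... */
        "комментария",    /* 2-4, 22-24, ... */
        "комментариев",   /* 0, 5-20, 25-30, ... */
    };
    return forms[russian_plural_index(n)];
}
```

This gives "комментарий" for 51 and "комментариев" for 11, matching what kwa_kwa's journal shows.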