One application, multiple languages

I have investigated the standard internationalization (i18n) tools, because some day users may ask for messages in the language of their choice. My first steps are recorded here: the first section gives a rough mental model of how it works, the second shows how to build such an app.

How does it work at runtime?

Here is a super simple internationalized application in Python, called

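import gettext
_ = gettext.gettext

# Tell gettext where the catalogs of the "myi18n" domain live...
gettext.bindtextdomain("myi18n", "language.d")
# ...and select that domain for gettext.gettext(); without this call
# the lookup uses the default domain, "messages".
gettext.textdomain("myi18n")

print(_("Yozzaaa"))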

This source uses the following conventional directory layout:

`-- language.d
    |-- fr
    |   `-- LC_MESSAGES
    |       `--
    `-- it
        `-- LC_MESSAGES

In the sample application, bindtextdomain is the most interesting function. It takes two arguments, domain and language_dir:

  1. the domain (myi18n in our example) is the name of a file containing the compiled translated messages, located in the directory $language_dir/$LOCALE/LC_MESSAGES. The file name always ends with .mo, but do not write the suffix in the parameter,

  2. the language directory (language.d in our example) is where to find the fr/, en/, ... directories; each of them must contain an LC_MESSAGES directory, which in turn contains the .mo files, like this:

    <language directory>
        |-- <language1>
        |   `-- LC_MESSAGES
        |       `-- <domain>.mo
        `-- <language2>
            `-- ...
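The lookup rule above can be checked with gettext.find from the standard library. This is only a sketch, assuming Python 3; the temporary directory and the empty placeholder file are made up for the illustration and are not part of the sample application.

```python
import gettext
import os
import tempfile

# Hypothetical throwaway layout: <language directory>/fr/LC_MESSAGES/myi18n.mo
language_dir = tempfile.mkdtemp()
mo_dir = os.path.join(language_dir, "fr", "LC_MESSAGES")
os.makedirs(mo_dir)

# An empty placeholder is enough here: find() only checks that the file exists,
# it does not parse it.
open(os.path.join(mo_dir, "myi18n.mo"), "wb").close()

# The domain is given without the .mo suffix.
found = gettext.find("myi18n", language_dir, languages=["fr"])
missing = gettext.find("myi18n", language_dir, languages=["it"])

print(found)    # path ending in fr/LC_MESSAGES/myi18n.mo
print(missing)  # None, because there is no it/ directory
```

gettext.find returns the first matching .mo path, or None when no catalog exists for the requested languages.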

The gettext system expects:

  1. to be informed at runtime of the desired language (exporting the environment variable LANGUAGE is a common way to do it),
  2. to be told in the source which file to look for (a domain and a language directory),
  3. to find a .mo file in path_to_language_dir/$LANGUAGE/LC_MESSAGES/
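What happens when the third expectation is not met can be sketched as follows, assuming Python 3. The empty temporary directory is hypothetical, and gettext.translation is used here instead of the module-level functions of the sample application so that the fallback behaviour is explicit.

```python
import gettext
import os
import tempfile

# Hypothetical language directory that contains no catalog at all.
language_dir = tempfile.mkdtemp()

# 1. The desired language is given at runtime through the environment.
os.environ["LANGUAGE"] = "fr"

# 2. The source names a domain and a language directory.
# 3. No fr/LC_MESSAGES/myi18n.mo exists, so with fallback=True gettext
#    hands back a pass-through translator instead of failing.
t = gettext.translation("myi18n", language_dir, fallback=True)
print(t.gettext("Yozzaaa"))  # the untranslated message

# Without fallback, the missing catalog is reported as an error.
try:
    gettext.translation("myi18n", language_dir)
except OSError as error:
    print("no catalog found:", error)
```

The module-level gettext.gettext used in the sample application behaves like the fallback case: a missing catalog simply leaves the message untranslated.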

How to build the application?

The code presented here is meant to be copied and pasted into a terminal; at the end, you get a working app.

  1. When writing code, just use the _() shortcut around any string that you may later want to translate, even if you do not translate the application right away: it is harmless

    cat > << EOF
    import gettext
    _ = gettext.gettext
    gettext.bindtextdomain("myi18n", "language.d")
    gettext.textdomain("myi18n")  # select the domain; the default is "messages"
    print(_("Hello world"))
    EOF
  2. Use pygettext on the source file to create a messages.pot template translation file, which lists every message found in the sources, then replace the CHARSET placeholder in its header with the real encoding

    sed -i s/CHARSET/UTF-8/ messages.pot

    .pot is an empty template translation file; .po files are translation sources; .mo files are compiled translations

  3. Copy the template file to a <language>.po file, one for every language you want to support (for example fr.po, en.po), and fill each copy with the message translations

    sub_below () { sed -r "h; N; s/^(.*$1.*\n).*/\1$2/; P; D" $3  ; }
    sub_below Hello 'msgstr "Ciao tutti"'   messages.pot > it.po
    sub_below Hello 'msgstr "Héllo à tous"' messages.pot > fr.po
  4. Create the directories where the application expects the message translations

    mkdir -p language.d/{fr,it}/LC_MESSAGES/
  5. Compile each .po into a .mo with the msgfmt command, writing the result to lang_dir/<language>/LC_MESSAGES/<domain>.mo (e.g. lang_dir/fr/LC_MESSAGES/<domain>.mo)

    msgfmt fr.po -o language.d/fr/LC_MESSAGES/
    msgfmt it.po -o language.d/it/LC_MESSAGES/
    rm {fr,it}.po messages.pot
  6. Test the app with the different languages

    for i in en fr it; do
        export LANGUAGE=$i ; python; done
    # Obviously, accents are correctly handled, it is made with python :)
    unset LANGUAGE
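For readers who want to try the same round trip without the command-line tools, here is a rough pure-Python sketch, assuming Python 3. It is not the procedure above: write_mo is a hypothetical helper that writes a minimal .mo catalog by hand in place of msgfmt, and build_demo and greet are illustrative names as well. Only the domain, the messages and the directory layout come from the steps above.

```python
import gettext
import os
import struct
import tempfile


def write_mo(path, messages):
    """Hypothetical stand-in for msgfmt: write a minimal little-endian .mo file."""
    # The empty msgid carries the catalog header, which declares the charset.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    entries.update(messages)
    # Entries are stored sorted by msgid; the header (empty msgid) comes first.
    pairs = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items())

    count = len(pairs)
    ids_table_offset = 7 * 4                      # right after the 7-word file header
    strs_table_offset = ids_table_offset + 8 * count
    data_offset = strs_table_offset + 8 * count

    ids_table, strs_table, data = b"", b"", b""
    for msgid, _unused in pairs:
        ids_table += struct.pack("<II", len(msgid), data_offset + len(data))
        data += msgid + b"\0"
    for _unused, msgstr in pairs:
        strs_table += struct.pack("<II", len(msgstr), data_offset + len(data))
        data += msgstr + b"\0"

    # magic, format version, string count, two table offsets, empty hash table
    header = struct.pack("<7I", 0x950412DE, 0, count,
                         ids_table_offset, strs_table_offset, 0, 0)
    with open(path, "wb") as handle:
        handle.write(header + ids_table + strs_table + data)


def build_demo(root):
    """Create <root>/<language>/LC_MESSAGES/myi18n.mo for fr and it."""
    catalogs = {"fr": "Héllo à tous", "it": "Ciao tutti"}
    for lang, text in catalogs.items():
        directory = os.path.join(root, lang, "LC_MESSAGES")
        os.makedirs(directory)
        write_mo(os.path.join(directory, "myi18n.mo"), {"Hello world": text})


def greet(root, lang):
    """Translate the sample message, falling back to the original text."""
    t = gettext.translation("myi18n", root, languages=[lang], fallback=True)
    return t.gettext("Hello world")


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    build_demo(root)
    for lang in ("en", "fr", "it"):
        print(lang, greet(root, lang))
```

Since no en catalog is written, the en case exercises the fallback path and returns the original message.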

Next time, you’ll see how to use conversion specifiers in the message strings, as in printf. Also, there are similarities between i18n and audience (audience as in support, dev, admin, grandma); I would like to see how to hack i18n to address different audiences.

15 November 2009