
Verify translations

By Arjan Molenaar (2022-01-18)

For desktop applications and websites it’s good practice to let users work in their native language. For Open Source applications, anyone can contribute translations to the project.

After a release, a user reported that Gaphor would not start, but only when it was used with a particular language. One of the translations had an error. “An error?”, I hear you ask, “How is that possible?”

First some background.

A common tool for translations is Gettext. Translations are maintained in .po files, portable object files. Those files typically contain the original text and a translation.

msgid "I’m a message"
msgstr "Ik ben een bericht"

The msgid contains the original text. For most applications that’s the text in American English. The msgstr contains the translated text. If msgstr contains no translation (it’s an empty string ""), the original text is used.

Translated text is not always just text. Sometimes it contains placeholders that have to be filled in by the application. Think of things like a version number or a file name. We learned the hard way that an error is easily made.

The way placeholders are formatted depends on the programming language. In C, texts contain %-expressions (I counted %d items). JavaScript uses a different format: I counted ${count} items [1]. In Python the C-style can be used (old style) as well as curly-bracket placeholders with the str.format() method: I counted {count} items.
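As a small illustration of the two Python styles mentioned above (the variable names are made up for this example):

```python
count = 3

# Old C-style formatting with the % operator.
c_style = "I counted %d items" % count

# Curly-bracket placeholder filled in by str.format().
format_style = "I counted {count} items".format(count=count)

print(c_style)
print(format_style)
```

Both calls produce the same sentence. Only the placeholder syntax differs.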

Now assume the text I counted {count} items has to be translated. Placeholder names are case-sensitive, so a slight error in the Dutch translation can crash the application: with Ik telde {Count} items (capital C), str.format() raises a KeyError because the application only supplies a value for count.
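A minimal sketch of what goes wrong, assuming the application fills in the text with str.format() and a keyword argument named count:

```python
original = "I counted {count} items"
broken_translation = "Ik telde {Count} items"  # capital C, as in the example above

# The original text formats without problems.
print(original.format(count=3))

# The translation asks for "Count", which the application never supplies.
error = None
try:
    broken_translation.format(count=3)
except KeyError as exc:
    error = exc
    print(f"No value for placeholder: {exc}")
```

If nothing catches that exception in the application, it can stop the program during startup.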

To avoid errors we created a small script, utilizing Babel. Babel is a Python-based internationalization library. Using Babel, we read the translations from a .po file and check that all placeholders from the original text ({count}) are also in the translated text. It is a simple check that ensures placeholders in translated text can always be filled in. The script can also check other conventions you want to uphold in your translatable text. For example, you may not want to allow empty placeholders ({}), or you may want to check explicitly for %-based placeholders. In our case we also check for HTML entities (e.g. &lt;).
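The following is a rough sketch of such a check, not the actual Gaphor script. The function names (placeholders, check_message, check_po_file) are hypothetical, and the extra report for unknown placeholders in the translation is an addition for this example. It assumes str.format()-style placeholders, and it skips plural entries to stay short. Babel is only needed for reading the .po file.

```python
import re
import string

# Matches named and numeric HTML entities such as &lt; or &#60;.
HTML_ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);")


def placeholders(text):
    """Return the set of str.format() field names used in text."""
    return {
        field
        for _literal, field, _spec, _conversion in string.Formatter().parse(text)
        if field is not None
    }


def check_message(msgid, msgstr):
    """Return a list of problems for one (original, translation) pair."""
    if not msgstr:
        return []  # No translation: the original text is used.
    try:
        expected = placeholders(msgid)
        found = placeholders(msgstr)
    except ValueError as exc:  # For example an unbalanced "{" or "}".
        return [f"malformed placeholder: {exc}"]
    problems = []
    if "" in found:
        problems.append("empty placeholder {}")
    for name in sorted(expected - found):
        problems.append(f"placeholder {{{name}}} is missing")
    for name in sorted(found - expected - {""}):
        problems.append(f"unknown placeholder {{{name}}}")
    if HTML_ENTITY.search(msgstr):
        problems.append("HTML entity in translation")
    return problems


def check_po_file(path):
    """Check every singular message in a .po file (requires Babel)."""
    # Imported here so the checks above also work without Babel installed.
    from babel.messages.pofile import read_po

    with open(path, "rb") as po_file:
        catalog = read_po(po_file)
    report = []
    for message in catalog:
        # Skip the header (empty id) and plural entries (tuple ids).
        if not message.id or not isinstance(message.id, str):
            continue
        for problem in check_message(message.id, message.string):
            report.append(f"{message.id!r}: {problem}")
    return report
```

Calling check_po_file() on a translation file (for example a hypothetical po/nl.po) from a test suite makes the build fail as soon as the returned list is not empty.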

Adding these tests helps ensure that a future mistake in a translation file that could prevent Gaphor from launching will be caught automatically.

Notes

  1. This is actually a template string in JavaScript. It takes some extra effort to make those translatable.