By Arjan Molenaar (2022-01-18)
For desktop applications and websites it’s good practice to allow users to use them in their native language. For open source applications, anyone can contribute translations to the project.
After a release, a user reported that Gaphor would not start, but only when it was used with a particular language. One of the translations had an error. “An error?”, I hear you ask, “How is that possible?”
First some background.
A common tool for translations is Gettext. Translations are maintained in .po files, portable object files. Those files typically contain the original text and a translation.
msgid "I’m a message"
msgstr "Ik ben een bericht"
The msgid contains the original text. For most applications that’s the text in American English. The msgstr contains the translated text. If msgstr contains no translation (it’s an empty string ""), the original text is used.
Translated text is not just text. Sometimes it contains placeholders that the application has to fill in, such as a version number or a file name. We learned the hard way how easily a mistake is made.
The way placeholders are formatted depends on the programming language. In C, texts contain %-mark expressions (I counted %d items). JavaScript uses a different format: I counted ${count} items. In Python the C-style can be used (old style) as well as curly-bracket placeholders with the str.format() method: I counted {count} items.
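As a small illustration, the two Python styles produce the same result. The variable names here are only for the example.

```python
count = 3

# Old, C-style formatting with the % operator.
old_style = "I counted %d items" % count

# Curly-bracket placeholders filled in by str.format().
new_style = "I counted {count} items".format(count=count)

print(old_style)
print(new_style)
```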
Now assume the text I counted {count} items has to be translated. The term count is case-sensitive. For example, a slight error in the Dutch translation could cause an unintended error in the application: Ik telde {Count} items (with capital C).
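The failure is easy to reproduce in plain Python. The helper below, unfilled_placeholder, is a hypothetical name made up for this illustration; it returns the placeholder that str.format() could not fill, instead of letting the KeyError stop the program.

```python
original = "I counted {count} items"
translated = "Ik telde {Count} items"  # capital C, as in the faulty translation


def unfilled_placeholder(template, **values):
    """Return the name of the first placeholder without a value, or None."""
    try:
        template.format(**values)
    except KeyError as exc:
        # str.format() raises KeyError with the missing field name.
        return exc.args[0]
    return None


print(unfilled_placeholder(original, count=3))
print(unfilled_placeholder(translated, count=3))
```

Without the try/except, the second call would raise a KeyError at the point where the application formats the text.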
To avoid errors we created a small script, utilizing Babel. Babel is a Python-based internationalization library. Using Babel, we read the translations from a .po file and check if all placeholders from the original text ({count}) are in the translated text.
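A minimal sketch of such a check could look like this. It is not the actual Gaphor script: the function names are made up for this example, plural entries are skipped, and it assumes the .po file is UTF-8 encoded. It relies on Babel's read_po() to parse the file and on string.Formatter from the standard library to find the placeholders.

```python
from string import Formatter


def placeholders(text):
    """Return the named str.format() fields in text, e.g. {"count"}.

    Raises ValueError if the braces in text are unbalanced.
    """
    return {field for _, field, _, _ in Formatter().parse(text) if field}


def missing_placeholders(msgid, msgstr):
    """Return placeholders of the original that the translation lacks."""
    return placeholders(msgid) - placeholders(msgstr)


def check_po_file(path):
    """Yield (line number, msgid, missing names) for each faulty entry."""
    # Babel is third-party (pip install babel). It is imported here so the
    # helpers above stay usable without it.
    from babel.messages.pofile import read_po

    with open(path, encoding="utf-8") as po_file:
        catalog = read_po(po_file)

    for message in catalog:
        # Skip the header (empty id), untranslated entries, and plural
        # entries, which Babel stores as tuples.
        if not isinstance(message.id, str) or not message.id:
            continue
        if not isinstance(message.string, str) or not message.string:
            continue
        missing = missing_placeholders(message.id, message.string)
        if missing:
            yield message.lineno, message.id, missing
```

Running check_po_file() over every .po file in a test and asserting that it yields nothing turns a broken placeholder into a failing test. Comparing the sets in the other direction as well would also flag placeholders that exist only in the translation.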
This simple check ensures that placeholders in translated text can always be filled in. The script can also check for other conventions you want to uphold in your translatable text. For example, you may not want to use empty placeholders {}, or you may want to explicitly check for %-based placeholders. In our case we also check for HTML entities (e.g. &lt;).
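These extra conventions can be sketched with a few regular expressions. The patterns and the function name below are assumptions made for this example, not the ones Gaphor uses, and they are heuristics: the %-pattern deliberately ignores the space flag so that text like "100% sure" is not reported.

```python
import re
from string import Formatter

# Heuristic patterns, written for this illustration only.
PERCENT_PLACEHOLDER = re.compile(r"%(\(\w+\))?[-+#0]*\d*(\.\d+)?[diouxXeEfFgGcrs]")
HTML_ENTITY = re.compile(r"&(#\d+|#x[0-9a-fA-F]+|\w+);")


def convention_problems(text):
    """Return a list of convention violations found in a translatable text."""
    problems = []
    try:
        fields = [field for _, field, _, _ in Formatter().parse(text)]
    except ValueError:
        # Formatter.parse() rejects unbalanced braces such as "{count".
        problems.append("unbalanced braces")
        fields = []
    if "" in fields:
        problems.append("empty placeholder {}")
    if PERCENT_PLACEHOLDER.search(text):
        problems.append("%-style placeholder")
    if HTML_ENTITY.search(text):
        problems.append("HTML entity")
    return problems


print(convention_problems("I counted {} items"))
print(convention_problems("I counted {count} items"))
```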
Adding these tests helps us ensure that a future mistake in a translation file that could prevent Gaphor from launching will be caught automatically.