Notes on Translating GNOME
A translator with the Bengali localisation team and a GNOME volunteer since v2.2, Runa Bhattacharjee shares the technicalities and nuances she has encountered in two and a half years of translation work.
There was a very special reason why I sat down to watch the opening ceremony of the 2004 Summer Olympics. At that time I was translating a list of country names and found it difficult to ensure the correct translation of quite a few lesser-known countries. As each contingent marched smartly past the television screen, I hurriedly jotted down the names from the accompanying announcements.
I started translating localized interfaces as a hobby but soon realized the serious nuances that influence how the user interface looks, and its importance from the end-user’s point of view. In this article, I attempt to give a brief account of the many aspects that have shaped how the translation process has evolved since I started, about two years ago.
As more and more computer users shift to Linux, localized interfaces are encouraging non-English speaking users to shed their fears and explore the world of information technology. With the expected growth in the number of users, translating desktops for localization has turned serious. Besides the mandatory spelling and grammatical corrections, there is a whole set of checks the translator has to perform, and one needs more than sheer instinct.
Translators generally start working with what is known as the ‘Portable Object Template’ (.pot) file, which contains all the user-interface messages from the application. The messages are marked in the application’s source files with calls to the gettext() function and extracted with the xgettext tool. The .pot files are available for download from the project repository (e.g. the GNOME CVS). The translators download the .pot files, translate the messages, save the file with a “.po” extension and submit it back to the repository with the necessary ChangeLog entries.
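To sketch how the marking works, here is a minimal Python illustration (the message string and function name are invented for the example): application code wraps user-visible strings in gettext’s _() so that extraction tools can find them; with no translation catalog installed, the original string simply passes through.

```python
import gettext

# Conventionally the translation function is bound to the name _ ;
# real applications also bind a text domain and locale directory.
_ = gettext.gettext

def greet():
    # This string would be extracted into the .pot template by xgettext.
    return _("Welcome to the desktop")

# With no .mo catalog installed, gettext returns the original English text.
print(greet())
```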
The typical format for the messages within the .po file is:
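A minimal illustrative entry might look like this (the comment lines and source reference are invented for the example; the msgstr is empty until the translator fills it in):

```po
# translator comment about context
#: src/ui.c:42
msgid "Open File"
msgstr ""
```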
The ‘msgid’ is the original text from the user interface and acts as the identifier, while ‘msgstr’ is the translated version of the message. The lines above them contain comments that translators sometimes use to figure out the context of the string being translated.
During the initial phases, the general tendency of translators (let us assume someone without any prior experience in UI translation) is to apply “literal” translation methods. This often leads to incomprehensible text. One of the main reasons for such semantic discrepancy is the change that regular English has undergone within information-technology parlance. For instance, hardly any corner of the world is devoid of the not-so-often-liked rodent, the “mouse”, and nearly all languages have a local name for it. Yet substituting the name of the hardware with the regular local name may lead to confusion, as the original term is often the one spontaneously associated with the computer apparatus. Quite a few languages, Tamil for example, have coined their own terms to standardize the translation and leave no room for confusion.
When I first started translating the GNOME UI for Bengali, the standard practice was to address user-to-computer messages with a low degree of salutation, whereas computer-to-user messages were more formal in tone. A friend who happened to take a casual look at the translations shook his head rather thoughtfully. What he proceeded to explain was that all messages are for the user’s perusal, hence one should stick to one particular form of salutation. It should be explained here that Bengali, like quite a few other Indian languages, has three levels of salutation: informal, semi-formal and formal. At the time I objected vehemently, but later, on the user interface, I found the discrepancies too stark to ignore. One could see the user being addressed differently at different times, giving a comic see-saw effect to the level of interaction.
Testing on the user interface, both for context and for the final localized result, is a handy tool. Widgets like buttons and labels, created to fit the original text, might get disrupted if the translated text does not fit into them. This leads to stretched-out or ill-fitting dialog boxes and windows. As the GNOME Human Interface Guidelines 2.0 mention, “at times translated text can expand up to 30% more for some languages”.
The keywords while translating a user interface are “comprehensibility” and “consistency”. The translation has to be comprehensible enough that the user can move around the interface with ease and choose the correct options to perform the desired actions. This is where translators have to move away from literal translation, retain identifiable technical keywords, coin new terms if necessary and maintain them across translations. To ensure that the user interface is not completely unfamiliar to users, I personally borrowed heavily from other frequently used electronic media, like the telephone and other multilingual public-service announcements. To facilitate translations, compendiums and glossaries are taking shape as standardized translations are settled upon and accepted within the language community.
Access keys, an integral part of the UI, give rise to yet another dilemma. This is an area where experiments ought to be kept to a minimum. Languages which use multiple alphabet conjuncts may face serious problems while choosing access keys. Secondly, localized access keys risk duplication within the same level of usage unless tested on the user interface, rendering them completely useless. For languages using a non-Romanized script it is generally considered prudent to retain the original access keys from the English interface.
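As a sketch of the duplication check, the following hypothetical Python helper scans GTK-style labels (where the character after an underscore is the access key) for clashes within one menu level; the labels and function names are invented for illustration:

```python
def mnemonic(label):
    """Return the access-key character of a GTK-style label ('_File' -> 'f')."""
    i = label.find("_")
    if i != -1 and i + 1 < len(label):
        return label[i + 1].lower()
    return None

def duplicated(labels):
    """Return the set of access keys claimed by more than one label."""
    seen, dupes = set(), set()
    for label in labels:
        key = mnemonic(label)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes

# Two menu items in the same level both claim 'f' after translation:
print(duplicated(["_File", "_Find", "_Edit"]))  # {'f'}
```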
The figure shows duplicated access keys in the same usage level.
Translators are often faced with several hurdles, one of which comes in the form of “variables”, which accept values based upon usage. For example, consider the text “Print %d pages”, where %d is the number of pages to be printed. Looks easy? Well, try this: “Print page %d of %d”, where the first %d is the page number to be printed and the second %d is the total number of pages to select from. The translator has to ensure that the sequence of occurrence of the %d’s in the translated text does not override the context. So if in a language the naturally translated text would require the sequence of the %d’s to be changed (mine does), then it’s time for some gymnastics. One has to bend and twist the language so that the sequence is never changed, as otherwise the wrong values would be populated into the variables when presented to the user, e.g. “Print page 10 of 2”.
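A small Python sketch of the hazard (the “translations” here are mock English word orders, not real Bengali): plain positional substitution breaks when the translation reorders the placeholders, while named placeholders, such as Python’s %(name)d, or positional markers like %1$d in C format strings, let translators reorder safely.

```python
template = "Print page %d of %d"
print(template % (2, 10))                # Print page 2 of 10

# A translation whose natural word order puts the total first:
swapped = "Of %d pages, print page %d"
print(swapped % (2, 10))                 # Of 2 pages, print page 10 -- wrong!

# Named placeholders carry their meaning, so order no longer matters:
safe = "Of %(total)d pages, print page %(page)d"
print(safe % {"page": 2, "total": 10})   # Of 10 pages, print page 2
```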
As the number of international applications rises, the requirement for fast and standardized translations has seen the introduction of automatic translation tools and scripts. These tools mostly reuse previously translated text or compendiums. Although they do take some load off the translators, context checks are required to fine-tune the generated results. Consider the string “Open File”: it might mean “open (a) file” or “(a) file that is open”. In such a case the automatic translator would either use the only available option, irrespective of its accuracy, or skip the string if it finds more than one existing translation to choose from.
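A toy Python sketch of that behaviour (the compendium contents, placeholders and function name are all invented for illustration): the tool reuses a translation only when it is unambiguous, and leaves ambiguous strings for a human.

```python
# Hypothetical compendium: one English msgid may map to several
# context-dependent translations (placeholders stand in for real Bengali text).
compendium = {
    "Open File": ["<imperative: open a file>", "<adjective: a file that is open>"],
    "Save": ["<imperative: save>"],
}

def auto_translate(msgid):
    candidates = compendium.get(msgid, [])
    if len(candidates) == 1:
        return candidates[0]   # safe to reuse the only known translation
    return None                # unknown or ambiguous: leave for a human translator

print(auto_translate("Save"))       # <imperative: save>
print(auto_translate("Open File"))  # None -- needs a context check
```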
As open-source applications developed by community volunteers mature into viable replacements for legacy products, one feels the need to bring a higher degree of professionalism to them. Leaving registered brand names and trademarks untouched during translation, working within release deadlines, developing glossaries and ready reckoners, bug reporting, and translation/file tracking within the community are some practices that might help streamline the process.

The greatest challenge I have been confronted with is the constant need to rediscover ‘more’ appropriate terminology and to ensure accuracy of context. With every new release the translated content has to be updated and, if necessary, fine-tuned. As mentioned earlier, the user interface needs to be checked to ensure the content is correct. This is of prime importance because a localized interface primarily targets new users, for whom the English interface is the major hindrance. These new users, unfamiliar with the interface, will have to find their way around with the help of the text provided by the translators. Compendiums, glossaries and other translation tools are necessary to strike a balance between automated work processes and manual rehashing while achieving standardized and easily identifiable interfaces.

It is often felt that translators need a high degree of technical understanding of the content. What I personally felt was the need to perceive the end-user’s perspective while being familiar with the content. Translation is an ongoing process and also a learning experience for the translators: at the end of the day, they become familiar with quite a few applications that they would otherwise never have come across in the normal course of work.