White Papers

Multilingual websites:

Benefits you can count on, headaches you can avoid

Robert Hopkins, Jr., President, Weblations

Copyright © Robert Hopkins, Jr., 1996 - 2002. Permission is granted to copy this article to another website, in its entirety only, provided that you acknowledge its source and provide a link to Weblations. For other uses, please contact the

Abstract

We discuss the benefits of multilingual websites, the problems they raise, and solutions to those problems. The article is relevant to both business and technical readers.

What is a multilingual website?

A multilingual website contains any mixture of:

  • global content, translated to many languages for worldwide use, and
  • local content, written directly in each language for the local market.

A typical mature corporate website uses both types of material.

Examples of global editorial content include: product information, technical support documents, tutorials, corporate profiles, worldwide branding messages and the design of the web itself. This material is applicable everywhere and is relatively insensitive to national or cultural differences. It usually accounts for most of the website.

Examples of local content include: the locally available products (often a subset of the international mix), local promotions, sales and advertising campaigns and local points of purchase indices. Done correctly, the most relevant and exciting editorial material can be local content. Just a few pages of up-front material can go a long way to convincing your audience that you belong to their culture.

Since a multilingual website can be published in a dozen or more languages, the question arises:

How can I create and control an editorial product that I don't understand?

To meet this need, a new class of web agency is appearing that specializes in the translation, reverse translation, revision maintenance and quality control of multilingual websites for corporations. These agencies can work directly for the corporate client or for the creative agency that the client selects to design the website. In this article I would like to summarize what we've learned since we founded Weblations to serve this market in spring 1996.

Unique opportunities

Let's begin by listing the benefits of investing in a multilingual website:

  1. Your corporate message can reach any number of new potential customers without printing or distribution costs. As a communications solution, it can be scaled to any audience size and projected to any location at no extra cost.

  2. You can create this message once, and manage it from one location, for consistency in branding, content and quality. The simplicity and centrality of this process means that it costs much less money and takes much less time to create and deliver a message to your audiences.

  3. Rapidly changing content can be published and revised in many languages simultaneously without the need to manage an inventory of printed material for global distribution. The information could be anything from sales material for a new product to technical support documentation for a new software release.

  4. If you already have a website in one language, then its content is already in digital form, organized for translation. It's a simple matter to deliver the files to a translation agency by email or FTP. This unique characteristic of websites reduces turnaround and logistical problems for you, the client, especially on highly interactive sites.

  5. The linguistic quality of a website translation can (and must) be excellent. It works like this: since your web is highly portable, it can be translated by professionals who live and work in their mother tongue countries. These translators have the best and most current command of their own languages. Native and resident translators must be used for leading edge webs - on culture and the Web itself, for example - since much of this vocabulary is new and changing.

Unique challenges

For all of their benefits, multilingual websites also bring difficulties. Below are some common problems with suggestions for avoiding them:

Problem: Translating the text in an HTML file

Ironically, the HTML file format itself is a formidable barrier to efficient, high quality translation. Adequate tools for the job simply do not exist on the open market. Although there is a thriving market for web page creation tools, these are poorly adapted to the needs of a translator, whose job it is to replace the original text with its translation, without damaging the surrounding HTML code or the design of the original web page. Three poor choices exist:

  1. WYSIWYG applications (FrontPage®, etc.). They often break a file in the mere act of opening and closing it.

  2. Non-interpreting applications (WebEdit®, etc.). They show the raw HTML code, exposing it to accidental damage and making it very difficult to read and translate the text for linguistic flow.

  3. Translation assistants (Trados®, etc.). Some now handle basic HTML files, but they are hindered by one or both of the problems of choices 1 and 2 above.

Solution: Website translation tools

The key to efficient, high quality and rapid web translation is the set of tools that the translators use. For example, at Weblations, our software parses even the most complex HTML-type files, scripts and databases, isolates the text strings and presents the text to professional translators in an optimum manner.

We parse these files using an application named Weblations Cypher®. It does for HTML what resource files do for software localizers - it extracts the text from the surrounding source code without interpreting or damaging the code in any way.

Any such HTML parsing application must be routinely updated to handle the latest innovations in HTML file formats. It must reliably find text strings, no matter where they are hidden. Some places to look for them are: behind images (alt text), captured in images that must be redesigned to be translated (GIFs, JPEGs), in animations, simply hidden (meta content, keywords, content descriptions), in scripts (e.g., Java, JavaScript, VBScript) and in database tables.
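As a rough illustration of what "finding the text strings" involves, here is a minimal sketch using Python's standard html.parser module. It is not how Weblations Cypher works, which this article does not describe. The class name, the function name and the choice of attributes to collect are assumptions made for the example. It covers only visible text, alt and title attributes, and meta descriptions and keywords. It does not handle text inside images, scripts or database tables.

```python
from html.parser import HTMLParser


class TextStringExtractor(HTMLParser):
    """Collect translatable strings from an HTML document (illustrative only)."""

    # Hypothetical selection of places to look; a real tool would cover more.
    TRANSLATABLE_ATTRS = ("alt", "title")
    TRANSLATABLE_META = ("description", "keywords")

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.strings = []      # list of (kind, text) pairs, in document order
        self._skip_depth = 0   # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        attr_map = dict(attrs)
        # Text hidden in attributes, such as the alt text behind an image.
        for name in self.TRANSLATABLE_ATTRS:
            value = attr_map.get(name)
            if value:
                self.strings.append((f"{tag}@{name}", value))
        # Text hidden in meta tags (descriptions and keywords).
        meta_name = (attr_map.get("name") or "").lower()
        if tag == "meta" and meta_name in self.TRANSLATABLE_META:
            content = attr_map.get("content")
            if content:
                self.strings.append((f"meta:{meta_name}", content))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Script and style bodies are code, not prose, so this sketch skips them.
        if text and not self._skip_depth:
            self.strings.append(("text", text))


def extract_strings(html):
    """Return the (kind, text) pairs found in an HTML string."""
    parser = TextStringExtractor()
    parser.feed(html)
    parser.close()
    return parser.strings
```

The extractor only reads the file. A production tool would also need to record where each string came from so that the translation can be written back without disturbing the surrounding code.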

After the resource files have been prepared, they need to be presented to the translators and editors in a front end application that has been optimized for human factors and that contains the usual suite of professional tools and utilities.

Some human factors to consider are ease of learning, compatibility with other popular word processing applications, ability to work with Asian character sets, screen design and keyboard interaction. Our solution, the Weblations Workspace®, speeds the learning process by running as an application within Microsoft Word 2000®. Anyone who knows how to use Word in any language is ready to translate the most complex web page. The screen design uses a clean, two-column format that is ideal both for comparing the original to the translated text, and for reading the translation for flow. At any time, the linguist can preview the actual web page in the computer's default browser.

Problem: Translating graphics

The graphics at your site that contain text (banners, buttons, titled photographs, etc.) must be translated consistently with the rest of the site, so that the titles in both text and graphics coincide.

Solution: Put translator and artist on same team

In most cases the best way to coordinate text and graphics is to subcontract both the graphics and the text to one full-service agency.

There is little rocket science to the graphics work, but to do it efficiently the artist needs to be in the loop with the translator so that word choices can be optimized for the space available, spelling errors found and corrected, and so on.

If you would like your own artist to do the graphics, then be sure that your agency supplies you with a crystal clear document listing the translation for each image, based on the assumption that your artist doesn't speak the target language.

Problem: Updating the translation

The best webs change frequently and are often created by different contributors working in scattered geographies. Usually, no one is responsible for tracking changes to the editorial content. As a result, no one knows when or what has changed, nor even who changed it.

When a website is revised, it is common for part but not all of an HTML file to change. It is important to preserve the still valid translation of the unchanged text, both for editorial consistency and to avoid paying for the same translation many times over.

For these reasons, keeping the translations up to date with the source language content can become an expensive logistical nightmare.

Solution: Versioning and Translation Memory

The solution is to outsource the entire problem of multilingual site maintenance to a translation agency equipped to handle it.

Maintaining a website is similar to maintaining software under development. Many of the same techniques can be used. Your translation agency should demonstrate that it understands these techniques in general, and that it applies them to websites in particular. There are two key features to ask about:

First, your agency's site maintenance system should be designed to run without any cooperation or input from the people who manage the site. This gives you the maximum freedom to invite contributions and participation from anyone - employees, outside agencies, customers and the general public - without the burden of monitoring what they are doing. It falls to your translation agency to version your site, that is, to tell you on a periodic basis which pages have been changed, added or deleted. Your role is to read the version report and decide which pages should be translated, and into which languages.
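One simple way to produce such a version report, sketched here under assumptions of our own, is to keep a digest of every page from the previous visit and compare it with a fresh snapshot. The function names and the dictionary layout are hypothetical, and a real system would fetch the pages from the live site rather than take them as strings.

```python
import hashlib


def snapshot(pages):
    """Map each page path to a SHA-256 digest of its content.

    `pages` is a dict of {path: page text}; how the pages are fetched
    is outside the scope of this sketch.
    """
    return {
        path: hashlib.sha256(content.encode("utf-8")).hexdigest()
        for path, content in pages.items()
    }


def version_report(old_snapshot, new_snapshot):
    """List the pages added, deleted or changed between two snapshots."""
    old_paths, new_paths = set(old_snapshot), set(new_snapshot)
    return {
        "added": sorted(new_paths - old_paths),
        "deleted": sorted(old_paths - new_paths),
        "changed": sorted(
            path
            for path in old_paths & new_paths
            if old_snapshot[path] != new_snapshot[path]
        ),
    }
```

Because the comparison needs nothing but the published pages, it can run without any input from the people who edit the site.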

Second, the translation agency should use translation memory to take advantage of all the text that has already been translated. Translation memory is a class of software that simply remembers everything that a translator has done and suggests the stored translation where appropriate. These applications can be complex and costly to install, and require discipline to use, but for large websites, the payoffs are worth it:

  • you save time to publication by automating the repeated portion of the translation,
  • once translated, a given section remains the same, with no drift in word choice, and
  • you should be able to realize substantial cost savings on the portion of the website that is processed by translation memory. This is called paying on a delta word basis. It means that if you have a page in which you change only one paragraph each week, your agency should be able to identify, translate and charge you for that paragraph alone, not for the entire page.
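To make the delta word idea concrete, here is a minimal sketch of an exact-match translation memory. It is an illustration under stated assumptions, not a description of any commercial product: the function names are hypothetical, the sentence splitting is deliberately naive, and real translation memory tools also offer approximate ("fuzzy") matches, which this sketch does not attempt.

```python
import re


def split_segments(text):
    """Split text into sentence-like segments (naive: break after . ! or ?)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def apply_memory(text, memory):
    """Reuse stored translations and count the words that still need work.

    `memory` maps source segments to their stored translations.
    Returns (translated, pending, delta_words), where `translated` holds
    None for each segment that has no stored translation yet.
    """
    translated, pending = [], []
    for segment in split_segments(text):
        if segment in memory:
            translated.append(memory[segment])
        else:
            translated.append(None)
            pending.append(segment)
    # Only the pending segments would be billed on a delta word basis.
    delta_words = sum(len(segment.split()) for segment in pending)
    return translated, pending, delta_words
```

For a page whose first sentence is already in the memory and whose second sentence is new, only the words of the second sentence are counted.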

Problem: Translating databases

If all or part of your website is stored in a database, then the agency that translates it will need to be familiar with multilingual databases and the Web! If not, the database that helps you maintain your site in one language could become a complicating factor in other languages.

Solution: Choose a localization agency that knows databases

For starters, your agency's project manager should interview you to learn how the material gets into the database. Did you build a browser-based front end for editorial contributors? Did you buy a third party content management application that uses a database in the background? Did your webmasters write scripts to import/export data from other formats? The answers to these questions and others will help to design a workflow both for the initial translation and its ongoing maintenance.

Next, your database schema, character encodings and data types need to be reviewed and perhaps adjusted to comply with the new multilingual demands.
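As one possible example of such an adjustment, and only as a sketch, the translatable text can be moved out of the main table into a companion table keyed by language, with a fallback to the source language when a translation is missing. The table names, column names and the choice of SQLite are assumptions made for illustration. Your own schema and database engine will differ, and the right design depends on the answers to the questions above.

```python
import sqlite3


def create_catalog(conn):
    """Create a hypothetical product table plus a per-language text table."""
    conn.executescript(
        """
        CREATE TABLE product (
            id  INTEGER PRIMARY KEY,
            sku TEXT NOT NULL UNIQUE
        );
        CREATE TABLE product_text (
            product_id INTEGER NOT NULL REFERENCES product(id),
            lang       TEXT NOT NULL,   -- language code such as 'en' or 'ja'
            name       TEXT NOT NULL,   -- stored as Unicode text
            PRIMARY KEY (product_id, lang)
        );
        """
    )


def product_name(conn, sku, lang, fallback="en"):
    """Return the product name in `lang`, else in `fallback`, else None."""
    row = conn.execute(
        """
        SELECT t.name
        FROM product AS p
        JOIN product_text AS t ON t.product_id = p.id
        WHERE p.sku = ? AND t.lang IN (?, ?)
        ORDER BY (t.lang = ?) DESC   -- prefer the requested language
        LIMIT 1
        """,
        (sku, lang, fallback, lang),
    ).fetchone()
    return row[0] if row else None
```

Keeping one row per language means a new language adds rows rather than columns, so the schema does not have to change each time the site expands to another market.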

Finally, jumping ahead to quality control of the initial translation, you should stage the database translation in a password-protected area before going live, so that your localization agency can see its work as completed web pages, not as the scattered components that appear in the tables and templates of your server. An interactive, application-type web site is difficult to translate partly because the context for text strings is only evident when they are combined by the server application in real time. Hence the need for server-based quality control.

Weblations Cypher and Weblations Workspace are registered trademarks of Weblations, S.L.
Microsoft FrontPage and Word are registered trademarks of Microsoft Corporation.
WebEdit is a registered trademark of Nesbitt Software.
Trados is a registered trademark of Trados GmbH.


