Wednesday, January 05, 2005

If you want your website to sound like a robot...

...then use a machine to translate it!

You can use software to translate any kind of text. There are several sites that will do it for free. Perhaps the best are run by Systran and Google.

As a demonstration of the state of the art in computational science, as a personal tool, or as a novelty item, machine translation is wonderful, but the quality of the result isn't reliable enough for production use. It's one thing for an end user to say, "I need to understand the gist of this article about my company in a German newspaper." It's another thing altogether for a company to say, "Let's use a machine to translate our website." In the former case, the user knowingly takes the risk of reading a botched or awkward translation. In the latter, users read a corporate website unaware of the circumstances behind its translation. Their critical defenses are lowered, and they're vulnerable to any kind of misinterpretation.

How good is machine translation?
Machine translation is surprisingly good, but in the end it's not good enough.

Look what Systran's engine does to the following sentence when it is translated from English to Swedish and (for those of us who don't speak Swedish) back to English. I chose Swedish because it was at the top of their list.

English original: I took my daughter for a walk in the park.
Swedish translation: Jag tog den min dottern för en gå i parkera.
English back translation: I took that my daughter for a to go in parking.


Admittedly, this is not a fair test, because the back translation tends to magnify the errors in the first translation. The translation to Swedish is probably only half as bad as the back translation to English, but half of horrible is not good enough.
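
For anyone who wants to put a rough number on a round trip like this one, here is a minimal sketch in Python. It compares the original sentence with the back translation word by word, using the standard library's difflib. The function name word_similarity is my own invention for illustration, and the score is only a crude measure of word overlap, not a real measure of translation quality. The two sentences are hard-coded from the example above; nothing here calls a translation service.

```python
from difflib import SequenceMatcher


def word_similarity(original: str, back_translation: str) -> float:
    """Return a 0..1 ratio of how closely two sentences match, word by word.

    Hypothetical helper for illustration: it lowercases both sentences,
    drops a trailing period, and compares the resulting word lists.
    """
    a = original.lower().rstrip(".").split()
    b = back_translation.lower().rstrip(".").split()
    return SequenceMatcher(None, a, b).ratio()


# Sentences copied from the Systran round trip shown above.
original = "I took my daughter for a walk in the park."
back = "I took that my daughter for a to go in parking."

score = word_similarity(original, back)
print(f"Word-level similarity after the round trip: {score:.2f}")
```

A score of 1.0 would mean the sentence survived the round trip word for word. Anything lower only says that words were lost or changed, not whether the meaning survived, so treat it as a smoke test at best.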

What good is machine translation?
We've already described the most popular use of MT -- to allow end users to selectively and instantly understand the gist of an otherwise unintelligible document, especially on the Web, where the content and the tool are both simultaneously available. But this is largely an underground activity. It can't be controlled by the publisher of a website and it doesn't appear on any ledger as a cost savings.

MT could save some serious money for a web publisher in one discrete area: the translation of a corporate knowledge base (KB). KBs cost a lot to create and maintain in English, and are rarely translated into other languages. They use a constrained terminology and are written by a small group of expert authors who are trained to write precisely and clearly. It wouldn't take much extra effort to condition their writing for translation by a machine translation program armed with the corporate terminology.

The resulting translated KB should carry an armor-plated disclaimer to the effect that the translation is not guaranteed to be any good at all, and should allow the user to read the original English article. Why not give it a try? Many of the non-American readers of KBs are technically trained professionals who are nominally bilingual of necessity, but would find it much easier to scan several articles in their native language before finding and studying the one that matters.

That's all for now. I have to take my daughter for a to go in parking.
