<?xml version='1.0' encoding='UTF-8'?><rss xmlns:atom='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' version='2.0'><channel><atom:id>tag:blogger.com,1999:blog-10854917</atom:id><lastBuildDate>Tue, 28 Nov 2006 12:58:53 +0000</lastBuildDate><title>Weblations knowledge bits</title><description></description><link>http://www.weblations.com/eng/blog/index.html</link><managingEditor>Robert</managingEditor><generator>Blogger</generator><openSearch:totalResults>7</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-10854917.post-110865138845503444</guid><pubDate>Fri, 25 Feb 2005 18:41:00 +0000</pubDate><atom:updated>2005-04-04T10:48:46.893+01:00</atom:updated><title>Sniff your web</title><description>&lt;span style="font-family:Arial;"&gt;Before you publish any web content in a fancy character set, be sure to sniff your web. No, it shouldn't smell. Sniff for your web server's HTTP response header.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;The problem&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;If there is a clash between the HTTP header and the character set of the page you are publishing, most browsers will try to render the page according to the HTTP header. The result can be gibberish.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;img src="http://www.weblations.com/eng/blog/img/gibberish.gif" /&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Here for example is a Japanese page, encoded in UTF-8, being delivered by a server that claims it is encoded in ISO-8859-1, the encoding for standard Western European languages. 
Not a pretty site!&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;img src="http://www.weblations.com/eng/blog/img/japanese.gif" /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;We were hoping to see something like this. What happened?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Background&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;When a browser renders a web page, it needs to know what character set the page is encoded in. There are three ways it can find out:&lt;/span&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;Read the META tag in the header of the file,&lt;/span&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;study the text on the page and guess, or&lt;/span&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;read the web server's HTTP response header.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-family:Arial;"&gt;Before going any further, let's be clear: the &lt;em&gt;file header&lt;/em&gt; mentioned in method 1 is the stuff between the &amp;lt;HEAD&amp;gt; and &amp;lt;/HEAD&amp;gt; tags at the beginning of an HTML file. The &lt;em&gt;HTTP response header&lt;/em&gt; of method 3 is the data that the server sends to your browser before it begins to send the HTML file. You can't see it without the aid of a web sniffer.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Method 3 is the least widely known and the most powerful. It's the gotcha method. &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;The &amp;lt;META&amp;gt; tag of the first method is the most popular, with good reason, because it allows you to specify (and change) your encoding from one file to the next. 
Here is the meta tag for UTF-8:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;font-size:85%;"&gt;&amp;lt;meta http-equiv="Content-Type" content="text/html;charset=utf-8"&amp;gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;I copied this tag from the header of the Japanese page above. Unfortunately, its server was singing a different tune.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:Arial;"&gt;Web sniffing&lt;/span&gt;&lt;br /&gt;&lt;/strong&gt;&lt;span style="font-family:Arial;"&gt;Here's where web sniffing comes in. Go to &lt;a href="http://www.web-sniffer.net" target="_blank"&gt;http://www.web-sniffer.net&lt;/a&gt;, enter the URL of the website you are sniffing, and look at the results...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.weblations.com/eng/blog/img/websniffer.gif" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;The culprit is in the last line: "charset=ISO-8859-1". That's a direct conflict with the UTF-8 encoding that we actually intended. &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Solution&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Should you run down the hall to tell your webmaster to change the HTTP header to your new encoding? No! To do so would render your pages correctly while breaking those of everyone else.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;If your webmaster is extremely kind and patient (possibly True) and has a lot of extra time (definitely False), you could ask him or her to set a special encoding for your folders. But why bother? The easier and better practice is not to specify any charset at all in the HTTP header. 
It should say simply "text/html", so that the encoding of each page can vary according to the capricho of its author.&lt;/span&gt;</description><link>http://www.weblations.com/eng/blog/2005/02/sniff-your-web.html</link><author>Robert</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-10854917.post-110865126888366194</guid><pubDate>Wed, 02 Feb 2005 09:40:00 +0000</pubDate><atom:updated>2005-02-25T17:46:13.333Z</atom:updated><title>ClearType reduces eye fatigue</title><description>&lt;span style="font-family:Arial;"&gt;Ten years ago, in the epoch of flickering 640 x 480 screens, conscientious proofreaders would print out their texts and work on paper. Errors that were invisible on the screen seemed to jump out from the page. But printing involved an extra step. It was tempting to be lazy.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Now, screens have improved. Most people review text on the screen directly to save time and benefit from improved tools for spell checking, revision tracking, glossaries and so on. But eye fatigue is still a factor, and may have suddenly worsened thanks to flat LCD screens&lt;/span&gt;&lt;span style="font-family:Arial;"&gt;, which sharpen the edges around each pixel and render type with a choppier look than did the old CRT screens.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;To solve all that, Microsoft has released a great little tool especially useful for flat screens, the &lt;strong&gt;ClearType Tuner&lt;/strong&gt;. According to the &lt;a href="http://www.microsoft.com/windowsxp/downloads/powertoys/xppowertoys.mspx" target="_blank"&gt;download page&lt;/a&gt;: "This PowerToy lets you use ClearType technology to make it easier to read text on your screen, and installs in the Control Panel for easy access." 
(Windows XP only.)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Give it a try. The result is frankly luxurious, a whole lot easier on the eyes. You'll feel like you paid an extra couple of hundred bucks for your display.&lt;/span&gt;&lt;img src="http://www.weblations.com/eng/blog/img/ct.gif" /&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Here's an example with ClearType on the left and standard type on the right. &lt;/span&gt;</description><link>http://www.weblations.com/eng/blog/2005/02/cleartype-reduces-eye-fatigue.html</link><author>Robert</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-10854917.post-110900939551555557</guid><pubDate>Wed, 16 Feb 2005 18:07:00 +0000</pubDate><atom:updated>2005-02-22T17:52:58.756Z</atom:updated><title>Ditch the RFP, and other translation outsourcing t...</title><description>&lt;span style="font-family:arial;"&gt;A lot of people who need to buy a website translation have never bought one before. If you fall into this category, read this posting. It includes advice that can make you a tougher and more informed buyer, written by someone (me) who has written or reviewed more than 1,600 translation proposals over the last few years.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Insist on an itemized quote.&lt;/strong&gt; Ask each potential vendor to itemize and price every service that they will deliver. You should be satisfied that you really understand what you are buying, and what factors affect each line item. 
For example, some services are priced according to the word count while others vary according to the technology that your website uses.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Understand the word count.&lt;/strong&gt; The basic variable in a website translation proposal is, or should be, the word count. It takes time to translate words carefully. One way or the other, you're going to have to pay for that time. But you shouldn't have to pay as much to translate repeated strings of text -- a very common characteristic of websites. In a transparently written proposal, both the raw word count and the corrected word count after subtracting some portion of the repetitions will be clearly displayed.&lt;br /&gt;&lt;br /&gt;Once you understand the word count, insist that, if the word count of your project varies between the bidding and production phases, the final price vary proportionately, either up or down.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Negotiate your maintenance agreement up front. &lt;/strong&gt;Maintaining a multilingual website can cost more than translating it in the first place. If you plan to maintain your site for the long haul, be sure to discuss the options and negotiate the price up front while you still have the negotiating leverage. A common pricing strategy (to use a euphemism for "trick") is to buy your account by underbidding on the initial project, then ratcheting up the fees later when you are locked in.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Don't get locked in. &lt;/strong&gt;Speaking of which, you should be aware of two fundamental ways that a translation service provider can gain leverage as your incumbent. Neither strategy is an adversarial trick. 
On the contrary, when used with your interests in mind, they can be beneficial to you.&lt;br /&gt;&lt;br /&gt;The first power factor in the client-incumbent relationship is translation memory (TM), a software technology that sophisticated translators use to remember and reuse everything that they have translated for you. If your website contains a lot of repeated descriptions of similar products and such, then with time, your incumbent translator will gain a pricing advantage over all competitors, making it prohibitively expensive for you to change. If the pricing advantage is passed on to you, then fine, but if it is used to increase your vendor's profit margin and tactical pricing ability, then you lose. The solution? I'll discuss it in a separate posting.&lt;br /&gt;&lt;br /&gt;The second power factor is any kind of multilingual content management software that the vendor sells and installs on your server. These systems are also known as globalization management systems (GMS). Granted, certain sections of most websites can benefit from structured, multilingual content management. But not &lt;em&gt;all&lt;/em&gt; of the content on &lt;em&gt;all&lt;/em&gt; websites, and not if the license costs &lt;em&gt;all&lt;/em&gt; of the money in the world. The downside to these systems, in addition to their cost and complexity, is the leverage that they give to the system vendor.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Ask for the name of, and interview&lt;/strong&gt;&lt;/span&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;, your project manager.&lt;/strong&gt; Your most important contact at the vendor is the project manager that they assign to your account. If you are being sold to by an account representative, respect the work that he or she is doing for you during the selling process, but ask to speak to the project manager. The P.M. 
will take the reins, build your project team, set the calendar, handle your inevitable change order requests, and manage the quality control process and all the file transfers. If you like the qualifications, profile, experience and personality of the designated project manager, you will be in good hands.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Ask for the names, résumés and work locations of your linguists.&lt;/strong&gt; Most translation vendors justifiably make a lot of noise about using native, professional linguists who are resident in the target language country. They may also claim to use more than just a single translator, and thus differentiate themselves as an agency from a less expensive freelance translator.&lt;br /&gt;&lt;br /&gt;Linguists are the heart and quality of the service that you are purchasing. So it makes sense to have the vendors show you the résumés of your linguists. They should also be able to show that your documents have been edited, reviewed or proofread by someone in addition to the original translator -- if that is what they claim to do.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Compare apples to apples.&lt;/strong&gt; Price and service comparisons for a complex solution have to be made with care. A website translation can range greatly in completeness, or lack of it. At one extreme are the agencies that simply translate a Word document for you and leave it to you to paste the translation into your HTML code, database or whatever. At the other extreme are agencies (like mine) that offer a lot of very helpful consulting during the planning phase, work on a turnkey basis with native assets and do a conscientious and systematic job of quality control. 
The price for a complete service can be higher while the true cost to you is significantly lower.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Ditch the RFP.&lt;/strong&gt; Last but not least, if you are about to write a Request for Proposal (RFP), think better of it. Did you know that fewer than 10% of all RFPs for website translations result in actual projects? The very fact that someone in your organization is asking for an RFP should be a warning to you that the project is going to be studied and tabled, forever. Save yourself the trouble. If you are forced to write an RFP, let me know; I could be persuaded to blog about what every RFP should and should not contain.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;</description><link>http://www.weblations.com/eng/blog/2005/02/ditch-rfp-and-other-translation.html</link><author>Robert</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-10854917.post-110848761810647457</guid><pubDate>Sat, 01 Jan 2005 17:10:00 +0000</pubDate><atom:updated>2005-02-17T14:12:30.200Z</atom:updated><title>Welcome to Knowledge Bits</title><description>&lt;span style="font-family:arial;"&gt;I'm Robert Hopkins, the president of Weblations. Since early 1996, when I founded Weblations in Barcelona, my colleagues and I have been translating and localizing websites -- some 600 projects in all. We've learned a lot over the years, and developed some opinions as well. In this blog, we'll share some of the knowledge bits that we've picked up along the way. If you are running your own translation project, we hope that they make your life easier. 
If you're looking for an agency to outsource your project to, this blog will give you an insider's look at the issues we face on a daily basis, and hence the kind of skills you should look for.&lt;br /&gt;&lt;br /&gt;To translate a website, you and your team need to know about culture and languages, copywriting, the technologies used by the website that you are translating, graphic design and the tools that you use to do your work. In our case, when we started the company, there were no adequate tools to assist a professional, high volume website translation team, so we decided to develop our own. Fortunately, the Web wasn't very complicated back then, and we were able to put something together in relatively short order. Since then, as the Web has grown richer, we have enriched our two core applications, Weblations Cypher and the Weblations Workspace. You'll hear a lot about them in this blog, I bet, since developing them takes up a big part of my working day.&lt;br /&gt;&lt;br /&gt;I didn't really start this blog on New Year's Day, but I will pretend I did to give me some breathing room and to remind me that this is a New Year's resolution to be honored.&lt;br /&gt;&lt;br /&gt;Enough of introductions...on to the blog!&lt;/span&gt;</description><link>http://www.weblations.com/eng/blog/2005/01/welcome-to-knowledge-bits.html</link><author>Robert</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-10854917.post-110857824956001409</guid><pubDate>Tue, 11 Jan 2005 16:31:00 +0000</pubDate><atom:updated>2005-02-16T19:30:39.066Z</atom:updated><title>BOM -- Huh?</title><description>&lt;span style="font-family:Arial;"&gt;Every now and then, we wake up to an ugly technical surprise, usually just before we're supposed to deliver a project to a client. 
One of the more memorable ones that has had a lot of repercussions was the day we met our first BOM.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:arial;"&gt;What's a BOM?&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;I'm embarrassed to admit it now, but there was a time when I didn't know what a BOM was. Had I Googled on it, I would have seen it just a few entries below the &lt;a href="http://www.bom.gov.au/"&gt;Australian Bureau of Meteorology&lt;/a&gt;, but Google isn't much help when you don't know the name of what you're looking for. (Aside: why did the intelligent scientists at this bureau incorrectly elevate the status of "of" to the "O" in BOM? Is it that they don't want to be known as the "BM?")&lt;/span&gt;&lt;br /&gt;&lt;p&gt;&lt;span style="font-family:Arial;"&gt;According to the &lt;a href="http://www.unicode.org/faq/utf_bom.html#BOM"&gt;BOM FAQ on the Unicode home page&lt;/a&gt;, BOM is short for &lt;em&gt;byte order mark&lt;/em&gt;, three fancy characters that you might find at the beginning of a UTF-8 file. (UTF-8 is a compact but universal character set that we use a lot for multilingual websites. This blog is published in UTF-8. More about that in another posting.) Ironically, UTF-8 comes in only one order, so a BOM doesn't matter in the way it does with full-bore Unicode.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;The problem with the BOM is that some text editors interpret and hide the characters. You don't know that they're there. Here's what they look like: &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;ï»¿&lt;/span&gt;&lt;/p&gt;&lt;span style="font-family:arial;"&gt;These babies mean "You are about to see some UTF-8!" Are they necessary? Usually not, because a well-formed web page should make that announcement anyway with a META tag. Are they dangerous? 
You bet!&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;Notepad for Windows XP and the BOM&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;If you're like us and you want to review and approve every single byte that you deliver to a customer, an unexpected BOM can play havoc with your plans, with your own and your tools' concept of how long the file is, and so on. &lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;We learned about the BOM at the same time that we were upgrading to Windows XP from Win98. One of the many gotchas was seeing our humble Win98 Notepad replaced by Notepad for Windows XP, a much more take-charge program. The old Notepad shows you the BOM characters, while the new one does not! With the new Notepad, if you save a file and select UTF-8 as the encoding in the Save As dialog box, the BOM will be prepended to your file, an unnecessary but seemingly harmless little detail.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;But is it really that harmless?&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Let's look closely at the &lt;a href="http://www.unicode.org/faq/utf_bom.html#BOM"&gt;definition of a BOM&lt;/a&gt;: &lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;&lt;strong&gt;BOM.&lt;/strong&gt; The character code U+FEFF at the beginning of a data stream, where it can be used as a signature defining the byte order and encoding form, primarily of unmarked plaintext files.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;The operative term is "beginning." 
How can we be sure that a given text file that has been saved in Notepad will be the beginning of a data stream? It may be the footer of a web page, deployed with an include statement to the middle of another file. Now, unfortunately, our BOM has gone from the beginning of one file to the middle of another, and in that position it can do a lot of damage.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:Arial;"&gt;Another ugly place for a BOM to land is in a database entry, where it can ruin your best-laid plans for running exact searches and concatenating strings.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:arial;"&gt;Forewarned is forearmed. Hope this helps.&lt;/span&gt;&lt;br /&gt;&lt;/p&gt;</description><link>http://www.weblations.com/eng/blog/2005/01/bom-huh.html</link><author>Robert</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-10854917.post-110858201778649866</guid><pubDate>Sun, 16 Jan 2005 15:31:00 +0000</pubDate><atom:updated>2005-02-16T19:26:57.813Z</atom:updated><title>Naming files in a multilingual site</title><description>&lt;div align="left"&gt;&lt;span style="font-family:Arial;"&gt;If you plan on translating your website into several languages, one of your first issues will be devising a naming scheme for the translated files and folders. I have some advice for you.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Don't change the file names! Don't change the folder names! &lt;/span&gt;&lt;/div&gt;&lt;div align="left"&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt; &lt;/div&gt;&lt;div align="left"&gt;&lt;span style="font-family:Arial;"&gt;Instead, put the translated set of files in a new parent folder named for the target language, &lt;em&gt;e.g.&lt;/em&gt;, &lt;strong&gt;/ger&lt;/strong&gt; for German. All the original files and folders (now translated to German, of course) should branch out from here. 
That way, all of your relative links from file to file should work without modification.&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;div align="center"&gt;&lt;br /&gt;&lt;img src="http://www.weblations.com/eng/blog/img/folders.gif" /&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;The above illustration shows a well-structured trilingual website in English, German and Spanish. The site was originally designed in English to contain three folders: &lt;strong&gt;/images&lt;/strong&gt;, &lt;strong&gt;/press&lt;/strong&gt; and &lt;strong&gt;/products&lt;/strong&gt;. At the time, no one thought the site would be translated, so a folder for English was never contemplated, but that's OK.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Later, German and Spanish were added under the folders &lt;strong&gt;/ger&lt;/strong&gt; and &lt;strong&gt;/spa&lt;/strong&gt;. Below these language folders, the structure of the original site is replicated. The names of the HTML and image files remain exactly the same so that the relative links don't need to be touched. That's it.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;PDF files&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;I wrote this very basic posting to mention an exception that proves the rule: PDF file names. PDF files should incorporate their language in the file name itself, &lt;em&gt;e.g.&lt;/em&gt;, &lt;strong&gt;brochure_eng.pdf&lt;/strong&gt; and &lt;strong&gt;brochure_spa.pdf&lt;/strong&gt;. Why? Because PDF files have a life beyond the website where they are stored, as attachments to emails and as loose files on someone's hard disk. The file name should show the language that the file is written in. 
Furthermore, if the brochure is written in three languages under the same name and you want to attach all three versions to the same email message, you'll have a nasty little naming conflict.&lt;/span&gt;</description><link>http://www.weblations.com/eng/blog/2005/01/naming-files-in-multilingual-site.html</link><author>Robert</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-10854917.post-110849179334207073</guid><pubDate>Wed, 05 Jan 2005 17:59:00 +0000</pubDate><atom:updated>2005-02-16T16:21:47.720Z</atom:updated><title>If you want your website to sound like a robot...</title><description>&lt;span style="font-family:Arial;"&gt;...then use a machine to translate it!&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;You can use software to translate any kind of text. There are several sites that will do it for free. Perhaps the best are run by &lt;a href="http://www.systransoft.com/"&gt;Systran&lt;/a&gt; and &lt;a href="http://www.google.com/language_tools?hl=en"&gt;Google&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;As a demonstration of the state of the art of computational science, as a personal tool or as a novelty item, machine translation is wonderful, but the quality of the result isn't reliable enough to use for production purposes. It's one thing for an end user to say, "I need to understand the gist of this article about my company in a German newspaper." It's another thing altogether for a company to say, "Let's use a machine to translate our web site." In the former case, the user knowingly takes the risk of reading a botched or awkward translation. In the latter, users read a corporate website, unaware of the circumstances behind its translation. 
Their critical defenses are lowered, and they're vulnerable to any kind of misinterpretation.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;How good is machine translation?&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Machine translation is surprisingly good, but in the end it's not good enough.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Look at what Systran's engine does to the following sentence when it is translated from English to Swedish and (for those of us who don't speak Swedish) back to English. I chose Swedish because it was at the top of their list.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.weblations.com/eng/blog/img/Systran.gif" border="1" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;strong&gt;English original:&lt;/strong&gt; I took my daughter for a walk in the park.&lt;br /&gt;&lt;strong&gt;Swedish translation:&lt;/strong&gt; Jag tog den min dottern för en gå i parkera.&lt;br /&gt;&lt;strong&gt;English back translation:&lt;/strong&gt; I took that my daughter for a to go in parking.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;Admittedly, this is not a fair test, because the back translation tends to magnify the errors in the first translation. 
The translation to Swedish is probably only half as bad as the back translation to English, but half of horrible is not good enough.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="font-family:Arial;"&gt;What good is machine translation?&lt;/span&gt; &lt;/strong&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;We've already described the most popular use of MT -- to allow end users to selectively and instantly understand the gist of an otherwise unintelligible document, especially on the Web, where the content and the tool are both simultaneously available. But this is largely an underground activity. It can't be controlled by the publisher of a website and it doesn't appear on any ledger as a cost saving.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;MT could save some serious money for a web publisher in one discrete area: the translation of a corporate knowledge base (KB). KBs cost a lot to create and maintain in English, and are rarely translated to other languages. They utilize a constrained terminology and are written by a small group of expert authors who are trained to write in a precise and clear way. It wouldn't take much extra effort to condition their writing to be translated by a machine translation program armed with the corporate terminology.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;The resulting translated KB should carry an armor-plated disclaimer to the effect that the translation is not guaranteed to be any good at all, and should allow the user to read the original English article. Why not give it a try? 
Many of the non-American readers of KBs are technically trained professionals who are nominally bilingual of necessity, but would find it much easier to scan several articles in their native language before finding and studying the one that matters.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:Arial;"&gt;That's all for now. I have to take my daughter for a to go in parking.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;/p&gt;</description><link>http://www.weblations.com/eng/blog/2005/01/if-you-want-your-website-to-sound-like.html</link><author>Robert</author></item></channel></rss>
