TAUS - Enabling better translation

Tuesday
Feb 07th

Imagine you are a small language service provider



Eat or be eaten

Imagine you are a small language service provider (LSP), one of the thousands of translation agencies listed in the Yellow Pages of the world. Your company has reached a kind of midlife. Business is tough. You exist because of the words you sell, but your word rates come under more pressure every year. You’d like to think that you are an entrepreneur: that you are free to make choices. But what choices do you really have? Margins are being squeezed. Machine translation is suddenly the ‘talk of the industry’, and non-professionals (“crowdsourcers”) are willing to compete for your jobs, at least in some industries.

You feel trapped. Your biggest customers are the large international translation companies, even though you may have some good direct clients or accounts. Large Multi-Language Vendors (MLVs) can be the meanest when it comes to rates and payments. You would like to dump them, but you can’t, because they represent the lion’s share of your revenues. Where do you go from here? Does anyone want to buy your business? Not very likely, unless it is that one very large customer. On a sunny day, you decide to be brave. You take control of your own destiny.

This is what Manuel Herranz did in 2005. He could have gone under. After eight years of hard labor as the European representative for a Japanese language company, he staked his fortunes on a management buy-out to turn the company around. The HQ had gone into bankruptcy, and it was time to pack and leave, or to turn this misery into something positive.

What many others in his position feared, machine translation, intrigued Manuel. He had studied philology in Valencia, specializing in Latin, Greek, German, English and Catalan, and then gone on to Manchester to study mechanical engineering. He was destined for a life as a busy technical translator. And it was enjoyable in the good years as a language consultant for companies like Ford and Rolls-Royce. It was never meant to be a struggle to stay alive. Perhaps it was the background in mechanical engineering that opened his eyes to the bright side of the translation industry.

In 2005 he took over B.I. Europa and started working out a plan. He attended conferences and took part in discussions about establishing an industry association for sharing language data. He studied the different approaches to machine translation (MT) technology. He began collaborating with the Polytechnic University of Valencia. And then he made up his mind: not simply to use MT, but to produce MT systems as well. So in 2007 he renamed his company Pangeanic, and the following year Pangeanic became one of the smallest founding members of the TAUS Data Association alongside giants like SDL, Lionbridge, Intel and Oracle.

Translation companies ten or a hundred times bigger than Pangeanic were not even dreaming of building their own MT engines, let alone actively considering it. Why would they? Emotionally they would much rather keep MT outside the corporate door for as long as possible. They had also heard all the horror stories about the failures, the costs and the bad quality. But Manuel didn’t hesitate for a second: controlling your own destiny in translation clearly means mastering the technology.

Together with his partners at the Polytechnic University of Valencia, Pangeanic began to look closely at the Moses open source statistical MT technology. The scarce translation data they managed to find in the early stages were used to test the various components and understand the bottlenecks. Other challenges came in different shapes: it was clear that a parser was needed to filter out the tags and meet industry needs. Furthermore, plain-text MT output is not always good enough when deliverables must comply with proprietary formats.
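The tag-filtering step can be pictured with a small sketch. It is not Pangeanic's parser, which the article does not describe; it only assumes that inline tags look like `<...>` and that a placeholder such as `__TAG0__` (a name invented here) passes through the MT engine untouched.

```python
import re

# Assumption: inline formatting tags look like <b>, </b>, <ph id="1"/>, etc.
TAG_RE = re.compile(r"<[^>]+>")

def mask_tags(segment):
    """Replace each inline tag with a numbered placeholder.

    Returns the masked text and the list of original tags, in order.
    """
    tags = []

    def _swap(match):
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_RE.sub(_swap, segment), tags

def unmask_tags(text, tags):
    """Put the original tags back in place of their placeholders."""
    for index, tag in enumerate(tags):
        text = text.replace(f"__TAG{index}__", tag)
    return text

masked, tags = mask_tags("Press <b>Start</b> to begin.")
restored = unmask_tags(masked, tags)
```

In a real workflow the masked text would be sent to the engine and the tags restored in the translated output; how placeholders survive reordering in the target language is exactly the kind of bottleneck such testing would expose.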

A TMX workflow had to be created to import and export MT-translated translation memories (TMs). A lot of translation data are needed to learn how to optimize the results. In the summer of 2009 the TAUS Data Association (TDA) offered all founding members a period of ‘free pooling’ so they could test the data on their technologies. Pangeanic downloaded a couple of hundred million words and combined them with considerable volumes of translation memories from their customer Sony Europe. Sony Europe was only too happy to share their TMs with Pangeanic and the TDA and have customized engines built.
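As an illustration of the import side of such a workflow, here is a minimal sketch that pulls source and target segments out of a TMX document using only the Python standard library. The function name `read_tmx_pairs` and the sample data are invented for this example; the article does not say how Pangeanic's own workflow is built.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def read_tmx_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs from a TMX document string."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # Older TMX files use a plain "lang" attribute instead of xml:lang.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() keeps the text and drops any inline markup.
                segments[lang] = "".join(seg.itertext()).strip()
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs

# Hypothetical two-unit sample, not real customer data.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Volume</seg></tuv><tuv xml:lang="es"><seg>Volumen</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Press <bpt i="1"/>Start</seg></tuv><tuv xml:lang="es"><seg>Pulse Inicio</seg></tuv></tu>
</body></tmx>"""

pairs = read_tmx_pairs(SAMPLE, "en", "es")
```

The export direction would write MT output back into `<tuv>` elements in the same way; that part is left out here.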

There are many exciting lessons to be learned from testing and customizing a statistical MT engine, especially for a mechanical engineer. Hundreds of new engines have been tried, tested and delivered since production of MT engines started at Pangeanic in 2009. Not all of them are being used, but the knowledge gathered is invaluable. Failures teach as much as successes in MT. Using a Moses engine is not just a matter of feeding it as much data as one can find. Yes, more data helps, but similar or consistent data is also important, as are style and genre. The importance of data preparation and cleaning should not be underestimated. Knowing how to tune the weighting factors for the language model and translation model training processes is equally critical. Perhaps it helps to play with the N-gram settings for specific languages. But what counts most of all is the data selection and preparation.
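To make "data preparation and cleaning" concrete, the sketch below applies three common filters to a list of sentence pairs: dropping empty segments, dropping over-long or badly mismatched pairs, and removing exact duplicates. The function name and the thresholds (`max_len=80`, `max_ratio=3.0`) are illustrative assumptions, not settings reported by Pangeanic.

```python
def clean_pairs(pairs, max_len=80, max_ratio=3.0):
    """Filter (source, target) pairs before training.

    max_len and max_ratio are example thresholds chosen for illustration.
    """
    seen = set()
    kept = []
    for src, tgt in pairs:
        # Normalize whitespace so trivially different copies compare equal.
        src = " ".join(src.split())
        tgt = " ".join(tgt.split())
        if not src or not tgt:
            continue  # one side is empty
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if n_src > max_len or n_tgt > max_len:
            continue  # too long to align reliably
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue  # lengths too different, likely misaligned
        if (src, tgt) in seen:
            continue  # exact duplicate
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept

raw = [
    ("Hello  world", "Hola mundo"),
    ("Hello world", "Hola mundo"),   # duplicate after normalization
    ("", "Texto sin origen"),        # empty source
    ("OK", "Pulse el boton para aceptar"),  # length ratio 5:1
]
cleaned = clean_pairs(raw)
```

Selecting data by domain, style and genre, which the article stresses most, needs more than such mechanical filters, but they are a usual first pass.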

Although Moses technology is open source and available to everyone, what really makes the difference is knowledge about how to work with it and fine-tune the best features for each language combination. Perhaps it is still too early to judge whether Manuel has 20-20 foresight or was pushing the risk factor for a small LSP. But we do know MT is in demand. He plans to double his post-editing output from 30% of total business in 2010 to 60% in 2011. This will help him improve his margins on his translation business, while reducing costs and possibly winning new business. At the same time he is developing new business and revenues by building customized MT engines under the new brand PangeaMT.

If you have read this far, you may think that TAUS has had to lower its sights and publish advertorials. But the real reason we are publishing this story is that we all need an industry in which entrepreneurs take charge. We are not recommending that you use PangeaMT, or SDL/Trados/Language Weaver for that matter. We have seen the dangers of an industry that gets locked into a particular technology. Building your own MT technology means walking a path to technology independence from standard TM applications, and it is also a major challenge in an industry where even large corporations and MLVs have little choice.

We would like industry players to take control of their own destiny by embracing or developing the technology of their choice. Few still doubt that there is a role to play for machine translation. We have witnessed a massive change in mindsets in the last five years. Using or producing MT, that is the question. TAUS takes an active role in guiding and coaching LSPs to become not only users but also producers of MT. “Let a thousand MT systems bloom” was our slogan at last year’s User Conference. We now offer workshops and reports on how to implement open-source MT solutions and how to evaluate MT. Let us know if you would like to get started building your own MT system and we will be happy to help you on your way. If you decide to be ‘just’ an active user of MT, you may be interested in the TAUS reports and workshop on post-editing.


Comments  

 
-4 #1 Josephine Bacon 2011-08-15 17:02
We do not use ANY machine translation in my company, and we do not act like a postbox either, taking in translations and sending them back again without any quality control. For the sort of work we do, mainly legal and publishing work, it is the quality of the translation that counts. We can tell immediately if what we are getting has gone through MT because of the hilarious mistakes. For instance, a translation we got back from Trados users had translated the French "vol" as "flight" when it was obvious from the context that it was "volume" with the full stop left out. But Trados is only as good as the translators who use it, and the best translators do not use it.