
At the TAUS User conference held last week in Portland (OR), a number of presenters shared insights into their basic agenda for MT deployment: selecting engines, managing the post editing process, and making infrastructure cost-effective. Key to any MT deployment is the right management of expectations. And two LSPs offered a complimentary round of MT customization to new customers.
Cisco’s critical path
One active user of MT today is Cisco. But as Pablo Vazquez explained in a session on ‘navigating the best route’ to MT, when the company started out it had no idea what to expect from MT products in terms of quality for cost, or whether there were any standard practices for choosing the best solution.
Potential users today can in theory choose between free web-based and proprietary installed translation systems, and within that range between rule-based (RBMT) and statistical (SMT) engines.
But ticking off these options does not take you very far. Cisco decided to partner with the MT broker Cross Language to work out what they wanted MT to do, and then evaluate the best solution.
As Cross Language’s Heidi Depraetere showed, you must always keep the bigger picture in mind; system requirements and costs are just as important as the usual focus on MT output quality. Even the quality question needs to be broken down into such factors as usability (i.e. who is going to use the output), productivity (your throughput targets) and workflow (interactions with the whole system), rather than simply examining lists of opaque BLEU scores.
In this application, Cisco was going to use MT to translate support content. So it wanted a system that delivered user/context-defined quality (rather than publication standard, for example).
- The engine had to be customizable, and all linguistic assets, from dictionaries and TMs to rules, had to be engine-accessible using industry standards.
- The solution had to adapt to the existing Cisco environment (rather than vice versa), using APIs to in-house systems and CMSs.
- Target language quality had to drive engine choice, not an RBMT vs. SMT mindset. Users should be technology agnostic, and focus on the output result, not on how it is achieved. This means that multi-engine configurations, using a mix of engine technologies, may prove necessary.
- The system had to be scalable and have enough capacity to handle both on-the-fly and queued translation operations.
- The main cost criterion was full disclosure: the real price of the engine(s), customization, maintenance and upgrades. It was felt that MT suppliers have a tendency to play down the full cost of customization, for example.
This same checklist was then applied to each possible engine, so that minimum values for the price/time/quality parameters could be established for each. Cisco then ran a separate pilot for each engine tested, and then began to build a small though scalable process.
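As a rough illustration of how such a checklist might be applied, the sketch below screens candidate engines against threshold values for price, time and quality. Everything in it is hypothetical: the field names, units, thresholds and engine figures are invented for the example and are not Cisco's or Cross Language's actual criteria.

```python
from dataclasses import dataclass


@dataclass
class EngineResult:
    """Pilot measurements for one engine (all fields are hypothetical)."""
    name: str
    cost_per_word: float        # assumed unit: all-in USD per translated word
    hours_per_10k_words: float  # assumed turnaround measure
    quality_score: float        # assumed 0-1 usability rating from reviewers


# Hypothetical thresholds; real values would come from the requirements exercise.
MAX_COST_PER_WORD = 0.05
MAX_HOURS_PER_10K = 4.0
MIN_QUALITY = 0.7


def passes_checklist(result: EngineResult) -> bool:
    """An engine passes only if it meets every price/time/quality threshold."""
    return (
        result.cost_per_word <= MAX_COST_PER_WORD
        and result.hours_per_10k_words <= MAX_HOURS_PER_10K
        and result.quality_score >= MIN_QUALITY
    )


def shortlist(results):
    """Return the names of the engines worth taking into a pilot."""
    return [r.name for r in results if passes_checklist(r)]


# Invented figures for three unnamed engines.
candidates = [
    EngineResult("engine_a", 0.03, 2.0, 0.75),
    EngineResult("engine_b", 0.02, 1.0, 0.60),  # cheap and fast, but below the quality bar
    EngineResult("engine_c", 0.08, 3.0, 0.90),  # good output, but over the cost ceiling
]
print(shortlist(candidates))
```

Applying the same thresholds to every engine keeps the comparison technology agnostic: an engine is judged on what it delivers, not on whether it is rule-based or statistical.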
Autodesk deploys Moses
After experimenting with Systran and Apertium (an open source RBMT engine), and comparing RBMT solutions against SMT, Autodesk decided to go into production mode with a Moses (open source SMT) deployment for FIGS in mid-September 2009. Apertium is still used, however, for Spanish to Brazilian Portuguese, where it works well.
The key aim was to reduce the cost of product localization without impacting quality and scheduling. A longer term goal was to see how MT could be used in other application areas, such as wikis, chat, customer support, and community translation.
After extensive training and testing of the system to determine the right data mix, a Moses server was integrated with WordServer and Passolo technologies. A two-day productivity lab involving 12 translators from 3 vendors collected per-sentence timing data. MT engines were trained using 2008 content and translated 2009 material. All of the 12 translators saw their throughput significantly increase, with 6 of them at least doubling it.
Tests also involved comparing the times and keystrokes used in human translation and in the post-editing (PE) of MT output of similar but not identical content. The results showed a large productivity gain: with MT and PE, 77% more content was localized and post-edited per day than with the former TM-driven translation workflow.
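To make the arithmetic behind such a figure concrete, here is a minimal sketch of how a throughput gain could be derived from per-sentence timing data. The timings are invented for illustration and are not Autodesk's data; the function names are likewise assumptions.

```python
def words_per_hour(segments):
    """Throughput from (word_count, seconds_spent) pairs, one pair per sentence."""
    total_words = sum(words for words, _ in segments)
    total_seconds = sum(seconds for _, seconds in segments)
    return total_words * 3600 / total_seconds


def gain_percent(baseline, with_mt):
    """Relative throughput gain; 100 means throughput has doubled."""
    return (with_mt - baseline) / baseline * 100


# Invented timings: the same three sentence lengths, translated vs. post-edited.
human = [(20, 120), (15, 90), (25, 150)]    # 60 words in 360 seconds
post_edit = [(20, 60), (15, 50), (25, 70)]  # 60 words in 180 seconds

baseline = words_per_hour(human)
with_mt = words_per_hour(post_edit)
print(f"{baseline:.0f} -> {with_mt:.0f} words/hour, gain {gain_percent(baseline, with_mt):.0f}%")
```

On this definition, a translator who doubles throughput shows a gain of 100%, and a 77% gain means 1.77 times the original daily volume.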
These tests also showed that a sentence length of around 20 words is optimum for both human and machine translation processing, and that overall, post-editing involves a smoother, shorter and more consistent keyboard process than human translation. But there is always considerable variation across individual post-editors and job types.
Final quality levels were very encouraging. In a series of ‘blind’ quality checks of the post-edited output, the evaluators were unable to distinguish post-edited from human translated quality, and the final output was deemed publication ready. However, there is an as yet unexplained but systematic variation in final quality between languages; Spanish output, for example, always requires extra polishing.
The near-term goal is to raise translation productivity by 100%.
The post-editing dilemma
Post-editing (PE) as a skill set/engineering matrix is still in its infancy for most MT users. Experience of best practices is evolving fast but users of PE services are at different stages, and knowledge is still fragmentary, despite the anticipated explosion in demand. In a panel discussion on post-editing practices at the TAUS User Conference, representatives of Adobe, AlphaCRC, Intel, Microsoft, SDL and Sun Microsystems reported on their current findings.
There is a general perception that PE is a complex practice, that post-processing MT can take longer than direct translation, and that PE productivity is often lower than it should be.
Anyone entering the PE arena will need a proper PE strategy as there are considerable costs at stake. The vital step in training post editors is in every case to set the right expectations and ensure that these resources can work to the required quality level for the task at hand.
Background knowledge. Post editors and their managers should be fully aware of what they are getting, what the source text is like (was it authored to rules?), how the TMs are handled in the system, and what the confidence level of their translation resources is (terminology management, TM quality). They should also have experience of the specific engine in question, and this in turn depends on whether the MT is done in-house or out.
The vendor’s strategy is then to optimize each component in this process and test it systematically (source, terms, TM quality, etc). For example, PE does not always integrate smoothly into a GMS. And post editors will need some context from prior/next sentences to make the best decision.
In a customer support application, for example, it was recommended that MT users should avoid PE where possible, or limit it to error items that can be evaluated as correctable within 2 seconds. Otherwise PE becomes too expensive for the possible end user benefits. Publishing the source alongside the output is a cheaper solution in such cases.
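The two-second rule can be read as a simple triage step. The sketch below assumes each error item arrives with an estimated fix time; how that estimate would be produced, and the item descriptions themselves, are hypothetical and were not specified by the panel.

```python
MAX_FIX_SECONDS = 2.0  # threshold quoted at the panel


def plan_post_edit(error_items):
    """Split error items into those worth fixing and those left as-is.

    error_items: list of (description, estimated_fix_seconds) pairs.
    The time estimates are assumed inputs, e.g. from a reviewer's quick judgment.
    """
    fix = [desc for desc, secs in error_items if secs <= MAX_FIX_SECONDS]
    skip = [desc for desc, secs in error_items if secs > MAX_FIX_SECONDS]
    return fix, skip


# Invented examples of error items in a support article.
items = [
    ("wrong product name casing", 1.0),
    ("garbled relative clause", 15.0),
    ("missing article", 1.5),
]
fix, skip = plan_post_edit(items)
print("fix:", fix)
print("leave (show source alongside):", skip)
```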
Pricing. Today the price of PE is usually set as a percentage of the base rate of translation. However, the quality of the output to be post-edited will vary, and this means that the rate should vary too. It was suggested that in future there should be incentive schemes for MT suppliers: “if you can get your engine to produce better output, I’ll give you a discount on the PE cost.”
A specialized human resource. No one as yet appears to be building up a specialized post-editing resource. Post editors need to be part proof reader, part translator, and part technical analyst able to work out what’s going wrong with the MT engine. It is also very boring work: one participant suggested that “the system should generate a scary segment every 2 hours to keep them on their toes.”
It was agreed that the industry is still in a transition period, and that good PE training, setting explicit expectations, testing workflow components, and sharing experiences will gradually dissipate the PE fog.
Cost-effective MT infrastructure
The total ownership cost of an MT solution is almost certainly a barrier to broader take-up by smaller companies than Cisco or Autodesk. Although core open source engines such as Moses come free of charge, investments in scalable infrastructure, IT expertise and other resources can stack up extremely quickly and de-motivate both multiple language vendors and smaller publishers.
As an illustration of what can be done to lower the entry cost for anyone investing in an SMT solution, Achim Ruopp of Digital Silk Road showed how Amazon Web Services can be used as a third party infrastructure platform to run a Moses training/implementation.
The value of using platform services such as the Amazon Elastic Compute Cloud (EC2), Simple Storage Service (S3), CloudFront and Simple Queue Service (SQS) is that for the user there are no hardware, software, set-up or maintenance costs. You only pay for what you use; you have easy access to computing power for very fast engine training and decoding; and the whole system is secure and private.
Potential candidates for this solution, then, are Moses users seeking to save infrastructure costs, small and middle market companies starting to use (or even experiment with) SMT, and large users needing to scale up quickly to meet a new translation need. They can host their data in S3, train the engine and decode in EC2, and then evaluate the results locally using BLEU scores or human judgment.
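For the local evaluation step, a simplified BLEU calculation might look like the sketch below. It is a teaching version only, assuming whitespace tokenization, a single reference per sentence and no smoothing; it is not the scorer used in the presentation, and a production set-up would normally rely on an established scoring tool.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU: one reference per sentence, no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped counts: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any empty n-gram order zeroes the score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


# Invented example sentences.
reference = ["the cat sat on the mat"]
print(corpus_bleu(["the cat sat on the mat"], reference))
print(corpus_bleu(["the cat sat on the red mat"], reference))
```

A score of this kind is only comparable across engines when the test set, tokenization and reference translations are held constant, which is one reason human judgment is listed as the alternative.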
As a cost comparison for a typical Moses training infrastructure, Amazon’s EC2 (using the discount Reserved Instances option) is priced at $0.17 an hour, whereas your own server would cost $0.26 and a hosted server $0.57. In a scenario involving the constant use of 30 MT systems a month, the Amazon Cloud solution would cost $125, compared to $185 for an own server, and $414 for a hosted server. As for using ten MT systems in burst mode for 24 hours, the Amazon solution would still cost $125 but a hosted solution could soar to $4,140 a month.
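The monthly figures can be roughly reproduced from the hourly rates. The sketch below assumes one machine running non-stop for a 30-day month (720 hours); that reading of the scenario is an assumption, and the results land close to, but not exactly on, the figures quoted in the presentation.

```python
# Hourly rates as quoted in the presentation (USD).
HOURLY_RATE = {
    "amazon_ec2_reserved": 0.17,
    "own_server": 0.26,
    "hosted_server": 0.57,
}

# Assumption: one machine running continuously for a 30-day month.
HOURS_PER_MONTH = 24 * 30


def monthly_cost(rate, hours=HOURS_PER_MONTH, machines=1):
    """Cost of running `machines` servers for `hours` at an hourly `rate`."""
    return rate * hours * machines


for name, rate in HOURLY_RATE.items():
    print(f"{name}: ${monthly_cost(rate):.2f} per month")

# Ten hosted servers held for the whole month to cover a one-day burst.
print(f"hosted, 10 machines: ${monthly_cost(HOURLY_RATE['hosted_server'], machines=10):,.2f}")
```

Under this assumption the hosted burst scenario comes to ten times the single-server monthly cost, which is the pattern behind the quoted $4,140 (ten times $414): fixed monthly hosting has to be paid for every machine, whether or not it is busy.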
This service naturally requires some expertise in Amazon web services, and it is inherently unsuitable for language pairs where there are few resources, as is true for any SMT solution. But with the emergence of the TAUS Data Association (TDA), this brake on development may soon be history. Users would also need to handle their own data cleaning processes upstream.
LSPs point the way to collaboration
Choosing an MT system, putting the right infrastructure in place and optimizing a post editing strategy are time consuming processes that are ideally only done once. As early adopters among LSPs fine tune their installations and learn how to deliver value to their customers, two TAUS members who have embraced MT announced special offers at the TAUS User Conference last week.
As a gesture of collaboration and sharing in an industry traditionally used to cut-throat competition, Lexcelera (FR) and Pangeanic (SP) offered to train one MT engine free of charge for buyers seriously looking to find out more.