Something very significant happened in Canada last week concerning, of all things, machine translation. First came the Météo system; then the Hansard bilingual training corpus; and now this... On Monday, August 2, the British Columbia division of the RCMP 1 posted a note on its Website suggesting that citizens who wanted a French-language version of the English news releases published there should use Google Translate.
A spokesman for the federal government’s police force later explained that although the BC division is the largest in the country, it has only one French translator. As a result, citizens who had been requesting French translations sometimes had to wait three to four weeks. The BC Mounties had requested funding for an additional French translator but, owing to budgetary restrictions, that request had been denied. Using Google Translate, the spokesman claimed, citizens would be able to obtain a French version of these news releases (or a Punjabi, or Cantonese, or German one) almost instantaneously, and at no cost.
Well, it didn’t take long for the storm to break. Radio-Canada, the country’s French-language radio and television broadcaster, got wind of the RCMP announcement and published a story on it the next day. It turns out there’s a slight problem with this Google Translate solution. As part of the federal government, the RCMP is subject to the country’s Official Languages Act, a law which states that Canadians are entitled to receive services from all federal agencies in both English and French; and furthermore, there must be no difference in the quality of these linguistic services. Putting a match to the fuse, Radio-Canada kindly included a sample of a garbled Google output in their story.
The first salvo came from the President of the Federation of French-language Communities outside Quebec, who pointed out that it was somewhat ridiculous for an agency entrusted with enforcing the law to itself be openly contravening one. Next, the Commissioner of Official Languages weighed in, bluntly stating that Canadians had no business requesting the translation of government documents…from a machine! Several Members of Parliament quickly leapt on their high horses (which are not the same horses that the Mounties ride). On Wednesday, August 3, the RCMP announced that it was removing the link to Google Translate from its BC Website.
Now that the dust is beginning to settle on this latest chapter in the colourful history of Canada’s language politics, perhaps we can shift the focus a little and ask what the broader implications of this incident are for translation and its automation. Needless to say, those of us with a professional interest in MT were not terribly surprised by the outcome of the Mounties’ short-lived dalliance with machine translation. Notwithstanding the remarkable progress that MT has made in recent years, which anyone who follows the field cannot help but have noticed, it takes a certain temerity to suggest that Canadians could somehow make do with raw, unedited Google output, knowing (presumably) the provisions of this country’s Official Languages Act. But then, temerity is something for which the RCMP has long been renowned.
So much for the MT professionals; but what about the general public? Here, the jury is still out and the verdict will probably turn out to be somewhat more nuanced. If there were any Canadians who actually believed that translation automation was a solved problem, like chess, they now presumably know better. Google Translate is not about to drive this country’s sizable corps of professional translators onto the dole. Nor is this likely to happen in the foreseeable future; at least not according to Franz Och, Google’s chief MT scientist, who was recently quoted as saying that “The trajectory we are on just doesn’t seem likely to reach artificial intelligence.”2 Which, using Bar-Hillel’s old terminology, I freely translate to mean that FAHQT, or fully automatic, high-quality translation of unrestricted texts, is not yet imminent.
In this sense, the RCMP incident may actually have been beneficial in educating the general public on the state of the art in machine translation. The danger, however, is that the pendulum will swing to the opposite extreme, and that people will conclude that MT systems, because they cannot translate perfectly, are of no use whatsoever and should not be called upon to translate at all. And that, of course, is patently false. The really important lesson that the public needs to draw from this incident is that quality requirements in translation are many and varied.
For certain types of texts – notably, those that are posted on corporate websites; or your own CV, to take an example closer to home – only the very best quality will do. Unedited machine translation is simply not up to this standard today. Highly qualified, professional human translators are still required, and very likely will continue to be required for a long time to come.
On the other hand, today’s MT systems are definitely capable of handling other sorts of translation tasks, where impeccable quality is not a sine qua non. And heaven knows, there is no shortage of these! Machine translation is currently making a tremendous contribution in helping to translate huge volumes of texts which would otherwise go untranslated. One of the better-known examples is provided by Microsoft, which now offers customers the option of automatically translating into nine target languages any of the tens of thousands of pages found on its product support site, using the company’s internally developed MT system.3
And then of course, there are the millions of Web pages that Google and other online MT engines are translating into a multitude of ‘foreign’ languages every day. The point isn’t just that all the professional translators in the world wouldn’t suffice to handle translation volume on this scale. More importantly, to assign this kind of task to human translators would represent a squandering of precious resources. Allow me to elaborate.
We previously mentioned the danger of unwarranted conclusions which the public could draw from what it perceives as the failure of MT, as typified in the incident involving the RCMP. There is another equally serious danger which, paradoxically, derives from MT’s recent success in handling these large volumes of what could be called ‘middle-profile’ documents. The danger is that this success will cause us to overlook the fact that the demand for high-quality translation is also growing by leaps and bounds and already outstrips the capacity of professional translators to satisfy it.
The burning question here is whether machine translation can assist human translators in meeting this ever-growing demand, and if so, how? Because in my humble opinion, translation memory systems are just not going to do the trick. The kinds of texts that require top-quality translation are too varied and often not repetitive enough to allow TM to become a major productivity booster. Human translators require more diverse kinds of assistance, from more powerful kinds of devices than the few CAT tools that are available today. Alas, there is far too little research that currently focuses on this critical issue.
To illustrate my point, imagine what would happen if a beneficent fairy were to magically descend on British Columbia and touch senior RCMP officers there with a spark of enlightenment. How would these enlightened managers then respond to the translation dilemma which the Mounties are presently grappling with? First of all, they might perhaps consult the one French translator who is working for them in BC, to verify whether she is equipped with the latest and most efficient CAT tools. When that translator later regained consciousness, she would likely tell her managers that she already has access to a TM system, but for the news releases that so urgently need to be translated into French, the database of past translations just doesn’t contain enough repetitions to significantly increase her production.
So what’s an enlightened translation manager to do? Provide the translator with access to ever larger, shared translation memories? Perhaps… But again, my intuition is that the kinds of news releases which the RCMP publishes on its Website are always going to contain a substantial proportion of novel or unseen sentences. Is there really nothing else that could help this translator turn these texts around more quickly, instead of having to translate them all manually?
Well, what about Google Translate, or other similarly advanced MT systems? Except this time, instead of publishing their raw output directly on the RCMP Website, these machine translations would first have to be revised in some way by a human translator. Now traditionally, MT revision has always followed a fairly standard pattern in which the machine first translates the source text on its own, and then the human intervenes to correct and polish the machine output. In the past, this kind of strict sequential interaction has not always proven successful, in large measure because the quality of the raw machine output has just not been up to snuff. Is this still the case with today’s best MT systems?
In other words, has the output quality of modern MT systems like Google improved to the point that post-editing it may now be cost-effective and productivity-enhancing? I don’t believe that a simple, across-the-board answer to this complex question has yet emerged. For the moment, there are just too many variables involved, including the complexity of the type of source text, the availability of sufficient training material, the quality requirements of the final translation, etc. Nevertheless, more and more people are once more raising the question, which in itself surely tells us something.4
From the informal, impromptu tests that I’ve been conducting in recent years, my sense is that the best of today’s MT systems are close to reaching this important tipping point at which their raw output could indeed be cost-effectively post-edited, although we may not quite be there yet. And paradoxically, what remains to be done to get us over the top may be less a question of the quality of the raw translations that these systems produce than the manner in which translators interact with them. For one thing, much work remains to be done on the environment in which MT post-editing is conducted.5 Crystal ball gazing is not my forte. (If it were, I probably wouldn’t be sitting here writing this.)
Nevertheless, I will venture a few predictions on what we can hopefully expect to see in this area in coming years. I predict that translators will soon have access to far more sophisticated post-editing environments in which automatic speech recognition will play an increasingly important role. In these environments, the translator’s many and intensive interactions with her text will not all have to be channelled through the keyboard and mouse. Instead, she will be able to issue a variety of specialized vocal commands, such as: “Look up word x in dictionary D”; or “How did I translate word y earlier in this text?”; or “Change the translation of words a-b-c in this sentence to x-y-z.” If the required post-editing is light enough, the translator may even be able to forgo the keyboard altogether.
The other major change that I foresee in the manner in which translators and MT systems interact concerns the point at which the human calls on the machine for assistance. As mentioned above, in the standard post-editing paradigm, the machine first produces a raw translation which the translator then revises, without any opportunity for a more productive give-and-take between the two. Philipp Koehn has recently been attempting to revive an interesting kind of interactive MT in which the interplay between the system and the translator is far tighter and more intense.6
In this approach, as soon as the translator begins to type the target equivalent of a given sentence, the MT system attempts to complete that translation for her. If the predicted completion is correct, the translator goes on to the next sentence; if not, she simply continues typing and the system responds with another prediction that takes into account her new input; and so on, back and forth, until a satisfactory translation is produced. As Francisco Casacuberta and his colleagues in Valencia have recently demonstrated,7 this interactive-predictive approach allows the system to actually learn from the user’s corrections and improve its predictions in real time as the translation advances. This may not yet be full artificial intelligence, but intelligent it surely is!
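For readers who like to see the idea in concrete form, here is a deliberately simplified sketch of that back-and-forth. It assumes the MT engine has already produced a small scored list of candidate translations for the sentence. The function name, the candidate sentences and the scores are all invented for illustration, and real interactive systems search a far richer space than a fixed list. The online learning described above is not modelled here.

```python
from typing import Optional


def predict_completion(prefix: str,
                       candidates: list[tuple[str, float]]) -> Optional[str]:
    """Suggest how to finish the sentence, given what the translator has typed.

    `candidates` is a hypothetical list of (translation, score) pairs.
    Returns the untyped remainder of the best-scoring candidate that is
    consistent with the prefix, or None if no candidate matches.
    """
    # Keep only the candidates that agree with everything typed so far.
    matching = [(text, score) for text, score in candidates
                if text.startswith(prefix)]
    if not matching:
        return None
    # Among those, propose the one the engine scored highest.
    best_text, _ = max(matching, key=lambda pair: pair[1])
    return best_text[len(prefix):]


# Invented candidates and scores, for illustration only.
candidates = [
    ("La GRC a publié un communiqué.", 0.6),
    ("La GRC a diffusé un communiqué.", 0.4),
]

# Nothing typed yet: the system proposes its top-ranked translation.
first_guess = predict_completion("", candidates)

# The translator rejects it and types her own verb; the system follows her.
second_guess = predict_completion("La GRC a d", candidates)
```

In this toy version, each keystroke that contradicts the current proposal simply narrows the set of candidates; when none remains, the function returns None and the translator is on her own.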
Let us now descend from the clouds and come back down to earth, alighting once more in beautiful British Columbia, where, unfortunately, such intelligent MT systems and advanced vocal interfaces are not as yet available. Nor is it reasonable in the real world for us to expect that a pressured RCMP manager – no matter how enlightened – will propose organizing a machine translation trial to determine whether Google Translate can enable his one French translator to become substantially more productive.
But allow me for just a moment one last supposition. Suppose that instead of the beneficent fairy whom we previously conjured up, a malevolent, Machiavellian counterpart emerged from the netherworld to advise our harried translation manager. And suppose, furthermore, that the scoundrel suggested that our manager provoke an entirely predictable scandal by directing the French-speaking citizens in his province to an MT system for their translations; knowing full well that when that scandal erupted, the government would be forced to revise its decision to refuse the budget required to hire a second French-language translator…
Needless to say, this last is nothing but speculation of the very idlest sort; for we have no way of knowing how such decisions are reached within the secret confines of the RCMP’s senior management. On the other hand, we do now know the denouement to this whole affair. This week, the RCMP concluded an interim agreement with the Canadian Translation Bureau, which will help it translate the press releases on its BC Website, until such time as the Force can hire another full-time French translator. All of which lends a new resonance to the RCMP’s old motto: “The Mounties always get their man” – though not necessarily their machine.
1 RCMP stands for the Royal Canadian Mounted Police, whom Canadians fondly refer to as the Mounties. Their mandate is similar to that of the FBI in the United States.
2 Quoted by Jaap van der Meer in his recent TAUS posting: “Where are Facebook, Google, IBM and Microsoft taking us?”
3 Moreover, their Knowledge Base site is both updated and retranslated on a weekly basis, something that would be utterly impossible without recourse to MT. And apparently, Microsoft customers are quite satisfied with these less-than-perfect automatic translations.
4 Witness, for example, the number of sessions devoted to MT post-editing that will be taking place at the upcoming AMTA-2010 Conference.
5 Google may possess the most powerful MT engine today, owing in large part to its ability to access near-unlimited textual resources. On the other hand, the interface that Google offers on its Translator Toolkit site is surprisingly rudimentary, to put it kindly.
6 Philipp Koehn and Barry Haddow, (2009), “Interactive Assistance to Human Translators Using Statistical Machine Translation Methods”, in the Proceedings of MT Summit XII, Ottawa, Canada. As Koehn points out, the idea for this kind of interactive-predictive MT originated with the TransType project, on which I had the privilege of collaborating.
7 Daniel Ortiz-Martinez, Ismael Garcia-Varea and Francisco Casacuberta, (2010), “Online Learning for Interactive Statistical Machine Translation”, in the Proceedings of NAACL 2010, Los Angeles, California.
About the Author
Elliott Macklovitch
A linguist by training, Elliott Macklovitch has been actively involved in machine and machine-aided translation since 1977, when he joined the TAUM group at the University of Montreal. From 1986 to 1997, he worked at the Centre for Information Technology Innovation, an Industry Canada research lab, where he was responsible for the Translator’s Workstation project. In 1997, Elliott returned to the University of Montreal as a member (and later as the Coordinator) of the RALI Laboratory. He is currently Head of Product Development for Computer-assisted Translation at Onscope Group Inc., a Montreal-based company that specializes in legal services related to trademarks, but which also boasts a large internal translation service. Between 2000 and 2004, Elliott served as President of AMTA, the Association for Machine Translation in the Americas.






