In a presentation at the November 2007 TAUS Executive Forum, Gilles Martel,
Director of Resources Management, Corporate Services, for the Translation
Bureau of the Canadian Government, gave a sneak preview of the new vision for
online translation services currently in the works for the Canadian Government.
The blueprint centers on a fully integrated infrastructure platform that
will provide a palette of "pull" services for different kinds of users.
The goal is to give the Bureau's professional translators an almost completely technology-driven working environment, and to offer clients a galaxy of human translation, post-editing and terminology services, with workflow functionality available via extranets and intranets. Today the Translation Bureau handles 225,000 requests a year, originating from 50,000 requesters and served by 1,000 in-house translators.
The technology line-up covers terminology, bilingual corpora and translation automation. Says Gilles Martel, "The fifth version of the Termium terminology management system, now under development, will provide access to 1.5 million terms in four languages to our own translators and our partners. The database is maintained by a staff of some 50 terminologists."
Acting as a customized service repository, Termium will comprise two term bases: a "core" (public access) containing general terminology, and a "chest of drawers" holding client-specific terms that require special access.
The translation memory server farm will contain over 200 TM repositories. This will provide resource leverage using a standard business model. To give an idea of the proposed scope of the new platform, the current bilingual corpus of 25 million words (or 2 million segments) is due to be stepped up to a total of 300 million words in the next two years. The model chosen to control the quality of the TMs is to have peers (translators who know the corpora) evaluate them as a community, while automated alignment will be checked by student translators.
A further major novelty will be the introduction of a new statistical machine translation engine, currently being developed by a Canadian R&D center. Phase One of the plan is to train the system on 100 million words per language, using the French-English Canadian Hansard as the corpus.
Gilles Martel is careful to draw attention to the paradigm-breaking nature of this translation automation agenda. It means that a substantial proportion of the current translators will have to acquire new skills to work on the SMT output text as post-editors. Their productivity will be measured by comparing the time needed to post-edit against current rates for human translators working within the same subject field. "We anticipate a gain of up to 100% in certain areas."
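To make the productivity comparison concrete, here is a minimal sketch of one way such a gain could be computed. It is not the Bureau's actual method: the function name, the choice of words per hour as the unit, and all figures are assumptions for illustration only.

```python
def productivity_gain(words, post_edit_hours, baseline_words_per_hour):
    """Relative throughput gain of post-editing over a human-translation baseline.

    Assumption: productivity is expressed in words per hour, and the baseline
    is the current human rate for the same subject field.
    """
    if post_edit_hours <= 0 or baseline_words_per_hour <= 0:
        raise ValueError("hours and baseline rate must be positive")
    post_edit_rate = words / post_edit_hours
    return (post_edit_rate - baseline_words_per_hour) / baseline_words_per_hour


# Hypothetical figures: 3,000 words post-edited in 5 hours,
# against a baseline of 300 words per hour for human translation.
gain = productivity_gain(words=3000, post_edit_hours=5, baseline_words_per_hour=300)
print(f"{gain:.0%}")  # prints "100%"
```

Under these assumed figures the post-editor works at 600 words per hour, twice the baseline, which corresponds to a gain of 100%.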
End quality will be calculated as a weighted ratio of the number of linguistic changes required during post-editing to bring the MT output to an acceptable standard. The ratio is weighted because some changes are "minor": they take less time to make.
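The article does not give the formula, so the following is only a sketch of what a weighted change ratio might look like. The weights, the two change categories, and the choice of the MT output's word count as the denominator are all assumptions, not details from the presentation.

```python
# Hypothetical weights: the source says only that "minor" changes count for less.
WEIGHTS = {"minor": 0.5, "major": 1.0}


def weighted_change_ratio(changes, mt_word_count, weights=WEIGHTS):
    """Weighted number of post-editing changes per word of MT output.

    `changes` is a list of category labels, one per change made by the
    post-editor. Dividing by the word count is an assumed normalization.
    A lower value means the MT output needed less correction.
    """
    if mt_word_count <= 0:
        raise ValueError("mt_word_count must be positive")
    weighted_total = sum(weights[kind] for kind in changes)
    return weighted_total / mt_word_count


# Hypothetical segment: 20 words of MT output, two minor fixes and one major one.
ratio = weighted_change_ratio(["minor", "minor", "major"], mt_word_count=20)
print(round(ratio, 3))  # prints 0.1
```

With these assumed weights, two minor changes cost as much as one major change, so a segment that needs only small touch-ups scores better than one that needs restructuring.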
The final stage in this process would ideally be to introduce authoring management tools where possible, so that the source text can be made as translatable as possible. In the long run, then, the platform's users will be able to access tools and functionality for the whole production chain, from authoring content and ordering translations, through terminology management, to billing, as a seamless, integrated whole. All that is missing for the time being in this new translation environment, admits Gilles Martel, is a business model for properly funding the infrastructure. And as for other organizations operating in such an environment and still counting words, will they be able to keep their current models?




