Can we design better software support for MT in general by wiring up post-editors in a lab and analyzing their behavior? Jürg Schütz, language technology researcher and founder of the Bioloom Group, explains why we should.
1. Why does current post-editing practice need a new approach?
Post-editing today is seen as a translator task. It does not have an independent profile as an activity that we can model and for which we can design and implement support software. We need to guide post-editors more efficiently and enhance their cognitive capacities, but we do not yet have good qualitative or quantitative data on post-editor activities.
If we can get a better understanding of the cognitive workload in post-editing, we will also be able to influence the performance of MT engines themselves as adaptive systems, whether statistical (SMT) or rule-based (RBMT). If we can rethink the relationship between post-editor and MT system as a dynamic feedback loop, we may be able to develop a model of post-editing that provides the ultimate quality assurance stage in the localization supply chain.
2. What are you doing to develop this kind of model of post-editing?
The research community should try and correlate data from laboratory experiments and interviews of post-editors with mathematical predictions, so we can build a multi-faceted model of the process. This can then be used as a basis for designing an interactive post-editing-aware system.
For example, eye movements and keystrokes can be recorded together so that we can align the resulting data along a timeline with all the other component tasks, such as reading and understanding the source text and the MT output, and making the deletions, corrections, and revisions in the target text.
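To make the idea of putting eye movements and keystrokes on one timeline more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record types `Fixation` and `Keystroke`, the region labels, and the sample values are all hypothetical and do not come from any actual lab setup. Each keystroke is paired with the most recent gaze fixation that started at or before it.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    t_ms: int    # start of the fixation, in ms from session start (assumed unit)
    region: str  # hypothetical label: "source", "mt_output", or "target"


@dataclass(frozen=True)
class Keystroke:
    t_ms: int    # time of the key event, in ms from session start
    action: str  # hypothetical label: "insert", "delete", ...


def align(fixations, keystrokes):
    """Pair each keystroke with the latest fixation starting at or before it."""
    fixations = sorted(fixations, key=lambda f: f.t_ms)
    starts = [f.t_ms for f in fixations]
    aligned = []
    for k in sorted(keystrokes, key=lambda k: k.t_ms):
        i = bisect_right(starts, k.t_ms) - 1  # index of latest fixation <= k.t_ms
        region = fixations[i].region if i >= 0 else None
        aligned.append((k.t_ms, k.action, region))
    return aligned


# Made-up sample data, purely for illustration.
fixations = [Fixation(0, "source"), Fixation(400, "mt_output"), Fixation(900, "target")]
keystrokes = [Keystroke(950, "delete"), Keystroke(500, "insert")]
timeline = align(fixations, keystrokes)
```

With this sample, the insert at 500 ms is attributed to a look at the MT output and the delete at 950 ms to a look at the target text. A real study would also need fixation durations and a shared clock between the eye tracker and the keylogger, which this sketch simply assumes.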
The resulting data could then be used in machine learning sessions so that actions can be clustered and categorized. Other data sources, such as information from functional magnetic resonance imaging, could also be used to augment the data set at later stages of the research cycle.
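As a rough picture of what clustering post-editing actions might look like, the following Python sketch groups edits with plain k-means. The interview does not name a learning method, so k-means is an assumption chosen for simplicity; the two features (pause before the edit, characters changed) and all sample values are likewise invented for illustration.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points, so runs are repeatable."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical features per edit: (pause before the edit in seconds, characters changed).
edits = [(0.2, 1), (0.3, 2), (4.0, 25), (0.25, 1), (3.5, 30), (4.2, 28)]
labels, centroids = kmeans(edits, k=2)
```

On this toy data the quick single-character fixes end up in one cluster and the long-pause rewrites in the other. Real features would sit on very different scales and would need normalizing first, and the number of clusters would have to be chosen from the data rather than fixed in advance.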
3. How could this research vision be translated into a concrete agenda?
A proposal for a three- to four-year project, led by the Copenhagen Business School and including partners such as Dublin City University, is due to be submitted to the Bio-ICT Convergence track of the European Commission's 7th Framework Program.
In my opinion, the best way forward would be to involve post-editing experts, MT system vendors and users, together with specialists in cognitive science, neuroscience and information bionics. TAUS members would be natural candidates!
The TAUS take: There is renewed focus on post-editing, now that translation automation is expanding its offer of MT engines, language pairs, and implementation environments. Although post-editing may seem to be a variation on manipulating linguistic output such as fuzzy matches from a TM system, it can involve more cross-language processing. Any research that ultimately helps "automate" post-editing tasks will be welcome to the industry.


