Microsoft's commitment to machine translation has been large-scale and sustained. The company has been involved in Natural Language Processing (NLP) research and development since 1991, specifically in Machine Translation (MT) research since 1999, and has used MT in a production environment for just over five years. It has pursued a strategy of pragmatic, step-by-step advances over the long term, building on strong in-house research in natural language processing in general, in parallel with its development work on product localization and other applications.
After building a series of MT engines to production level for internal or partner applications, Microsoft decided to launch a public MT service in 2007, known as Live Translator, intended for individual web surfers, users of Live Search, and Microsoft Office users - i.e. the consumer market. The hybrid engine driving this service has been developed internally for several years as one of several approaches to translation automation. It is now capable of delivering 26 language pairs to serve the needs of multilingual work and communication in a networked world.
Prior to the launch of this recent consumer-oriented service, Microsoft had been using its translation technology in a customer support context, translating and publishing knowledge base documents for end users seeking to solve problems. This MT service is also used by internal teams to produce raw translations of certain parts of products for betas or evaluation. Microsoft also uses MT to localize parts of its product documentation for the marketplace; the output is then post-edited by service providers in a typical localization workflow.
This multifaceted translation capability ultimately originates from the decision in 1991 to create an NLP group (inside what was then the new Microsoft Research facility), which expanded into a dedicated translation team around 1995. The research group eventually spun off a product development team that now handles the company's MT services, currently headed by Chris Wendt.
The underlying strategy is predicated on the following three principles:
- The recognition that MT output quality is and will be far from perfect for quite some time. The aim is therefore to provide practical "multilingual support" to end users in various situations (writers, communicators, customers, developers) to overcome the inherent deficiencies of MT.
- The translation service must be embedded into the workflow where it is needed (i.e. understanding documents, or communicating and collaborating with others), yet never be totally transparent. Users must always be able to track back to the original in case of doubt.
- MT will continue to require human support in social computing settings, for example in improving post-editing productivity or in using wiki-style approaches in developer communities on MSDN.
It remains to be seen whether Microsoft will attempt to monetize this considerable investment independently of integrating the technology into its own product and service mix.
In what follows, we focus on the technology options resulting from the R&D process and on the methods and choices applied to MT in a production environment. This TAUS report is intended to provide some insight into the background of this major MT venture, and into the details of how a leading IT product and services company deploys the technology over time.
Full report exclusively available to TAUS members: Microsoft Technology Report, January 2009