Neural networks draw on context to improve machine translations

Researchers at the University of Amsterdam are using neural networks to help statistical machine translation systems learn what all human translators know: that the best translation of a word often depends on the context.

Machine translation systems such as Google Translate guess how to translate words and phrases based on how often they appear in a large corpus of human-translated texts. Such tools are increasingly important as individuals and businesses seek to access information or buy products and services from other countries where different languages are spoken.

Statistical machine translation works by breaking sentences into phrase fragments and selecting the most likely translation for each fragment. That process doesn’t always yield the best translation for the sentence as a whole in morphologically rich languages, such as those where nouns are inflected for number, case and gender.
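The fragment-by-fragment selection described above can be illustrated with a minimal sketch. The phrase counts below are invented for illustration, and the function names are hypothetical; a real statistical system also weighs other factors, such as a language model over the target sentence, which this sketch leaves out.

```python
from collections import Counter

# Hypothetical toy counts of how often each English phrase was aligned
# to each German phrase in a parallel corpus (invented for illustration).
PHRASE_COUNTS = {
    "the man": Counter({"der Mann": 6, "den Mann": 3, "dem Mann": 1}),
    "i see": Counter({"ich sehe": 9, "ich verstehe": 1}),
}


def translation_probs(source_phrase):
    """Relative-frequency estimate of p(target | source) from the toy counts."""
    counts = PHRASE_COUNTS[source_phrase]
    total = sum(counts.values())
    return {target: n / total for target, n in counts.items()}


def translate_independently(source_phrases):
    """Pick the most frequent translation for each phrase, ignoring context."""
    return [PHRASE_COUNTS[p].most_common(1)[0][0] for p in source_phrases]


if __name__ == "__main__":
    print(translation_probs("the man"))
    print(" ".join(translate_independently(["i see", "the man"])))
```

With these toy counts, the sketch picks the nominative “der Mann” for “I see the man” because it is the most frequent form overall, although the sentence calls for the accusative “den Mann”.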

To improve the word selection of such systems when translating into morphologically rich languages such as Russian, Bulgarian and German, the team used a neural network to analyze the words in context in the source language.

Translating sentences into grammatically more complex languages is relatively easy for human translators because they understand the grammatical function of each word in a sentence. Machine translators, however, find this particularly difficult because word forms from a grammatically simpler language like English do not contain enough information to produce the correct form of that word in a morphologically rich language.

It is, for instance, difficult for machines to translate a sentence containing an English word form like “the man” into German because the German language offers several word forms (“der Mann”, “des Mannes”, “dem Mann” and “den Mann”) that could all be correct translations, depending on the context.
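To make the role of context concrete, here is a minimal sketch that chooses among the German forms of “the man” based on the English word that precedes it. This is not the researchers' neural network; it is a simple count-based lookup, swapped in for illustration, and the training examples and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training pairs: (preceding English word, German form of "the man").
# "<s>" marks the start of a sentence. All data is invented for illustration.
EXAMPLES = [
    ("<s>", "der Mann"), ("<s>", "der Mann"), ("<s>", "der Mann"),
    ("see", "den Mann"), ("know", "den Mann"),
    ("with", "dem Mann"), ("to", "dem Mann"),
    ("of", "des Mannes"),
]


def train(examples):
    """Count German forms per source context, plus overall counts as a fallback."""
    by_context = defaultdict(Counter)
    overall = Counter()
    for prev_word, form in examples:
        by_context[prev_word][form] += 1
        overall[form] += 1
    return by_context, overall


def predict(prev_word, by_context, overall):
    """Return the most frequent form for this context, or the overall favorite."""
    counts = by_context.get(prev_word) or overall
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    by_context, overall = train(EXAMPLES)
    for word in ["see", "with", "of", "near"]:
        print(word, "->", predict(word, by_context, overall))
```

A lookup like this only works for contexts it has already seen and falls back to the most common form otherwise, which is one reason a model that generalizes over source contexts is attractive.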

The neural network is able to derive grammatical functions of words without having explicit knowledge of the grammar, said Ke Tran, one of the researchers. This means that, to learn word functions, the method does not depend on examples hand-picked by the researchers, which can be a difficult and costly process, especially for languages with few speakers.

The researchers reported significant improvements in word translation prediction accuracy for Bulgarian, Czech and Russian. Moreover, preliminary results for integrating the approach into a large-scale English-Russian statistical machine translation system show small but statistically significant improvements in translation quality, they said.

In the future, the new method will be integrated into a translation system called Oister, which is being developed by the university. The findings will also be presented at the Conference on Empirical Methods in Natural Language Processing in Doha, Qatar, next week.
