Implementing a neural machine translation engine for mobile devices: the Lingvanex use case

Zuzanna Parcheta; Germán Sanchis Trilles; Aliaksei Rudak; Siarhei Bratchenia

Authors: Zuzanna Parcheta, Germán Sanchis Trilles, Aliaksei Rudak, Siarhei Bratchenia
Published in: Proceedings of the 21st Annual Conference of the European Association for Machine Translation: 28-30 May 2018, Universitat d'Alacant, Alacant, Spain / edited by Juan Antonio Pérez Ortiz, Felipe Sánchez Martínez, Miquel Esplà Gomis, Maja Popovic, Celia Rico Pérez, André Martins, Joachim Van den Bogaert, Mikel L. Forcada Zubizarreta, 2018, ISBN 978-84-09-01901-4, pp. 297-302
Language: English
Abstract
- In this paper, we present the challenge of implementing a mobile version of a neural machine translation system, where the goal is to maximise translation quality while minimising model size. We explain the whole process of implementing the translation engine using an English–Spanish example, and we describe the difficulties encountered and the solutions implemented. The main techniques used in this work are data selection by means of Infrequent n-gram Recovery, appending a special word at the end of each sentence, and generating additional samples without final punctuation marks. The last two techniques were devised to obtain a translation model that can generate sentences without the final full stop or other punctuation marks. In addition, Infrequent n-gram Recovery is used here for the first time to create a new corpus rather than to enlarge an in-domain dataset. Finally, we obtain a small model whose quality is good enough for daily use.
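
The two sentence-level techniques named in the abstract (appending a special word to each sentence and adding copies without final punctuation) can be illustrated with a minimal sketch. This is not the authors' code: the marker string `@@END@@`, the function names, and the set of punctuation marks treated as sentence-final are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical special word; the actual token used in the paper is not given in the abstract.
END_MARKER = "@@END@@"

# Assumed set of sentence-final punctuation marks (illustrative only).
_FINAL_PUNCT = re.compile(r"[.!?…]+$")


def strip_final_punct(sentence: str) -> str:
    """Remove trailing sentence-final punctuation and any whitespace before it."""
    return _FINAL_PUNCT.sub("", sentence.strip()).rstrip()


def augment(pairs):
    """Append the end marker to every pair and add an unpunctuated copy where one exists."""
    out = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        # Original sample, with the special word appended on both sides.
        out.append((f"{src} {END_MARKER}", f"{tgt} {END_MARKER}"))
        # Additional sample without the final punctuation marks.
        src_np, tgt_np = strip_final_punct(src), strip_final_punct(tgt)
        if src_np and tgt_np and (src_np, tgt_np) != (src, tgt):
            out.append((f"{src_np} {END_MARKER}", f"{tgt_np} {END_MARKER}"))
    return out


if __name__ == "__main__":
    for pair in augment([("I am here.", "Estoy aquí."), ("Hello", "Hola")]):
        print(pair)
```

In this sketch the marker gives the model an explicit end-of-input signal that does not depend on punctuation, while the extra samples expose it to unpunctuated input. Opening marks such as the Spanish `¿` are left untouched here; how the paper handles them is not stated in the abstract.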