
Documat


A Bidirectional Recurrent Neural Language Model for Machine Translation

  • Authors: Àlvar Peris Blanes, Francisco Casacuberta Nolla
  • Published in: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 55, 2015, pp. 109-116
  • Language: English
  • Parallel titles:
    • Un modelo de lenguaje neuronal recurrente bidireccional para la traducción automática
  • Abstract

      A language model based on continuous representations of words is presented and applied to a statistical machine translation task. The model is implemented by means of a bidirectional recurrent neural network, which takes into account both the past and the future context of a word when making predictions. Because training this model is computationally expensive, an instance selection algorithm is used to obtain relevant training data, aiming to capture the information most useful for translating a given test set. The results show that the neural model trained on the selected data outperforms an n-gram language model.
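The bidirectional recurrence described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the toy vocabulary, the dimensions, the random weights, and the plain tanh recurrences are all illustrative assumptions. A forward pass summarizes the words before position t, a backward pass summarizes the words after it, and the two hidden states are concatenated to predict the word at t.

```python
import numpy as np

rng = np.random.default_rng(0)
V, E, H = 5, 8, 6  # toy vocabulary, embedding and hidden sizes (hypothetical)

Emb = rng.normal(scale=0.1, size=(V, E))          # word embedding matrix
Wf = rng.normal(scale=0.1, size=(H, E))           # forward input weights
Uf = rng.normal(scale=0.1, size=(H, H))           # forward recurrent weights
Wb = rng.normal(scale=0.1, size=(H, E))           # backward input weights
Ub = rng.normal(scale=0.1, size=(H, H))           # backward recurrent weights
Wo = rng.normal(scale=0.1, size=(V, 2 * H))       # output projection

def softmax(z):
    z = z - z.max()                               # numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict(sentence, t):
    """Distribution over the word at position t, conditioned on both
    its past context (words 0..t-1) and its future context (t+1..end)."""
    x = Emb[sentence]
    hf = np.zeros(H)
    for i in range(t):                            # forward pass over the past
        hf = np.tanh(Wf @ x[i] + Uf @ hf)
    hb = np.zeros(H)
    for i in range(len(sentence) - 1, t, -1):     # backward pass over the future
        hb = np.tanh(Wb @ x[i] + Ub @ hb)
    return softmax(Wo @ np.concatenate([hf, hb])) # combine both contexts

probs = predict([0, 3, 1, 4, 2], t=2)             # probabilities for position 2
```

An n-gram model, by contrast, conditions only on a fixed window of preceding words; the bidirectional network is what lets the future context enter the prediction.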

  • References
    • Bahdanau, D., K. Cho, and Y. Bengio. 2014. Neural machine translation by jointly learning to align and translate. Technical report, arXiv...
    • Baltescu, P., P. Blunsom, and H. Hoang. 2014. OxLM: A neural language modelling framework for machine translation. The Prague Bulletin of...
    • Bengio, Y., R. Ducharme, P. Vincent, and C. Jauvin. 2003. A neural probabilistic language model. Journal of Machine Learning Research.
    • Biçici, E. and D. Yuret. 2011. Instance selection for machine translation using feature decay algorithms. In Proceedings of the Sixth Workshop...
    • Biçici, E. and D. Yuret. 2015. Optimizing instance selection for statistical machine translation with feature decay algorithms. Audio, Speech,...
    • Chen, S. F. and J. Goodman. 1998. An empirical study of smoothing techniques for language modeling. Technical Report TR10-98, Computer Science...
    • Devlin, J., R. Zbib, Z. Huang, T. Lamar, R. Schwartz, and J. Makhoul. 2014. Fast and robust neural network joint models for statistical machine...
    • Eck, M., S. Vogel, and A. Waibel. 2005. Low cost portability for statistical machine translation based on n-gram frequency and TF-IDF. In...
    • Gascó, G., M. A. Rocha, G. Sanchis-Trilles, J. Andrés-Ferrer, and F. Casacuberta. 2012. Does more data always yield better translations? In...
    • Graves, A. 2013. Generating sequences with recurrent neural networks. arXiv:1308.0850 [cs.NE].
    • Khadivi, S. and C. Goutte. 2003. Tools for corpus alignment and evaluation of the alignments (deliverable d4.9). Technical report, Technical...
    • Luong, T., I. Sutskever, Q. V. Le, O. Vinyals, and W. Zaremba. 2014. Addressing the rare word problem in neural machine translation. arXiv...
    • Mandal, A., D. Vergyri, W. Wang, J. Zheng, A. Stolcke, G. Tur, D. Hakkani-Tur, and N. F. Ayan. 2008. Efficient data selection for machine...
    • Nelder, J. A. and R. Mead. 1965. A simplex method for function minimization. The Computer Journal, 7(4):308–313.
    • Mikolov, T. 2012. Statistical Language Models based on Neural Networks. Ph.D. thesis, Brno University of Technology.
    • Ortiz-Martínez, D. and...
    • Pascanu, R., Ç. Gülçehre, K. Cho, and Y. Bengio. 2014. How to construct deep recurrent neural networks.
    • Schuster, M. and K. K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11):2673–2681.
    • Schwenk, H. 2013. CSLM - a modular open-source continuous space language modeling toolkit. In INTERSPEECH, pages 1198–1202. ISCA.
    • Sundermeyer, M., T. Alkhouli, J. Wuebker, and H. Ney. 2014. Translation modeling with bidirectional recurrent neural networks. In Proceedings...
    • Sundermeyer, M., R. Schlüter, and H. Ney. 2012. LSTM neural networks for language modeling. In Interspeech, pages 194–197.
    • Sutskever, I., O. Vinyals, and Q. V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing...
    • Wang, R., H. Zhao, B-L. Lu, M. Utiyama, and E. Sumita. 2014. Neural network based bilingual language model growing for statistical machine...
    • Werbos, P. J. 1990. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550–1560.
