Automatic Acquisition of Machine Translation Resources in the Abu-MaTran Project

Antonio Toral Ruiz; Tommi Pirinen; Andy Way; Raphaël Rubino; Gema Ramírez Sánchez; Sergio Ortiz Rojas; Víctor M. Sánchez Cartagena; Jorge Fernández Tordera; Mikel L. Forcada Zubizarreta; Miquel Esplà Gomis; Nikola Ljubesic; Filip Klubicka; Prokopis Prokopidis; Vassilis Papavassiliou

Published in: Procesamiento del Lenguaje Natural, ISSN 1135-5948, No. 55, 2015, pp. 185-188
Language: English
Parallel titles:
- Adquisición automática de recursos para traducción automática en el proyecto Abu-MaTran
Abstract
- Spanish
  Este artículo presenta una panorámica de las actividades de investigación y desarrollo destinadas a aliviar el cuello de botella que supone la falta de recursos lingüísticos en el campo de la traducción automática que se han llevado a cabo en el ámbito del proyecto Abu-MaTran. Hemos desarrollado un conjunto de herramientas para la adquisición de los principales recursos requeridos por las dos aproximaciones más comunes a la traducción automática, modelos estadísticos (corpus) y basados en reglas (diccionarios y reglas). Todas estas herramientas han sido publicadas con licencias libres y han sido desarrolladas con el objetivo de ser útiles para ser explotadas en el ámbito comercial.
- English
  This paper provides an overview of the research and development activities carried out within the Abu-MaTran project to alleviate the language-resource bottleneck in machine translation. We have developed a range of tools for acquiring the main resources required by the two most popular approaches to machine translation: statistical models (corpora) and rule-based models (dictionaries and rules). All of these tools have been released under open-source licenses and were developed with the aim of being useful for industrial exploitation.