Exploring cross-lingual word embeddings for the inference of bilingual dictionaries

Marcos García González; Marcos García Salido; Miguel Á. Alonso

Marcos Garcia [1]; Marcos García-Salido [1]; Miguel A. Alonso [1]
[1] Universidade da Coruña, A Coruña, Spain
Published in: Proceedings of TIAD-2019 Shared Task – Translation Inference Across Dictionaries, co-located with the 2nd Language, Data and Knowledge Conference (LDK 2019): Leipzig, Germany, May 20, 2019 / Jorge Gracia (ed.), Besim Kabashi (ed.), Ilan Kernerman (ed.), 2019, pp. 32-41
Language: English
Abstract
- We describe four systems that automatically generate bilingual dictionaries from existing ones: three transitive systems that differ only in the pivot language used, and a fourth system based on a different approach, which needs only monolingual corpora in the source and target languages. All four methods use cross-lingual word embeddings that are trained on monolingual corpora and then mapped into a shared vector space. Experimental results confirm that our strategy achieves good coverage and recall, with performance comparable to that of the best systems submitted to the TIAD 2019 shared task, as evaluated on its gold standard set.
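The transitive idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy dictionaries, the three-dimensional vectors, and the similarity threshold are all hypothetical, chosen only to show how candidates reached through a pivot language might be filtered by cosine similarity in a shared embedding space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def infer_translations(src_to_pivot, pivot_to_tgt, src_vecs, tgt_vecs, threshold=0.3):
    """Collect target candidates through the pivot language, then keep those
    whose vector is close enough to the source word in the shared space.
    The threshold value is an assumption for illustration only."""
    inferred = {}
    for src_word, pivot_words in src_to_pivot.items():
        # Step 1: transitive candidates (source -> pivot -> target).
        candidates = set()
        for pivot_word in pivot_words:
            candidates.update(pivot_to_tgt.get(pivot_word, []))
        # Step 2: score each candidate against the source word.
        scored = []
        for tgt_word in candidates:
            if src_word in src_vecs and tgt_word in tgt_vecs:
                score = cosine(src_vecs[src_word], tgt_vecs[tgt_word])
                if score >= threshold:
                    scored.append((tgt_word, score))
        # Highest similarity first; ties broken alphabetically.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        inferred[src_word] = scored
    return inferred


# Toy data (hypothetical): Portuguese -> English (pivot) -> French.
src_to_pivot = {"banco": ["bank", "bench"]}
pivot_to_tgt = {"bank": ["banque", "rive"], "bench": ["banc"]}

# Toy vectors, assumed to be already mapped into one shared space.
src_vecs = {"banco": [0.9, 0.4, 0.0]}
tgt_vecs = {
    "banque": [1.0, 0.0, 0.0],
    "banc": [0.0, 1.0, 0.0],
    "rive": [0.0, 0.0, 1.0],
}

result = infer_translations(src_to_pivot, pivot_to_tgt, src_vecs, tgt_vecs)
for word, score in result["banco"]:
    print(f"banco -> {word} ({score:.2f})")
```

In this toy setup the pivot introduces a spurious candidate ("rive", reached through the river sense of "bank"), and the similarity filter discards it because its vector is orthogonal to that of the source word. Real systems would load trained embeddings rather than hand-written vectors.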