Exploring cross-lingual word embeddings for the inference of bilingual dictionaries

  • Marcos Garcia [1] ; Marcos García-Salido [1] ; Miguel A. Alonso [1]
    1. [1] Universidade da Coruña, A Coruña, Spain

  • Published in: Proceedings of TIAD-2019 Shared Task – Translation Inference Across Dictionaries co-located with the 2nd Language, Data and Knowledge Conference (LDK 2019): Leipzig, Germany, May 20, 2019 / Jorge Gracia (ed.), Besim Kabashi (ed.), Ilan Kernerman (ed.), 2019, pp. 32-41
  • Language: English
  • Abstract
    • We describe four systems that automatically generate bilingual dictionaries from existing ones: three transitive systems that differ only in the pivot language used, and a fourth system based on a different approach that needs only monolingual corpora in the source and target languages. All four methods use cross-lingual word embeddings trained on monolingual corpora and then mapped into a shared vector space. Experimental results confirm that our strategy achieves good coverage and recall, with performance on the TIAD 2019 gold standard set comparable to that of the best systems submitted by the teams participating in the TIAD shared task.

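
As an illustration of the transitive strategy summarized in the abstract, the following minimal Python sketch composes a source-to-pivot dictionary with a pivot-to-target dictionary and keeps only the candidates whose embeddings are close to the source word in a shared vector space. The function names, the toy dictionaries, the two-dimensional vectors, and the 0.5 threshold are all assumptions made for illustration; this record does not specify how the authors combine pivot translations with embedding similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def infer_translations(word, src_to_pivot, pivot_to_tgt, src_vecs, tgt_vecs, threshold=0.5):
    """Hypothetical pivot-based inference: collect target candidates through the
    pivot language, then rank them by similarity in the shared embedding space."""
    candidates = set()
    for pivot_word in src_to_pivot.get(word, []):
        candidates.update(pivot_to_tgt.get(pivot_word, []))

    scored = []
    for cand in candidates:
        # Skip words that have no embedding on either side.
        if word in src_vecs and cand in tgt_vecs:
            score = cosine(src_vecs[word], tgt_vecs[cand])
            if score >= threshold:
                scored.append((cand, score))
    # Best-scoring candidates first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed): English source, Spanish pivot, French target.
src_to_pivot = {"bank": ["banco"]}
pivot_to_tgt = {"banco": ["banque", "banc"]}  # "banco" is ambiguous in the pivot
src_vecs = {"bank": [1.0, 0.0]}
tgt_vecs = {"banque": [0.9, 0.1], "banc": [0.0, 1.0]}

if __name__ == "__main__":
    for target, score in infer_translations("bank", src_to_pivot, pivot_to_tgt, src_vecs, tgt_vecs):
        print(f"bank -> {target} ({score:.2f})")
```

In this toy setting the pivot word is ambiguous, so plain dictionary composition would propose both target words; the similarity filter is what discards the unrelated sense. With real data, the vectors would come from embeddings trained on monolingual corpora and mapped into a common space, as the abstract describes.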
