Multiword Expression Processing in Galician Using Deep Learning

  • Authors: Yerai Doval, Elmurod Kuriyozov, Víctor Manuel Darriba Bilbao
  • Published in: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 67, 2021, pp. 45-57
  • Language: Spanish
  • Original title:
    • Procesamiento de Expresiones Multipalabra en gallego mediante Aprendizaje Profundo
  • Abstract

      The treatment of Multiword Expressions remains an open task in Natural Language Processing. In this work, we aim to determine experimentally how useful Machine Learning models are for processing Multiword Expressions in Galician. To that end, we use CORGA, a 40-million-word corpus, to train Deep Learning transformer models, and we compare their performance with that of more traditional conditional random field models.

