Coreference Resolution for Morphologically Rich Languages. Adaptation of the Stanford System to Basque.

  • Authors: Ander Soraluze Irureta, Olatz Arregi Uriarte, Xabier Arregi Iparragirre, Arantza Díaz de Ilarraza Sánchez
  • Published in: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 55, 2015, pp. 23-30
  • Language: English
  • Parallel titles:
    • Resolución de coreferencia para lenguajes morfológicamente ricas. Adaptación del sistema de Stanford al euskera
  • Abstract
    • Spanish (translated)

      This article presents the process of adapting the Stanford coreference resolution system to Basque, an agglutinative, head-final, pro-drop language. The system has been integrated into a linguistic analysis pipeline, so it receives as input Basque texts that have already been processed and analyzed. We have shown that exploiting the linguistic characteristics of the language can improve coreference resolution. For agglutinative languages, the use of morphosyntactic features clearly improves system performance, yielding a gain in CoNLL F1 of 5 points with automatic mentions and 7.87 points with gold mentions.

    • English

      This paper presents the adaptation of the Stanford coreference resolution system to Basque, an agglutinative, head-final, pro-drop language. The adapted system has been integrated into a global linguistic analysis pipeline, so that its input is raw Basque text that has been linguistically processed and annotated. We demonstrate that language-specific characteristics have a noteworthy effect on coreference resolution. For agglutinative languages, the use of morphosyntactic features substantially improves the system's performance, yielding a gain in CoNLL F1 of 5 points when automatic mentions are used and of 7.87 points when gold mentions are provided.

