Improved Named Entity Recognition using Machine Translation-based Cross-lingual Information

  • Authors: Sandipan Dandapat, Andy Way
  • Published in: Computación y Sistemas (CyS), ISSN 1405-5546, e-ISSN 2007-9737, Vol. 20, No. 3, 2016, pp. 495-504
  • Language: English
  • DOI: 10.13053/cys-20-3-2468
  • Abstract
    • In this paper, we describe a technique to improve named entity recognition in a resource-poor language (Hindi) by using cross-lingual information. We use an online machine translation system and a separate word alignment phase to find the projection of each Hindi word into the translated English sentence. We estimate the cross-lingual features using an English named entity recognizer and the alignment information. We use these cross-lingual features in a support vector machine-based classifier. The use of cross-lingual features improves the F1 score by 2.1 points absolute (2.9% relative) over a well-performing baseline model.

Article metadata obtained from SciELO México.
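
The projection step described in the abstract (mapping each Hindi word to its aligned English word and borrowing that word's named entity tag as a feature) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `project_ne_tags` and `token_features`, the tag labels, the romanized example sentence, and the alignment pairs are all hypothetical, and the translation, word alignment, and English NER outputs are assumed to be already available as plain Python lists.

```python
from typing import Dict, List, Tuple


def project_ne_tags(
    hindi_tokens: List[str],
    english_tags: List[str],
    alignment: List[Tuple[int, int]],
    default_tag: str = "O",
) -> List[str]:
    """Give each Hindi token the NE tag of an English token it is aligned to.

    `alignment` holds (hindi_index, english_index) pairs. Unaligned Hindi
    tokens keep `default_tag`. This tie-breaking rule is an assumption made
    for illustration, not something stated in the abstract.
    """
    projected = [default_tag] * len(hindi_tokens)
    for hi_idx, en_idx in alignment:
        in_range = 0 <= hi_idx < len(hindi_tokens) and 0 <= en_idx < len(english_tags)
        # Keep the first entity tag found; later alignments do not overwrite it.
        if in_range and projected[hi_idx] == default_tag:
            projected[hi_idx] = english_tags[en_idx]
    return projected


def token_features(hindi_tokens: List[str], projected_tags: List[str]) -> List[Dict[str, str]]:
    """Build one feature dict per token, adding the projected tag as a
    cross-lingual feature next to a simple monolingual one (the word itself)."""
    return [
        {"word": word, "xling_tag": tag}
        for word, tag in zip(hindi_tokens, projected_tags)
    ]


# Hypothetical example (Hindi shown romanized for readability).
hindi = ["Sachin", "Mumbai", "mein", "rahte", "hain"]
english_tags = ["PERSON", "O", "O", "LOCATION"]  # tags for: Sachin lives in Mumbai
alignment = [(0, 0), (1, 3), (2, 2), (3, 1), (4, 1)]

projected = project_ne_tags(hindi, english_tags, alignment)
features = token_features(hindi, projected)
```

Feature dicts of this kind could then be vectorized and passed to any classifier, such as the support vector machine mentioned in the abstract; the choice of additional monolingual features is left open here.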
