
A Multilingual Multi-domain Data-to-Text Natural Language Generation Approach

  • Authors: Cristina Barros Catalán, Elena Lloret Pastor
  • Source: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 58, 2017, pp. 45-52
  • Language: English
  • Parallel titles:
    • Un enfoque multilingüe y multidominio de datos-a-texto para la generación de lenguaje natural
  • Abstract
    • Spanish (translated into English)

      Research into innovative and flexible multi-domain approaches can be a significant step forward in the area of Natural Language Generation. In this regard, the aim of this article is to present a statistical approach focused on the realisation stage. This approach allows the generation of sentences that fulfil a purpose given by an input “seed feature”, which guides the generation process. The approach has been tested on automatically generating sentences that express opinions for movie reviews, and also on language generation for assistive technologies addressing language-related problems. According to the results obtained, the approach is able to generate sentences for two different domains with similar performance in two different languages, achieving good results and meeting the requirements specified for each domain.

    • English

      Research in innovative and flexible multi-domain approaches may be a significant step forward in the area of Natural Language Generation. In light of this, the aim of this paper is to present a statistical approach focused on the surface realisation stage. This approach allows the generation of sentences oriented to meet the purpose given by a specific input seed feature, which guides the entire generation process. Our approach was tested on automatically generating opinionated sentences in the domain of movie reviews and on Natural Language Generation for assistive technologies. Based on the results obtained, the approach has proved able to generate sentences in two different domains and for two different languages with similar performance, obtaining good results and fulfilling the requirements specified for each domain, which opens the door to its use in new domains and applications.

