Using Semantic Graphs and Word Sense Disambiguation Techniques to Improve Text Summarization

  • Authors: Laura Plaza Morales, Alberto Díaz Esteban
  • Source: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 47, 2011, pp. 97-105
  • Language: English
  • Abstract
    • Spanish (translated)

      This paper presents a method for automatic summarization based on semantic graphs. The system uses WordNet concepts and relations to build a graph that represents the document, together with a connectivity-based clustering algorithm that discovers the different topics the document covers. Sentences are selected for the summary according to whether they contain the most representative concepts of the document. The experiments show that the proposed approach obtains significantly better results than other systems evaluated under the same experimental conditions. Moreover, the system can be easily adapted to documents from other domains simply by changing the knowledge base and the method for identifying concepts in the text. Finally, the paper also studies the effect of lexical ambiguity on summarization.

    • English

      This paper presents a semantic graph-based method for extractive summarization. The summarizer uses WordNet concepts and relations to produce a semantic graph that represents the document, and a degree-based clustering algorithm is used to discover different themes or topics within the text. The selection of sentences for the summary is based on the presence in them of the most representative concepts for each topic. The method has proven to be an efficient approach to the identification of salient concepts and topics in free text. In a test on the DUC data for single document summarization, our system achieves significantly better results than previous approaches based on terms and mere syntactic information. Besides, the system can be easily ported to other domains, as it only requires modifying the knowledge base and the method for concept annotation. In addition, we address the problem of word ambiguity in semantic approaches to automatic summarization.

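
The pipeline outlined in the abstract (concept graph, degree-based clustering, sentence selection per topic) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a tiny hand-made knowledge base (`WORD_TO_CONCEPT`, `HYPERNYM`) in place of WordNet, performs no word sense disambiguation, and simplifies the clustering to "take the highest-degree vertices as hubs and attach every other vertex to its nearest hub". All function and variable names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy knowledge base standing in for WordNet:
# word -> concept, and concept -> hypernym (is-a parent).
WORD_TO_CONCEPT = {
    "dog": "dog", "cat": "cat", "puppy": "dog",
    "car": "car", "truck": "truck", "bus": "bus",
}
HYPERNYM = {
    "dog": "canine", "canine": "animal", "cat": "feline", "feline": "animal",
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
}


def concepts_in(sentence):
    """Return the set of known concepts mentioned in a sentence."""
    words = (w.strip(".,") for w in sentence.lower().split())
    return {WORD_TO_CONCEPT[w] for w in words if w in WORD_TO_CONCEPT}


def build_graph(sentences):
    """Link every concept found in the text up its is-a chain."""
    graph = defaultdict(set)
    for sentence in sentences:
        for concept in concepts_in(sentence):
            while concept in HYPERNYM:
                parent = HYPERNYM[concept]
                graph[concept].add(parent)
                graph[parent].add(concept)
                concept = parent
    return graph


def cluster_by_degree(graph, n_hubs):
    """Pick the n_hubs highest-degree vertices as hubs, then assign every
    other vertex to the closest hub with a multi-source breadth-first search."""
    hubs = sorted(graph, key=lambda v: (-len(graph[v]), v))[:n_hubs]
    owner = {hub: hub for hub in hubs}
    queue = deque(hubs)
    while queue:
        vertex = queue.popleft()
        for neighbour in sorted(graph[vertex]):
            if neighbour not in owner:
                owner[neighbour] = owner[vertex]
                queue.append(neighbour)
    return owner


def summarize(sentences, n_hubs=2):
    """Keep, for each cluster (topic), the sentence whose concepts in that
    cluster have the highest total degree."""
    graph = build_graph(sentences)
    owner = cluster_by_degree(graph, n_hubs)
    chosen = set()
    for hub in set(owner.values()):
        def score(index):
            return sum(len(graph[c]) for c in concepts_in(sentences[index])
                       if owner.get(c) == hub)
        chosen.add(max(range(len(sentences)), key=score))
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    document = [
        "The dog chased the cat.",
        "A car, a truck and a bus blocked the road.",
        "The weather was mild.",
        "The puppy slept.",
    ]
    for line in summarize(document):
        print(line)
```

Porting such a sketch to another domain would, as the abstract notes for the real system, mainly mean swapping the knowledge base and the concept annotation step (here, the two dictionaries and `concepts_in`).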
