Anotando la confiabilidad para mejorar la tarea de detección de desinformación: esquema de anotación, recurso y evaluación

Estela Saquete Boró; Patricio Martínez Barco; Alba Bonet Jover; Robiert Sepúlveda Torres

Authors: Estela Saquete Boró, Patricio Martínez Barco, Alba Bonet Jover, Robiert Sepúlveda Torres
Published in: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 70, 2023, pp. 15-26
Language: multiple languages
Parallel titles:
- Annotating reliability to enhance disinformation detection: annotation scheme, resource and evaluation
Abstract
  Disinformation is a critical problem in our society. The COVID-19 pandemic and the Russia-Ukraine war have been key settings for the spread of fake news. On the premise that fake news mixes reliable and unreliable information, we propose RUN-AS (Reliable and Unreliable Annotation Scheme), a fine-grained annotation scheme that labels the structural parts and essential content elements of a news item so that they can be classified as Reliable or Unreliable. This type of annotation will be used to train systems that automatically classify the reliability of a news item. To this end, the RUN dataset in Spanish was built and annotated with RUN-AS. A set of experiments was conducted to validate the annotation scheme. The experiments support the validity of the proposed scheme, with the best result reaching an F1m of 0.948.