Automatic counter-narrative generation for hate speech in Spanish

Arturo Montejo Ráez; María Teresa Martín Valdivia; M. Estrella Vallecillo Rodríguez

Published in: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 71, 2023, pp. 227-245
Language: English
Parallel titles:
- Generación automática de contranarrativas para discursos de odio en español

Dialnet Metrics: 2 citations

Abstract
This paper analyzes the use of language models to automatically generate counter-narratives to hate speech in Spanish. Although a few studies exist for English and other languages, no previous work has addressed this task for Spanish. The article shows that GPT-3 outperforms other models at generating non-offensive, informative counter-narratives, which sometimes include compelling arguments. We used several few-shot learning approaches, applying different prompting strategies and analyzing the results for each. In addition, a new corpus, CONAN-SP, consisting of 238 pairs of hate speech and counter-narratives in Spanish, has been made available to the research community to facilitate further research in this area. These findings highlight the potential of language models to combat hate speech in Spanish through counter-narrative generation.