Multimodal Emotional Recognition for Human-Robot Interaction in Social Robotics

    1. [1] Universitat d'Alacant, Alicante, Spain

  • Published in: Proceedings of the XXIV Workshop of Physical Agents: September 5-6, 2024 / coordinated by Miguel Cazorla Quevedo, Francisco Gómez Donoso, Félix Escalona Moncholi, 2024, ISBN 978-84-09-63822-2, pp. 220-234
  • Language: Spanish
  • Abstract
    • This study explores the enhancement of human-robot interaction (HRI) through multimodal emotional recognition within social robotics, using the humanoid robot Pepper as a testbed. Despite the advanced interactive capabilities of robots like Pepper, their ability to accurately interpret and respond to human emotions remains limited.

      This paper addresses these limitations by integrating visual, auditory, and textual analyses to improve emotion recognition accuracy and contextual understanding. By leveraging multimodal data, the study aims to facilitate more natural and effective interactions between humans and robots, particularly in assistive, educational, and healthcare settings. The methods employed include convolutional neural networks for visual emotion detection, audio processing techniques for auditory emotion analysis, and natural language processing for text-based sentiment analysis. The results demonstrate that the multimodal approach significantly enhances the robot’s interactive and empathetic capabilities. This paper discusses the specific improvements observed, the challenges encountered, and potential future directions for research in multimodal emotional recognition in HRI.
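
      The abstract names the building blocks (a CNN over face images, audio processing for speech emotion, NLP for text sentiment) but gives no implementation detail. As a minimal late-fusion sketch, assuming PyTorch, a hypothetical four-emotion label set, and toy per-modality networks (VisualNet, AudioNet), the snippet below shows one common way such a pipeline combines modalities: score each modality independently, then average the per-class probabilities. None of the names, sizes, or labels here come from the paper.

      ```python
      # Illustrative late-fusion sketch; NOT the paper's implementation.
      import torch
      import torch.nn as nn

      EMOTIONS = ["angry", "happy", "neutral", "sad"]  # hypothetical label set

      class VisualNet(nn.Module):
          """Tiny CNN stand-in for the visual emotion branch."""
          def __init__(self, n_classes: int):
              super().__init__()
              self.features = nn.Sequential(
                  nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1),             # -> (B, 16, 1, 1)
              )
              self.head = nn.Linear(16, n_classes)

          def forward(self, x):                        # x: (B, 3, H, W) face crops
              return self.head(self.features(x).flatten(1))

      class AudioNet(nn.Module):
          """MLP over fixed-size audio features (e.g., mel statistics)."""
          def __init__(self, n_feats: int, n_classes: int):
              super().__init__()
              self.net = nn.Sequential(nn.Linear(n_feats, 32), nn.ReLU(),
                                       nn.Linear(32, n_classes))

          def forward(self, x):                        # x: (B, n_feats)
              return self.net(x)

      def late_fusion(logits_per_modality):
          """Average per-modality class probabilities into one distribution."""
          probs = [torch.softmax(l, dim=-1) for l in logits_per_modality]
          return torch.stack(probs).mean(dim=0)

      if __name__ == "__main__":
          vis, aud = VisualNet(len(EMOTIONS)), AudioNet(40, len(EMOTIONS))
          face = torch.randn(1, 3, 64, 64)             # dummy face crop
          mel = torch.randn(1, 40)                     # dummy audio feature vector
          fused = late_fusion([vis(face), aud(mel)])   # a text branch would plug in the same way
          print(EMOTIONS[fused.argmax(dim=-1).item()])
      ```

      When one modality is known to be more reliable than the others, a weighted average (or a small classifier trained over the concatenated per-modality outputs) is the usual refinement of this plain mean.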

