Interpretación de gestos en tiempo real empleando GestureNet en un robot social

Jesús García Martínez; Juan José Gamboa Montero; José Carlos Castillo Montoya; Álvaro Castro González; Miguel Ángel Salichs Sánchez-Caballero

García Martínez, Jesús [1]; Gamboa-Montero, Juan José [1]; Castillo, José Carlos [1]; Castro-González, Álvaro [1]; Salichs, Miguel Ángel [1]
[1] Universidad Carlos III de Madrid, Madrid, Spain
Source: Jornadas de Automática, e-ISSN 3045-4093, No. 45, 2024
Language: Spanish
DOI: 10.17979/ja-cea.2024.45.10819
Parallel titles:
- Real-Time Gesture Interpretation Using GestureNet in a Social Robot
Abstract
  This paper presents the development and integration of a hand gesture classifier in a social robot, with the aim of improving visual communication during human-robot interaction. In addition to the robot's existing capabilities of listening to the user's voice and receiving touch commands through an auxiliary tablet, the ability to interpret visual gestures has been implemented. These gestures include hand signals for affirmation and negation, as well as closed and open hands, among others. A dataset was generated to train the classification model, and we used an architecture designed specifically for this purpose. As a use case for the classifier, an application of the traditional game of rock, paper, scissors was developed. In this game, the classification model runs in real time during the interaction with the user. Both the detection module and the game skill have been fully integrated into the robot's architecture, providing a smooth and natural user experience through this communication channel.
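
The rock, paper, scissors use case described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label names, the `stable_gesture` majority vote over recent frames, and the `min_share` threshold are all assumptions introduced here to show how per-frame classifier outputs might be turned into a single move and then scored.

```python
from collections import Counter
from typing import Optional, Sequence

# Hypothetical label names; the classifier's real class names are not given in the abstract.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def stable_gesture(frame_labels: Sequence[str], min_share: float = 0.6) -> Optional[str]:
    """Return the majority label over recent frames, or None if no label dominates.

    Assumption: smoothing over a short window reduces flicker in per-frame predictions.
    """
    if not frame_labels:
        return None
    label, count = Counter(frame_labels).most_common(1)[0]
    return label if count / len(frame_labels) >= min_share else None


def round_outcome(user: str, robot: str) -> str:
    """Return 'user', 'robot', or 'draw' for one round of the game."""
    if user not in BEATS or robot not in BEATS:
        raise ValueError(f"unknown gesture: {user!r} vs {robot!r}")
    if user == robot:
        return "draw"
    return "user" if BEATS[user] == robot else "robot"
```

For example, a window of per-frame labels such as `["rock", "rock", "paper", "rock", "rock"]` would be reduced to one move by `stable_gesture`, and that move would then be passed to `round_outcome` together with the robot's choice.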