A comparison of small sample methods for Handshape Recognition

Franco Ronchetti; Facundo Quiroga; Ulises Jeremias Cornejo Fandos; Gastón Gustavo Rios; Pedro Dal Bianco; Waldo Hasperué; Laura Cristina Lanzarini

Affiliation (all authors): Universidad Nacional de La Plata, Argentina
Published in: Journal of Computer Science and Technology, ISSN-e 1666-6038, Vol. 23, No. 1, 2023
Language: English
DOI: 10.24215/16666038.23.e03
Parallel titles:
- Una Comparación de Métodos para Reconocimiento de Formas de Manos con Pocas Muestras
Abstract
- Spanish (translated)
  Automatic sign language translation (SLT) systems can be a great help in improving communication with deaf communities as well as among them. Currently, the main obstacle to developing effective translation models is the lack of labelled data, which prevents the use of modern deep learning methods. Sign language translation is a complex problem involving several subtasks, of which handshape recognition is the most important. In this work, we compare a series of models specially adapted to be trained with few samples on the task of recognizing handshapes. We evaluate the Wide-DenseNet and Prototypical Networks models, with and without transfer learning, and also the Model-Agnostic Meta-Learning (MAML) model. Our results indicate that Wide-DenseNet without transfer learning and Prototypical Networks with transfer learning obtain the best results. Prototypical Networks are vastly superior when fewer than 30 samples are used, while Wide-DenseNet is superior in the remaining cases. MAML, on the other hand, which is a method designed specifically for these cases, does not improve performance in any case. These results can help to better design the components of a sign language translation system.
- English
  Automatic Sign Language Translation (SLT) systems can be a great asset for improving communication with and within deaf communities. Currently, the main issue preventing effective translation models lies in the low availability of labelled data, which hinders the use of modern deep learning models.
  
  SLT is a complex problem that involves many subtasks, of which handshape recognition is the most important. We compare a series of models specially tailored for small datasets on handshape recognition tasks. We evaluate Wide-DenseNet and few-shot Prototypical Network models, each with and without transfer learning, as well as Model-Agnostic Meta-Learning (MAML).
  
  Our findings indicate that Wide-DenseNet without transfer learning and Prototypical Networks with transfer learning provide the best results. Prototypical Networks in particular are vastly superior when using fewer than 30 samples, while Wide-DenseNet achieves the best results with more samples. MAML, on the other hand, does not improve performance in any scenario. These results can help to design better SLT models.
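
As a rough illustration of why Prototypical Networks can work with very few samples, the sketch below shows only their classification rule: each class is summarized by the mean of its support embeddings (its prototype), and a query is assigned to the nearest prototype. It is a minimal sketch under stated assumptions, not the authors' implementation: the embedding network is omitted, and the 2-D vectors, the class names `fist` and `open_palm`, and the helper names `class_prototypes` and `classify` are all hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def class_prototypes(support):
    """Average the embeddings of each class to get one prototype per class.

    `support` maps a class label to a list of embedding vectors
    (in a real system these would come from a trained network).
    """
    prototypes = {}
    for label, vectors in support.items():
        n = len(vectors)
        # zip(*vectors) groups the values of each dimension together.
        prototypes[label] = [sum(component) / n for component in zip(*vectors)]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest to `query`."""
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))


# Hypothetical embeddings: two handshape classes, three samples each.
support = {
    "fist": [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0]],
    "open_palm": [[-1.0, 0.2], [-0.8, 0.0], [-1.2, -0.2]],
}
prototypes = class_prototypes(support)
label = classify([0.7, 0.3], prototypes)
print(label)
```

Because a prototype is just an average, adding a new class only requires a handful of embedded samples rather than retraining a classifier head, which is one plausible reading of the advantage reported for small sample counts.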