Face recognition in television news images using deep convolutional networks

Ricardo Asensi González; Pedro Javier Herrera Caro

Asensi-González, Ricardo [1]; Herrera, Pedro Javier [1]
[1] Universidad Nacional de Educación a Distancia, Madrid, Spain
Published in: Jornadas de Automática, ISSN-e 3045-4093, No. 46, 2025
Language: Spanish
DOI: 10.17979/ja-cea.2025.46.12046
Parallel titles:
- Face recognition in television news images using deep convolutional networks
Abstract
This work proposes an artificial intelligence system based on deep neural networks that detects and recognizes specific people in images extracted from television news broadcasts. For this purpose, a dataset of 12,800 images was created, focused mainly on Spanish national political figures. The system first detects the individual in the scene automatically with the YOLOv8 network, and then recognizes that individual using whichever classifier provides the greatest certainty. Seven neural network architectures, each adapted to this specific problem, were compared as classifiers: VGG-16, VGG-19, InceptionV3, Xception, ResNet-101, MobileNetV2, and DenseNet-169. DenseNet-169 achieved the best average performance across all tests. The results confirm the viability of the system and lay the groundwork for future research.
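The abstract states that recognition uses "the classifier that provides the greatest certainty" but does not say how certainty is measured. The following is a minimal sketch of one plausible reading, assuming that each classifier outputs a softmax probability vector over the same set of identities and that certainty means the highest top-class probability. The function name, class names, and probability values are hypothetical illustrations, not taken from the paper, and the YOLOv8 detection step is assumed to have already produced the cropped person image that each classifier scored.

```python
from typing import Dict, List, Tuple


def most_certain_prediction(
    probs_by_model: Dict[str, List[float]],
    class_names: List[str],
) -> Tuple[str, str, float]:
    """Return (model_name, class_name, confidence) from the most confident model.

    Assumption: "certainty" is the largest softmax probability a model
    assigns to any single class. The paper may define it differently.
    """
    if not probs_by_model:
        raise ValueError("at least one classifier output is required")

    best: Tuple[str, str, float] = ("", "", -1.0)
    for model_name, probs in probs_by_model.items():
        if len(probs) != len(class_names):
            raise ValueError(f"{model_name}: expected {len(class_names)} probabilities")
        # Index of the class this model considers most likely.
        top = max(range(len(probs)), key=probs.__getitem__)
        # Keep this model's answer only if it is more confident than the best so far.
        if probs[top] > best[2]:
            best = (model_name, class_names[top], probs[top])
    return best


# Hypothetical outputs of three of the compared architectures for one detected person.
class_names = ["person_a", "person_b", "person_c"]
outputs = {
    "DenseNet-169": [0.05, 0.90, 0.05],
    "VGG-16": [0.40, 0.35, 0.25],
    "MobileNetV2": [0.10, 0.70, 0.20],
}
print(most_certain_prediction(outputs, class_names))
# ('DenseNet-169', 'person_b', 0.9)
```

Under this reading, a single overconfident model can override several models that agree with each other, so an averaging or voting rule would be a reasonable alternative if the classifiers are poorly calibrated.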