AMIC: affective multimedia analytics with inclusive and natural communication

María Inés Torres Barañano; Raquel Justo Blanco; Alfonso Ortega Giménez; Eduardo Lleida Solano; Rubén San Segundo Hernández; Javier Ferreiros López; Lluís Felip Hurtado Oliver; Emilio Sanchís Arnal

Location: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 61, 2018, pp. 147-150
Language: English
Parallel titles:
- AMIC: análisis afectivo de información multimedia con comunicación inclusiva y natural
Abstract
  Traditionally, textual content has been the main source for information extraction and indexing; technologies capable of extracting information from the audio and video of multimedia documents were added later. Another major axis of analysis is the emotional and affective aspect intrinsic to human communication. Information about emotions, stances, preferences, figurative language, irony, sarcasm, etc. is fundamental and irreplaceable for a complete understanding of the content of conversations, speeches, debates, discussions, etc. The objective of this project is to advance the development and performance of speech, language, image, and video technologies for the analysis of multimedia content, and to add the extraction of affective-emotional information to this analysis. As additional steps, we will advance the methods for presenting results to the user, working on technologies for language simplification, automatic report and summary generation, emotional speech synthesis, and natural and inclusive interaction.