Combining heterogeneous inputs for the development of adaptive and multimodal interaction systems

  • GRIOL, David [1]; GARCÍA-HERRERO, Jesús [1]; MOLINA, José Manuel [1]
    [1] Carlos III University of Madrid
  • Published in: ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, ISSN-e 2255-2863, Vol. 2, No. 6, 2013, pp. 37-53
  • Language: English
  • DOI: 10.14201/ADCAIJ2014263753
  • Abstract
    • In this paper we present a novel framework for the integration of visual sensor networks and speech-based interfaces. Our proposal follows the standard reference architecture for fusion systems (JDL) and combines different techniques related to Artificial Intelligence, Natural Language Processing and User Modeling to provide enhanced interaction with users. First, the framework integrates a Cooperative Surveillance Multi-Agent System (CS-MAS), which includes several types of autonomous agents working in a coalition to track targets and make inferences about their positions. Second, enhanced conversational agents facilitate human-computer interaction by means of speech. Third, a statistical methodology models the user's conversational behavior, which is learned from an initial corpus and improved with the knowledge acquired from successive interactions. Finally, a technique is proposed to fuse these multimodal information sources and take the result into account when deciding the next system action.

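The next-action selection described in the abstract can be pictured with a minimal sketch. The Python fragment below is an illustration only, not the authors' implementation: the toy corpus, the state and context labels, and the next_action helper are all hypothetical, and a simple frequency model stands in for the statistical user/dialogue model that the paper learns from a corpus and refines over successive interactions.

    from collections import Counter, defaultdict

    # Hypothetical toy corpus of observed turns:
    # (dialogue_state, visual_context) -> next system action.
    # The visual_context would come from the surveillance multi-agent system,
    # the dialogue_state from the spoken dialogue interface.
    corpus = [
        (("ask_location", "target_in_lobby"), "confirm_location"),
        (("ask_location", "target_in_lobby"), "confirm_location"),
        (("ask_location", "target_unknown"), "request_repeat"),
        (("confirm_location", "target_in_lobby"), "provide_directions"),
    ]

    # Estimate P(action | state, context) as relative frequencies.
    model = defaultdict(Counter)
    for (state, context), action in corpus:
        model[(state, context)][action] += 1

    def next_action(dialogue_state, visual_context):
        """Fuse the spoken-dialogue state with the visual-sensor context
        and return the most frequent next action seen for that pair."""
        counts = model.get((dialogue_state, visual_context))
        if not counts:
            # Fallback when this (state, context) pair was never observed.
            return "ask_clarification"
        return counts.most_common(1)[0][0]

    print(next_action("ask_location", "target_in_lobby"))     # confirm_location
    print(next_action("ask_location", "target_in_corridor"))  # ask_clarification

The fallback action is a design choice of this sketch: an unseen state/context pair triggers a clarification request rather than a guess, which mirrors the cautious behavior one would want before committing to a system action.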
