Special Session of CBMI 2024 conference: Multimodal Data Analysis for Understanding of Human Behaviour, Emotions and their Reasons

This special session addresses the processing of all types of data related to understanding human behaviour, emotions, and their reasons, such as current or past context. Understanding human behaviour and context may benefit many services, both online and in physical spaces. For example, detecting a lack of skills, confusion or other negative states may help to adapt online learning programmes, to detect a bottleneck in a production line, to recognise poor workplace culture, or to detect a dangerous spot on a road before any accident happens there. Detecting unusual behaviour may help to improve the security of travellers and the safety of dementia sufferers and visually or hearing impaired individuals, for example by helping them to stay away from potentially dangerous strangers, e.g., drunk people or football fans forming a big crowd.

In the context of multimedia retrieval, understanding human behaviour and emotions could help not only with multimedia indexing, but also with deriving implicit (i.e., other than intentionally reported) human feedback regarding multimedia news, videos, advertisements, navigators, hotels, shopping items etc., and thereby improve multimedia retrieval.

Humans are good at understanding other humans, their emotions and reasons. For example, when looking at people engaged in different activities (sport, driving, working on a computer, working on a construction site, using public transport etc.), a human observer can understand whether a person is engaged in the task or distracted, or whether the person stopped a recommended video because it was not interesting or because they quickly found what they needed at the beginning of the video. After observing another human for some time, humans can also learn that person's tastes, skills and personality traits.

Hence the interest of this session is how to improve AI understanding of the same aspects. The topics include (but are not limited to) the following:

  • Use of various sensors for monitoring and understanding human behaviour, emotion/mental state/cognition, and context: video, audio, infrared, wearables, virtual (e.g., mobile device usage, computer usage) etc.
  • Methods for information fusion, including information from various heterogeneous sources
  • Methods to learn human traits and preferences from long term observations
  • Methods to detect human implicit feedback from past and current observations
  • Methods to assess task performance: skills, emotions, confusion, engagement in the task, context
  • Methods to detect potential security and safety threats and risks
  • Methods to adapt behavioural and emotional models to different end users and contexts without collecting a lot of labels from each user and/or for each context: transfer learning, semi-supervised learning, anomaly detection, one-shot learning etc.
  • How to collect data for training AI methods from various sources, e.g., internet, open data, field pilots etc.
  • Use of behavioural or emotional data to model humans and adapt services either online or in physical spaces
  • Ethics and privacy issues in modelling human emotions, behaviour, context and reasons


Elena Vildjiounaite, VTT Technical Research Centre of Finland. Contact: elena.vildjiounaite@vtt.fi

Johanna Kallio, VTT Technical Research Centre of Finland. Contact: johanna.kallio@vtt.fi

Sari Järvinen, VTT Technical Research Centre of Finland. Contact: sari.jarvinen@vtt.fi

Satu-Marja Mäkelä, VTT Technical Research Centre of Finland. Contact: Satu-Marja.Makela@vtt.fi

Johannes Peltola, VTT Technical Research Centre of Finland. Contact: johannes.peltola@vtt.fi

Benjamin Allaert, IMT-Nord-Europe, France. Contact: benjamin.allaert@imt-nord-europe.fr

Ioan Marius Bilasco, University of Lille, France. Contact: marius.bilasco@univ-lille.fr

Franziska Schmalfuss, IAV GmbH, Germany. Contact: franziska.schmalfuss@iav.de