Modality- and Subject-Aware Emotion Recognition Using Knowledge Distillation

Mehmet Ali Sarikaya, Gokhan Ince*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal emotion recognition has the potential to impact various fields, including human-computer interaction, virtual reality, and emotional intelligence systems. This study introduces a comprehensive framework that enhances the accuracy and computational efficiency of emotion recognition by leveraging knowledge distillation and transfer learning, incorporating both unimodal and multimodal models. The framework also combines subject-specific and subject-independent models, achieving a balance between localization and generalization. Subject-independent models include EEG-based, non-EEG-based (i.e., electromyography, electrooculography, electrodermal activity, galvanic skin response, skin temperature, respiration, blood volume pulse, heart rate, and eye movements), and multimodal models trained on all training subjects, capturing a broader context. Subject-specific models, including EEG-based, non-EEG-based, and multimodal models, are trained on individual subjects to provide localized knowledge. The proposed framework then distills knowledge from these teacher models into a student model, utilizing six different distillation losses to combine both subject-independent and subject-specific insights. This approach makes the model subject-aware by using local patterns and modality-aware by incorporating unimodal data, enhancing the robustness and generalizability of emotion recognition systems to varied real-world scenarios. The framework was tested on two well-known datasets, SEED-V and DEAP, as well as an immersive three-dimensional (3D) virtual reality (VR) dataset, GraffitiVR, which captures emotional and behavioral responses from individuals experiencing urban graffiti in a VR environment. This broader application provides insights into the effectiveness of emotion recognition models in both 2D and 3D settings, enabling a wider range of assessments. Empirical results demonstrate that the proposed knowledge distillation-based model significantly elevates performance across all datasets when compared to traditional models. Specifically, the model achieved improvements ranging from 6.56% to 24.59% over unimodal models and from 1.56% to 4.11% over multimodal approaches across the SEED-V, DEAP, and GraffitiVR datasets. These results underscore the robustness and effectiveness of the proposed approach, suggesting that it significantly enhances emotion recognition processes across various environmental settings.
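
To make the distillation step concrete, the sketch below (PyTorch-style Python, not the authors' released code) shows one common way to combine soft targets from several teacher models, such as subject-independent and subject-specific, EEG-based, non-EEG-based, and multimodal teachers, with the hard-label loss when training the student. The function name, per-teacher weights, temperature, and weighting factor alpha are illustrative assumptions rather than values taken from the paper.

    import torch
    import torch.nn.functional as F

    def multi_teacher_distillation_loss(student_logits, teacher_logits_list,
                                        labels, teacher_weights,
                                        temperature=4.0, alpha=0.5):
        """Combine a hard-label loss with one soft-target loss per teacher.

        student_logits:      (batch, classes) logits from the student model
        teacher_logits_list: list of (batch, classes) logits, one per teacher
                             (e.g., subject-independent / subject-specific,
                             EEG-based / non-EEG-based / multimodal)
        teacher_weights:     one scalar weight per teacher (illustrative)
        """
        # Supervised loss on the ground-truth emotion labels.
        hard_loss = F.cross_entropy(student_logits, labels)

        # One KL-divergence distillation term per teacher, softened by T.
        soft_losses = []
        for t_logits, w in zip(teacher_logits_list, teacher_weights):
            p_teacher = F.softmax(t_logits / temperature, dim=-1)
            log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
            kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")
            # T^2 scaling keeps soft-target gradients on a comparable scale.
            soft_losses.append(w * kl * temperature ** 2)

        return alpha * hard_loss + (1.0 - alpha) * sum(soft_losses)

Scaling each KL term by the squared temperature is standard distillation practice; it keeps the soft-target gradients comparable in magnitude to the hard-label cross-entropy term when the logits are softened.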

Original language: English
Pages (from-to): 122485-122502
Number of pages: 18
Journal: IEEE Access
Volume: 12
Publication status: Published - 2024

Bibliographical note

Publisher Copyright:
© 2024 The Authors.

Keywords

  • Brain-computer interface
  • cross-modal distillation
  • EEG-based models
  • emotion recognition
  • knowledge distillation
  • multimodal models
  • subject-independent models
  • subject-specific models
  • transfer learning
  • virtual reality
