Speech and Multimodal Interfaces Laboratory

Paper in the international journal Neurocomputing (Q1)

Our laboratory, together with German colleagues from Ulm University, has published an article in the journal Neurocomputing (Scopus, Q1):

Ryumina E., Dresvyanskiy D., Karpov A. In Search of a Robust Facial Expressions Recognition Model: A Large-Scale Visual Cross-Corpus Study // Neurocomputing. 2022. Vol. 514. pp. 435-450

The paper presents the largest visual cross-corpus study, conducted on eight corpora that differ in recording conditions, participants’ appearance characteristics, and complexity of data processing. We propose a visual-based end-to-end emotion recognition framework, which consists of a robust pre-trained backbone model and a temporal sub-system that models temporal dependencies across many video frames. In addition, a detailed analysis of the backbone model's errors and strengths is provided, demonstrating its strong generalization ability. Our results show that the backbone model achieves an accuracy of 66.4% on the AffectNet dataset, outperforming all state-of-the-art results. Moreover, the CNN-LSTM model demonstrates decent efficacy on dynamic visual datasets in cross-corpus experiments, achieving results comparable with the state of the art.