Speech and Multimodal Interfaces Laboratory

Paper in the international journal IEEE Access (Q1)

Our laboratory and colleagues from the Laboratory of Computer-Aided Integrated Systems have published a paper in the IEEE Access journal (Scopus, Q1):

Kashevnik A., Lashkov I., Axyonov A., Ivanko D., Ryumin D., Kolchin A., Karpov A. Multimodal Corpus Design for Audio-Visual Speech Recognition in Vehicle Cabin // IEEE Access, IEEE, 2021, vol. 9, pp. 34986-35003. DOI: 10.1109/ACCESS.2021.3062752

This paper introduces a new methodology, designed with the driver's comfort in mind, for creating in-the-wild multimodal corpora for audio-visual speech recognition in driver monitoring systems. The presented methodology is universal and can be used to record corpora in different languages. Multimodal speech recognition makes it possible to rely on audio data when video data are useless (e.g., at nighttime) and on video data in acoustically noisy conditions (e.g., on highways). In addition, using the developed mobile application, we created the RUSAVIC corpus, which is at the moment a unique audio-visual corpus for the Russian language recorded in in-the-wild conditions.