Speech and Multimodal Interfaces Laboratory

Publications

2024

Bubeev Yu.A., Syrkin L.D., Polyakov A.V., Usov V.M., Karpov A.A., Ivanov A.V. Practice of teleconsultation for evaluation of personal adaptation potential and detection of indications for providing qualified psychological services to employees in remote regions of Russia // Aerospace and Environmental Medicine (Aviakosmicheskaya i Ekologicheskaya Meditsina). 2024. vol. 58. pp. 5-16.

Axyonov A., Ryumin D., Ivanko D., Kashevnik A., Karpov A. Audio-Visual Speech Recognition In-The-Wild: Multi-Angle Vehicle Cabin Dataset and Attention-Based Approach // In Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 2024. pp. 8195-8199.

Ryumina E., Ryumin D., Karpov A. OCEAN-AI: Open Multimodal Framework for Personality Traits Assessment and HR-processes Automatization // In Proc. of INTERSPEECH. 2024. pp. 3630-3631.

Ivanko D., Ryumin D., Axyonov A., Kashevnik A., Karpov A. OpenAV: Bilingual Dataset for Audio-Visual Voice Control of a Computer for Hand Disabled People // Lecture Notes in Computer Science, SPECOM-2024. 2024. vol. 15299. pp. 163-173.

Kipyatkova I., Kagirov I., Dolgushin M., Rodionova A. Towards a Livvi-Karelian End-to-End ASR System // Lecture Notes in Computer Science, SPECOM-2024. 2024. vol. 15299. pp. 57-68.

Guseva D., Mitrofanova O., Dolgushin M. Human and Machine Keyphrase Perception in Russian Text and Speech // Lecture Notes in Computer Science, SPECOM-2024. 2024. vol. 15299. pp. 265-280.

Kosulin K., Karpov A. A Survey of Masked Face Recognition Methods and Corpora/Data // Springer Geography. IMS-2022. 2024. pp. 27-37.

Ivanko D., Ryumin D., Markitantov M. End-to-End Visual Speech Recognition for Human-Robot Interaction // In Proc. of the AIP Conference. 2024. vol. 3021. pp. 82-90.

Dvoynikova A.A., Karpov A.A. Method of creating multimodal databases for audiovisual analysis of engagement and emotions of virtual communication participants // Journal of Instrument Engineering. 2024. vol. 67. no. 11. pp. 984-993.

2023

Ryumina E., Ryumin D., Markitantov M., Kaya H., Karpov A. Multimodal Personality Traits Assessment (MuPTA) Corpus: The Impact of Spontaneous and Read Speech // In Proc. of the 24th International Conference INTERSPEECH-2023. 2023. pp. 4049-4053.

Karpov A., Samudravijaya K., Deepak K.T., Hegde R.M., Agrawal S.S., Prasanna S.R.M. SPECOM 2023 Preface. Lecture Notes in Computer Science // In Proc. of the 25th International Conference on Speech and Computer SPECOM-2023. LNAI. 2023. vol. 14338/14339.

Ivanko D., Ryumin D., Karpov A. A Review of Recent Advances on Deep Learning Methods for Audio-Visual Speech Recognition // Mathematics. 2023. vol. 11(12). no. 2665.

Kipyatkova I., Kagirov I. Deep Models for Low-Resourced Speech Recognition: Livvi-Karelian Case // Mathematics. 2023. vol. 11(18). no. 3814.

Ryumin D., Ryumina E., Ivanko D. EMOLIPS: Towards Reliable Emotional Speech Lip-Reading // Mathematics. 2023. vol. 11(23). no. 4787.

Ryumina E., Markitantov M., Karpov A. Multi-Corpus Learning for Audio–Visual Emotions and Sentiment Recognition // Mathematics. 2023. vol. 11(16). no. 3519.