Speech and Multimodal Interfaces Laboratory

Publications

2023

Ryumina E., Ryumin D., Markitantov M., Kaya H., Karpov A. Multimodal Personality Traits Assessment (MuPTA) Corpus: The Impact of Spontaneous and Read Speech // In Proc. of the 24th International Conference INTERSPEECH-2023. 2023. pp. 4049–4053.
Karpov A., Samudravijaya K., Deepak K.T., Hegde R.M., Agrawal S.S., Prasanna S.R.M. SPECOM 2023 Preface // In Proc. of the 25th International Conference on Speech and Computer SPECOM-2023. Lecture Notes in Computer Science. LNAI. 2023. vol. 14338/14339.
Ivanko D., Ryumin D., Karpov A. A Review of Recent Advances on Deep Learning Methods for Audio-Visual Speech Recognition // Mathematics. 2023. vol. 11(12). no. 2665.
Kipyatkova I., Kagirov I. Deep Models for Low-Resourced Speech Recognition: Livvi-Karelian Case // Mathematics. 2023. vol. 11(18). no. 3814.
Ryumin D., Ryumina E., Ivanko D. EMOLIPS: Towards Reliable Emotional Speech Lip-Reading // Mathematics. 2023. vol. 11(23). no. 4787.
Ryumina E., Markitantov M., Karpov A. Multi-Corpus Learning for Audio–Visual Emotions and Sentiment Recognition // Mathematics. 2023. vol. 11(16). no. 3519.
Ryumin D., Ivanko D., Ryumina E. Audio-Visual Speech and Gesture Recognition by Sensors of Mobile Devices // Sensors. 2023. vol. 23(4). no. 2284.
Axyonov A.A., Ryumina E.V., Ryumin D.A., Ivanko D.V., Karpov A.A. Neural network-based method for visual recognition of driver’s voice commands using attention mechanism // Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2023. vol. 23. no. 4. pp. 767–775.
Kipyatkova I., Kagirov I. Automatic speech recognition system for Karelian // Information and Control Systems. 2023. vol. 3. pp. 16–25.
Velichko A., Karpov A. An approach and software system for integral analysis of destructive paralinguistic phenomena in colloquial speech // Information and Control Systems. 2023. vol. 4. pp. 2–11.
Dvoynikova A., Karpov A. Bimodal Sentiment and Emotion Classification with Multi-Head Attention Fusion of Acoustic and Linguistic Information // Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference «Dialogue 2023». 2023. vol. 22. pp. 51–61.
Ryumin D., Ivanko D., Axyonov A. Cross-Language Transfer Learning Using Visual Information for Automatic Sign Gesture Recognition // The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2023. vol. 48. pp. 209–216.
Ryumina E., Karpov A. Impact of Visual Modalities in Multimodal Personality and Affective Computing // The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2023. vol. 48. pp. 217–224.
Ivanko D., Ryumina E., Ryumin D. Improved Automatic Lip-Reading Based on the Evaluation of Intensity Level of Speaker’s Emotion // The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2023. vol. 48. pp. 89–94.
Ivanko D., Ryumina E., Ryumin D., Axyonov A., Kashevnik A., Karpov A. EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition // In Proc. of the 25th International Conference on Speech and Computer SPECOM-2023. Lecture Notes in Computer Science. LNAI. 2023. vol. 14338. pp. 18–31.