Speech and Multimodal Interfaces Laboratory

Publications

2023

Ryumin D., Ivanko D., Ryumina E. Audio-Visual Speech and Gesture Recognition by Sensors of Mobile Devices // Sensors. 2023. vol. 23(4). no. 2284.
Axyonov A.A., Ryumina E.V., Ryumin D.A., Ivanko D.V., Karpov A.A. Neural network-based method for visual recognition of driver’s voice commands using attention mechanism // Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2023. vol. 23. no. 4. pp. 767–775.
Kipyatkova I., Kagirov I. Automatic speech recognition system for Karelian // Information and Control Systems. 2023. vol. 3. pp. 16–25.
Velichko A., Karpov A. An approach and software system for integral analysis of destructive paralinguistic phenomena in colloquial speech // Information and Control Systems. 2023. vol. 4. pp. 2–11.
Dvoynikova A., Karpov A. Bimodal Sentiment and Emotion Classification with Multi-Head Attention Fusion of Acoustic and Linguistic Information // Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference «Dialogue 2023». 2023. vol. 22. pp. 51–61.
Ryumin D., Ivanko D., Axyonov A. Cross-Language Transfer Learning Using Visual Information for Automatic Sign Gesture Recognition // The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2023. vol. 48. pp. 209–216.
Ryumina E., Karpov A. Impact of Visual Modalities in Multimodal Personality and Affective Computing // The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2023. vol. 48. pp. 217–224.
Ivanko D., Ryumina E., Ryumin D. Improved Automatic Lip-Reading Based on the Evaluation of Intensity Level of Speaker’s Emotion // The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2023. vol. 48. pp. 89–94.
Ivanko D., Ryumina E., Ryumin D., Axyonov A., Kashevnik A., Karpov A. EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition // In Proc. of the 25th International Conference on Speech and Computer SPECOM-2023. Lecture Notes in Computer Science. LNAI. 2023. vol. 14338. pp. 18–31.
Kipyatkova I., Kagirov I. Phone Durations Modeling for Livvi-Karelian ASR // In Proc. of the 25th International Conference on Speech and Computer SPECOM-2023. Lecture Notes in Computer Science. LNAI. 2023. vol. 14339. pp. 87–99.
Karpov A., Dvoynikova A., Ryumina E. Intelligent Interfaces and Systems for Human-Computer Interaction // In Proc. of the 7th International Scientific Conference “Intelligent Information Technologies for Industry” IITI-2023. Lecture Notes in Networks and Systems. pp. 3–13.
Kipyatkova I.S., Rodionova A.P., Kagirov I.A., Krizhanovsky A.A. Speech and text data preparation for developing an automatic speech recognition system for the Karelian language // In Proc. of Petrozavodsk State University. 2023. vol. 45(5). pp. 89–98.
Dvoynikova A.A., Kondratenko K.K. Approach to automatic recognition of emotions in speech transcriptions // Journal of Instrument Engineering. 2023. vol. 66. no. 10. pp. 818–827.
Kagirov I. An analytical survey of gesture information registration systems in the context of aerospace research // Aerospace Instrument-Making. 2023. no. 10. pp. 35–46.
Povolotskaia A.A., Evdokimova V.V., Skrelin P.A. Recording and evaluation of speech data set for negative emotions recognition in speech // Terra Linguistica. 2023. vol. 14(2). pp. 59–76.