Speech and Multimodal Interfaces Laboratory

Publications

2024

Dresvyanskiy D., Markitantov M., Yu J., Kaya H., Karpov A. Multi-modal Arousal and Valence Estimation under Noisy Conditions // IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). 2024. pp. 4773-4783.

Ryumina E., Markitantov M., Ryumin D., Kaya H., Karpov A. Zero-Shot Audio-Visual Compound Expression Recognition Method based on Emotion Probability Fusion // IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). 2024. pp. 4752-4760.

Dresvyanskiy D., Karpov A., Minker W. A Cross-Multi-modal Fusion Approach for Enhanced Engagement Recognition // Lecture Notes in Computer Science, SPECOM-2024. 2024. vol. 15300. pp. 3-17.

Mamontov D., Zepf S., Karpov A., Minker W. Cross-Cultural Automatic Depression Detection Based on Audio Signals // Lecture Notes in Computer Science, SPECOM-2024. 2024. vol. 15299. pp. 309-323.

Karpov A., Delić V. SPECOM 2024 Preface // Proc. 26th International Conference on Speech and Computer (SPECOM). 2024. vol. 15299 / 15300. pp. v-vi.

Ryumin D., Axyonov A., Ryumina E., Ivanko D., Kashevnik A., Karpov A. Audio–visual speech recognition based on regulated transformer and spatio–temporal fusion strategy for driver assistive systems // Expert Systems with Applications. 2024. vol. 252. ID 124159.

Ryumina E., Markitantov M., Ryumin D., Karpov A. OCEAN-AI framework with EmoFormer cross-hemiface attention approach for personality traits assessment // Expert Systems with Applications. 2024. vol. 239. ID 122441.

Ryumina E., Markitantov M., Ryumin D., Karpov A. Gated Siamese Fusion Network based on multimodal deep and hand-crafted features for personality traits assessment // Pattern Recognition Letters. 2024. vol. 185. pp. 45-51.

Othman W., Kashevnik A., Ali A., Shilov N., Ryumin D. Remote Heart Rate Estimation Based on Transformer with Multi-Skip Connection Decoder: Method and Evaluation in the Wild // Sensors. 2024. vol. 24. ID 775.

Dvoynikova A., Kagirov I., Karpov A. A Method for Recognition of Sentiment and Emotions in Russian Speech Transcripts Using Machine Translation // Informatics and Automation. 2024. vol. 23. no. 4. pp. 1173-1198.

Povolotskaia A., Karpov A. Analytical Review of Methods for Automatic Analysis of Extra-Linguistic Units in Spontaneous Speech // Informatics and Automation. 2024. vol. 23. no. 1. pp. 5-38.

Ivanko D.V., Ryumin D.A. Automatic sign language translation: a review of neural network methods for recognition and synthesis of spoken and signed language // Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2024. vol. 24. no. 5. pp. 669-686.

Uzdiaev M.Yu., Karpov A.A. Creation and analysis of multimodal corpus for aggressive behavior recognition // Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2024. vol. 24. no. 5. pp. 834-842.

Velichko A., Karpov A. An approach for depression recognition by speech using a semi-automatic data annotation // Information and Control Systems. 2024. no. 4. pp. 2-11.

Kapusta K., Kipyatkova I., Kagirov I. Analytical survey of transformer-based end-to-end speech recognition models and strategies // Information and Control Systems. 2024. no. 5. pp. 2-15.