Speech and Multimodal Interfaces Laboratory

Publications

2019

Verkholyak O., Fedotov D., Kaya H., Zhang Y., Karpov A. Hierarchical Two-Level Modelling of Emotional States in Spoken Dialog Systems. In Proc. 44th IEEE International Conference on Acoustics, Speech, and Signal Processing ICASSP-2019, Brighton, UK, 2019, pp. 6700-6704.
Kaya H., Fedotov D., Dresvyanskiy D., Doyran M., Mamontov D., Markitantov M., Akdag Salah A., Kavcar E., Karpov A., Salah A.A. Predicting depression and emotions in the cross-roads of cultures, para-linguistics, and non-linguistics. In Proc. 9th ACM International Audio/Visual Emotion Challenge and Workshop AVEC’19, Nice, France, 2019, ACM, New York, NY, USA, 9 pages.
Ryumin D., Ivanko D., Kagirov I., Axyonov A., Karpov A., Zelezny M. Human-Robot Interaction with Smart Shopping Trolley using Sign Language: Data Collection. In Proc. 2019 IEEE International Conference on Pervasive Computing and Communications Workshops, PerCom Workshops 2019, Kyoto, Japan, 2019, pp. 949-954.
Akhtiamov O., Siegert I., Karpov A., Minker W. Cross-Corpus Data Augmentation for Acoustic Addressee Detection. In Proc. 20th Annual SIGdial Meeting on Discourse and Dialogue SIGDIAL-2019, Stockholm, Sweden, 2019, pp. 274-283.
Fedotov D., Kim B., Karpov A., Minker W. Time-Continuous Emotion Recognition Using Spectrogram Based CNN-RNN Modelling // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 93-102.
Yu J., Markov K., Karpov A. Speaking Style Based Apparent Personality Recognition // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 540-548.
Verkholyak O.V., Kaya H., Karpov A.A. Modeling short-term and long-term dependencies of the speech signal for paralinguistic emotion classification // SPIIRAS Proceedings, Issue 62, № 1, 2019, pp. 30-56.
Ivanko D.V., Ryumin D.A., Karpov A.A., Zhelezny M. Investigation of the influence of high-speed video data on the recognition accuracy of audiovisual speech // Informatsionno-Upravliaiushchie Sistemy [Information and Control Systems], No. 2, 2019, pp. 26-34.
Fedotov D.V., Verkholyak O.V., Karpov A.A. Contextual continuous recognition of emotions in Russian speech using recurrent neural networks. Proceedings of the 8th Interdisciplinary Seminar “Analysis of Conversational Russian Speech” AR3-2019, St. Petersburg, St. Petersburg State University, 2019, pp. 96-99.
Ryumin D., Kagirov I., Ivanko D., Axyonov A., Karpov A. Automatic detection and recognition of 3D manual gestures for human-machine interaction // International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives 42(2/W12), 2019, pp. 179-183.
Ivanko D., Ryumin D., Karpov A. Automatic lip-reading of hearing impaired people // International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives 42(2/W12), 2019, pp. 97-101.
Kipyatkova I. LSTM-Based Language Models for Very Large Vocabulary Continuous Russian Speech Recognition System // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 219-226.
Markovnikov N., Kipyatkova I. Investigating Joint CTC-Attention Models for End-to-End Russian Speech Recognition // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 337-347.
Markitantov M., Verkholyak O. Automatic Recognition of Speaker Age and Gender Based on Deep Neural Networks // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 327-336.
Kagirov I., Ryumin D., Axyonov A. Method for Multimodal Recognition of One-Handed Sign Language Gestures Through 3D Convolution and LSTM Neural Networks // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 191-200.