Speech and Multimodal Interfaces Laboratory

Publications

2020

Dvoynikova A., Verkholyak O., Karpov A. Sentiment analysis of spoken language using a method based on tonal dictionaries // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 75-80.

Ryumina E. A method for extracting informative video features for emotion recognition // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 151-155.

Axyonov A., Ryumina E. Analytical review of modern methods of face detection // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 12-19.

Markitantov M. Analytical survey of audiovisual speech corpora for automatic speaker age recognition // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 124-128.

Verkholyak O., Karpov A. Chapter 4 "Automatic analysis of emotionally-colored speech" in the monograph "Child speech portrait with typical and atypical development" / Lyakso E., Frolova O., Grechaniy S., Matveev Yu., Verkholyak O., Karpov A. / St. Petersburg: Publishing and Printing Association of Higher Educational Institutions, 2020, 204 p. ISBN 978-5-91155-096-7.

Ivanko D., Ryumin D., Kipyatkova I., Axyonov A., Karpov A. Lip-reading Using Pixel-based and Geometry-based Features for Multimodal Human-Robot Interfaces // Smart Innovation, Systems and Technologies, Springer, vol. 154, Zavalishin’s Readings 2019, 2020, pp. 477-486.

Ryumin D., Ivanko D., Kagirov I., Axyonov A., Karpov A. Vision-Based Assistive Systems for Deaf and Hearing Impaired People // In: Favorskaya M., Jain L. (eds) Computer Vision in Advanced Control Systems-5, Intelligent Systems Reference Library, Springer, vol. 175, 2020, pp. 197-224.

2019

Verkholyak O., Fedotov D., Kaya H., Zhang Y., Karpov A. Hierarchical Two-Level Modelling of Emotional States in Spoken Dialog Systems. In Proc. 44th IEEE International Conference on Acoustics, Speech, and Signal Processing ICASSP-2019, Brighton, UK, 2019, pp. 6700-6704.

Kaya H., Fedotov D., Dresvyanskiy D., Doyran M., Mamontov D., Markitantov M., Akdag Salah A., Kavcar E., Karpov A., Salah A.A. Predicting depression and emotions in the cross-roads of cultures, para-linguistics, and non-linguistics. In Proc. 9th ACM International Audio/Visual Emotion Challenge and Workshop AVEC’19, Nice, France, 2019, ACM, New York, NY, USA, 9 pages.

Ryumin D., Ivanko D., Kagirov I., Axyonov A., Karpov A., Zelezny M. Human-Robot Interaction with Smart Shopping Trolley using Sign Language: Data Collection. In Proc. 2019 IEEE International Conference on Pervasive Computing and Communications Workshops, PerCom Workshops 2019, Kyoto, Japan, 2019, pp. 949-954.

Akhtiamov O., Siegert I., Karpov A., Minker W. Cross-Corpus Data Augmentation for Acoustic Addressee Detection. In Proc. 20th ACL International Conference on Discourse and Dialogue SIGDial-2019, Stockholm, Sweden, 2019, pp. 274-283.

Fedotov D., Kim B., Karpov A., Minker W. Time-Continuous Emotion Recognition Using Spectrogram Based CNN-RNN Modelling // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 93-102.

Yu J., Markov K., Karpov A. Speaking Style Based Apparent Personality Recognition // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 540-548.

Verkholyak O.V., Kaya H., Karpov A.A. Modeling short-term and long-term dependencies of the speech signal for paralinguistic emotion classification // SPIIRAS Proceedings, Issue 62, No. 1, 2019, pp. 30-56.

Ivanko D.V., Ryumin D.A., Karpov A.A., Zelezny M. Investigation of the influence of high-speed video data on the recognition accuracy of audiovisual speech // Informatsionno-Upravliaiushchie Sistemy [Information and Control Systems], No. 2, 2019, pp. 26-34.