Speech and Multimodal Interfaces Laboratory

Kipyatkova I., Markovnikov N. A Study of Methods for Improving End-to-End Speech Recognition System at Lack of Training Data // Proceedings of III All-Russian Acoustic Conference, St. Petersburg, 2020, pp. 361-367.
Axyonov A., Ryumin D., Kagirov I., Ivanko D., Karpov A. A technique for hand landmarks detection for contactless gesture-based human-machine interaction // Proceedings of 31st International Scientific and Technological Conference "Extreme Robotics", St. Petersburg, 2020, pp. 34-36.
Mikhajlyuk M., Karpov A., Kryuchkov B., Usov V., Dovzhenko V. Voice control of service robots under conditions of possible limitations of human motor functions in space flight // Proceedings of the XII All-Russian Scientific-Technical Conference "Robotics and Artificial Intelligence", 2020, pp. 197-201.
Dvoynikova A., Verkholyak O., Karpov A. Sentiment analysis of spoken language using a method based on tonal dictionaries // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 75-80.
Ryumina E. A method for extracting informative video features for emotion recognition // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 151-155.
Axyonov A., Ryumina E. Analytical review of modern methods of face detection // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 12-19.
Markitantov M. Analytical survey of audiovisual speech corpora for automatic speaker’s age recognition // Almanac of scientific works of young scientists of ITMO University. 2020, vol. 3, pp. 124-128.
Verkholyak O., Karpov A. Chapter 4 "Automatic analysis of emotionally-colored speech" in the monograph "Child speech portrait with typical and atypical development" / Lyakso E., Frolova O., Grechaniy S., Matveev Yu., Verkholyak O., Karpov A. / St. Petersburg: Publishing and Printing Association of Higher Educational Institutions, 2020, 204 p. ISBN 978-5-91155-096-7.
Ivanko D., Ryumin D., Kipyatkova I., Axyonov A., Karpov A. Lip-reading Using Pixel-based and Geometry-based Features for Multimodal Human-Robot Interfaces // Smart Innovation, Systems and Technologies, Springer, vol. 154, Zavalishin’s Readings 2019, 2020, pp. 477-486.
Ryumin D., Ivanko D., Kagirov I., Axyonov A., Karpov A. Vision-Based Assistive Systems for Deaf and Hearing Impaired People // In: Favorskaya M., Jain L. (eds) Computer Vision in Advanced Control Systems-5, Intelligent Systems Reference Library, Springer, vol. 175, 2020, pp. 197-224.

Verkholyak O., Fedotov D., Kaya H., Zhang Y., Karpov A. Hierarchical Two-Level Modelling of Emotional States in Spoken Dialog Systems. In Proc. 44th IEEE International Conference on Acoustics, Speech, and Signal Processing ICASSP-2019, Brighton, UK, 2019, pp. 6700-6704.
Kaya H., Fedotov D., Dresvyanskiy D., Doyran M., Mamontov D., Markitantov M., Akdag Salah A., Kavcar E., Karpov A., Salah A.A. Predicting depression and emotions in the cross-roads of cultures, para-linguistics, and non-linguistics. In Proc. 9th ACM International Audio/Visual Emotion Challenge and Workshop AVEC’19, Nice, France, 2019, ACM, New York, NY, USA, 9 pages.
Ryumin D., Ivanko D., Kagirov I., Axyonov A., Karpov A., Zelezny M. Human-Robot Interaction with Smart Shopping Trolley using Sign Language: Data Collection. In Proc. 2019 IEEE International Conference on Pervasive Computing and Communications Workshops, PerCom Workshops 2019, Kyoto, Japan, 2019, pp. 949-954.
Akhtiamov O., Siegert I., Karpov A., Minker W. Cross-Corpus Data Augmentation for Acoustic Addressee Detection. In Proc. 20th ACL International Conference on Discourse and Dialogue SIGDial-2019, Stockholm, Sweden, 2019, pp. 274-283.
Fedotov D., Kim B., Karpov A., Minker W. Time-Continuous Emotion Recognition Using Spectrogram Based CNN-RNN Modelling // Lecture Notes in Computer Science, Springer LNAI 11658, SPECOM 2019, 2019, pp. 93-102.