Speech and Multimodal Interfaces Laboratory

Publications

2022

Ryumina E.V. An analytical review of corpora for automatic assessment of human psychophysical states // Almanac of scientific works of young scientists of ITMO University. 2022. Vol. 2. pp. 363-365.

Ryumina E.V., Ivanko D. A review of multimodal corpora for studying the effect of speaker's emotional state on automatic lip-phrase recognition // Almanac of scientific works of young scientists of ITMO University. 2022. Vol. 2. pp. 366-369.

Rolinsky S., Dvoynikova A. An analytical review of methods for extracting textual transcriptions from speech utterances // Almanac of scientific works of young scientists of ITMO University. 2022. Vol. 2. pp. 336-340.

Ryumina E.V. Method of intellectual estimation of personal qualities of a human personality by visual data // Proceedings of the XI Congress of Young Scientists of ITMO University. 2022. Vol. 2. pp. 127-131.

Dvoynikova A. Recognizing speaker engagement using mel-spectrogram analysis // Proceedings of the XI Congress of Young Scientists of ITMO University. 2022. Vol. 2. pp. 38-42.

2021

Kipyatkova I., Karpov A., Kuleshov S., Zaytseva A. Methods and Models for automatic speech recognition. Educational Manual - SPb: SPC RAS. 2021. 116 p.

Kuleshov S., Zaytseva A., Aksenov A., Karpov A., Kipyatkova I., Vatamaniuk I. 3D technologies in state-of-the-art information systems: theory and practice. Educational-Methodical Manual - SPb: SPC RAS. 2021. 83 p.

Verkholyak O., Dvoynikova A., Karpov A. A Bimodal Approach for Speech Emotion Recognition using Audio and Text // Journal of Internet Services and Information Security (JISIS). Korea. 2021. Vol. 11(1). pp. 80-96.

Bojanić M., Delić V., Karpov A. Influence of Emotion Distribution and Classification on a Call Processing for an Emergency Call Center // Telfor Journal. Serbia. 2021. Vol. 13(2). pp. 75-80.

Verkholyak O., Dresvyanskiy D., Dvoynikova A., Kotov D., Ryumina E., Velichko A., Mamontov D., Minker W., Karpov A. Ensemble-Within-Ensemble Classification for Escalation Prediction from Speech // In Proc. International Conference INTERSPEECH-2021. ISCA. Brno, Czechia. 2021. pp. 481-485.

Dresvyanskiy D., Minker W., Karpov A. Deep Learning Based Engagement Recognition in Highly Imbalanced Data // In Proc. SPECOM 2021. Lecture Notes in Computer Science, Springer. Vol. 12997. 2021. pp. 166-178.

Gruber I., Hrúz M., Železný M., Karpov A. X-Bridge: Image-to-Image Translation with Reconstruction Capabilities // In Proc. SPECOM 2021. Lecture Notes in Computer Science, Springer. Vol. 12997. 2021. pp. 238-249.

Dresvyanskiy D., Siegert I., Karpov A., Minker W. Engagement Recognition Using Audio Channel Only // In Proc. 1st AI-Debate Workshop: establishing An InterDisciplinary pErspective on speech-based Technology. Magdeburg, Germany. 2021. pp. 19-22.

Kashevnik A., Lashkov I., Axyonov A., Ivanko D., Ryumin D., Kolchin A., Karpov A. Multimodal Corpus Design for Audio-Visual Speech Recognition in Vehicle Cabin // IEEE Access, IEEE Press. 2021. Vol. 9. pp. 34986-35003.

Kagirov I., Kapustin A., Kipyatkova I., Klyuzhev K., Kudryavcev A., Kudryavcev I., Loskutov Y., Ryumin D., Karpov A. Medical exoskeleton “Remotion” with an intelligent control system: Modeling, implementation, and testing // Simulation Modelling Practice and Theory. Elsevier. 2021. Vol. 107. ID 102200.