Speech and Multimodal Interfaces Laboratory

We received RSF and young scientist grants, and the laboratory was awarded the status of a leading scientific school

We received grants from the Russian Science Foundation (RSF):

RSF No 21-71-00141 «Research and development of new methods and approaches to the task of automatic sign language recognition»

Head: Ryumin Dmitry Alexandrovich
Period: 2021-2023
The purpose of the project is to develop information, mathematical and software support that improves on the results currently available in the automatic recognition of information conveyed by gestures, including sign languages. The system prototype under development will support Russian Sign Language recognition. The results of the project will be important because they will contribute to improving the quality of life of people with disabilities, in particular people with hearing impairments.

RSF No 21-71-00132 «Development and research of End-to-end audio-visual speech recognition system using deep neural networks»

Head: Ivanko Denis Viktorovich
Period: 2021-2023
The main goal of the project is to develop and study an end-to-end system for automatic audio-visual speech recognition using deep neural networks. The objectives of the project are the development and the theoretical and experimental study of mathematical, software and information-linguistic support for an integrated audio-visual speech recognition system. The results obtained in the course of the project will be used for further fundamental research and development in speech technologies, dialogue systems, human-machine interaction and artificial intelligence.

RSF No 22-21-00843 «Automatic speech recognition for under-resourced languages of Russia (on the example of the Karelian language)»

Head: Kipyatkova Irina Sergeevna
Period: 2022-2023
This project aims to develop a prototype system for automatic speech-to-text conversion for the Karelian language. The system under development can be used for machine translation from Karelian to Russian. In addition, automatic speech recognition systems can be used in automatic stenography systems for under-resourced and endangered languages, where they serve as a useful tool for language preservation and research.

RSF No 22-11-00321 «Intelligent system for multimodal recognition of human's affective states»

Head: Karpov Alexey Anatolyevich
Period: 2022-2024
The main goal of this RSF project is to develop and study a new intelligent computer system for multimodal analysis of human behavior that recognizes manifested affective states from a person's audio, video and text data. A unique feature of the system will be its multimodal analysis, i.e. the simultaneous automatic analysis of the user's speech and image, as well as the meaning of their statements, in order to determine various psychoemotional (affective) states, including emotions, sentiment, aggression and depression. The target audience of the system under development will include not only the Russian-speaking population but also other representative groups, regardless of gender, age, race and language. Thus, the study is relevant and large in scope for both Russian and world science.

In addition, we received a Grant of the President and our laboratory was awarded the status of a leading scientific school:

Grant of the President: № MK-42.2022.4 «Investigation of the influence of the speaker's emotional state on the automatic audio-visual speech recognition»

Head: Ivanko Denis Viktorovich
Period: 2022-2023
The main goal of the project is a comprehensive study of how various emotional states of the speaker (such as fear, anger, sadness and happiness) affect the accuracy of automatic speech recognition based on audio and video information. The research results are expected to improve the efficiency (accuracy and reliability) of modern automatic systems for recognizing emotional speech from audio and video information. They will also fill gaps in fundamental knowledge about the impact of a speaker's emotional state on the accuracy of automatic speech recognition.

Scientific school: № NSh-17.2022.1.6 «Software for multimodal analysis of the behavior of participants in virtual communication»

Head: Karpov Alexey Anatolyevich
Period: 2022-2023
The goal of the project is the development and experimental study of mathematical and software support for multimodal analysis of the behavior of participants in virtual communication, as expressed in displays of emotion and in the participants' degree of involvement in communication. The analysis draws on audio and video information (facial expressions and gestures) and uses artificial intelligence methods.