Speech and Multimodal Interfaces Laboratory

We congratulate the members of our laboratory on winning the 2023 competition for grants for young PhDs and young scientists from universities, industry, and academic institutions.

The following grant subsidies for young PhDs and young scientists were awarded by the Committee on Science and Higher Education (CSHE) of the St. Petersburg Government:

  • Ryumin D.A. «DeafMed: Mathematical tools and intelligent system to facilitate communication between healthcare professionals and patients experiencing hearing impairments or difficulties»

    The main goal of this project is to develop an intelligent medical system, "DeafMed", using mathematical tools such as neural network models, methods, and algorithms. The system is intended to improve communication between medical professionals and patients with hearing impairments. The project has the potential to be implemented throughout Russia and to contribute to the advancement of the medical field. Its use of assistive gesture recognition technologies makes it particularly significant for improving communication and ensuring access to quality medical care for all patients, including those with hearing impairments.


  • Ivanko D.V. «Design and research of an automatic system for recognizing user's emotional speech based on audio-visual information processing»

    Digital personal assistants, dialog systems, and mobile speech recognition systems are becoming increasingly common as machine learning and artificial intelligence technologies develop. In the course of this project, fundamentally new scientific results and solutions for audiovisual recognition of emotional speech were obtained, which are expected to have a significant impact on the further development of Russian speech technologies. Potential users of the results include Russian state and commercial organizations. The validity and reliability of the results are confirmed by experimental research, and the main results are presented in a series of scientific publications. The research meets current international standards. As a result of the project, an optimal neural network architecture was selected, the model was trained, and a method for recognizing emotional audiovisual speech was developed.


  • Markitantov M.V. «Development and research of new audiovisual methods for emotion recognition in challenging conditions using neural network technologies based on cross-modal attention»

    Understanding and recognizing emotions is fundamental to human communication, contributing to empathy, social bonding, and effective decision making. In real-world conditions, a person's face may be covered by a mask or by items of clothing, which makes it difficult for systems to correctly recognize human states, including emotions, as the COVID-19 pandemic demonstrated. Masks, which usually cover the mouth and part of the nose, reduce the range of speech and facial characteristics that can be observed and analyzed. Under these conditions it becomes difficult to interpret non-verbal cues and to recognize the emotions, thoughts, and intentions of others. The aim of the research is to develop a novel bimodal method for emotion recognition under partially masked face conditions using neural network models based on cross-modal attention. As a result of the project, a high-performance system for automatic audiovisual emotion recognition in challenging conditions, based on artificial neural networks with cross-modal attention, was developed.


  • Ryumina E.V. «Research and development of mathematical tools and an intelligent system for automatic non-verbal individual personality traits assessment»

    The project introduces mathematical tools (neural network models and methods) and an intelligent system for automatic non-verbal assessment of personality traits. They analyze audio-visual information obtained from individuals to assess five personality traits: openness to experience, conscientiousness, extraversion, agreeableness, and emotional stability. Their distinctive feature is the use of neural network technologies that integrate audio-visual features constructed on different principles (hand-crafted and deep features). Integrated into intelligent personnel management systems, psychological health monitoring, educational trajectory planning, and similar applications, the tools and system developed within the project will enhance human-machine interaction.


  • Dvoynikova A.A. «Development of an approach to multi-task classification of various psychological states of a person»

    The project proposes an approach to multi-task classification of various psychological states of a person. Its aim is to increase the accuracy with which these states are recognized by applying multi-task classification. During the project, an analytical review of existing sources on the subject will be performed; databases for multi-task classification of psychological states will be preprocessed; an approach to multi-task classification of psychological states will be developed; and the approach will be tested on the selected databases. The scientific novelty of the results lies in the fact that the developed system recognizes not one psychological state of a person but several at once, which is consistent with theoretical and practical research in psychology. The practical value of the research is that the approach makes it possible to increase the accuracy of systems that automatically recognize psychological states. Such systems are applicable in various fields, including education, marketing, and monitoring.
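As an illustration of the idea of recognizing several states at once described in the last project, here is a minimal sketch of a multi-task classifier: one shared layer feeds a separate output head per task, so a single pass yields a prediction for every task. This is not the project's actual model. The class `MultiTaskClassifier`, the task names `state_a` and `state_b`, their labels, the layer sizes, and the random untrained weights are all hypothetical and chosen only for demonstration.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply a dense layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def make_layer(n_out, n_in, rng):
    """Create random (untrained) weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskClassifier:
    """Hypothetical model: shared hidden layer, one softmax head per task."""

    def __init__(self, n_features, n_hidden, task_classes, seed=0):
        rng = random.Random(seed)
        self.task_classes = task_classes
        self.shared = make_layer(n_hidden, n_features, rng)
        self.heads = {task: make_layer(len(classes), n_hidden, rng)
                      for task, classes in task_classes.items()}

    def predict_proba(self, features):
        # The shared representation is computed once and reused by all heads.
        hidden = [math.tanh(h) for h in linear(*self.shared, features)]
        return {task: softmax(linear(*head, hidden))
                for task, head in self.heads.items()}

    def predict(self, features):
        # Pick the most probable label independently for each task.
        result = {}
        for task, probs in self.predict_proba(features).items():
            best = max(range(len(probs)), key=probs.__getitem__)
            result[task] = self.task_classes[task][best]
        return result


# Hypothetical tasks and labels, for illustration only.
TASKS = {
    "state_a": ["low", "medium", "high"],
    "state_b": ["absent", "present"],
}

model = MultiTaskClassifier(n_features=4, n_hidden=8, task_classes=TASKS)
sample = [0.2, -0.1, 0.7, 0.05]  # made-up feature vector
probabilities = model.predict_proba(sample)
labels = model.predict(sample)
print(labels)
```

Because the weights are random, the printed labels carry no meaning. The sketch only shows the structure: a representation shared across tasks, with one output distribution per psychological state.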