Speech and Multimodal Interfaces Laboratory

Intelligent system for multimodal recognition of human cognitive disorders

This RSF project addresses the pressing problem of multimodal recognition of cognitive disorders by analyzing conversational speech and visual manifestations of behavior using modern digital signal processing and deep learning methods. The goal of the project is to develop and study an intelligent computer system for multimodal analysis of human behavior that recognizes cognitive disorders (in conditions such as Alzheimer's disease, Parkinson's disease, dementia, and depression) from audio, video, and text data, in order to improve the efficiency and speed of non-contact diagnosis. Research on the automatic diagnosis of speech and multimodal manifestations of cognitive disorders is a highly sought-after interdisciplinary application of the latest information technologies and artificial intelligence in healthcare and human well-being. The demand stems from the prospect of using artificial intelligence methods for timely, remote, and equipment-light medical diagnostics, which is especially important for people whose mobility is limited by age or health, or who live in remote areas and cannot attend an in-person appointment with a specialist. Such research must meet the high accuracy standards expected by users and specialists, as well as ethical requirements, which is why developing new effective, reliable, and explainable AI methods whose decisions can be interpreted is particularly relevant and significant.

During the project, the team plans to develop and study new models and improve existing ones, together with methods, algorithms, and software solutions for comprehensive multimodal recognition of human cognitive disorders. In particular, the project will address pressing problems of augmenting audiovisual training data in various languages (English, Greek, etc.). It will explore how to obtain new language-independent feature sets and apply them to Russian-language data using expert approaches, neural network approaches, and large language models. Approaches to machine classification (presence or absence of pathology) and regression (determining the severity of the disease) for the cognitive disorders in question will also be investigated, as will approaches to ensuring the explainability of neural network features and probabilistic models of cognitive disorders. The main result of the project should be a prototype of an intelligent expert system for automatic recognition of human cognitive disorders based on comprehensive multimodal analysis of the acoustic characteristics of the voice, the visual characteristics of a person's facial expressions and gestures, and the linguistic content of their spoken utterances. The results are expected to meet current requirements and standards in this field and to be on par with the international state of the art.
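The distinction between classification (presence or absence of pathology) and regression (severity) over several modalities can be illustrated with a minimal late-fusion sketch. This is not the project's method: the weights, the threshold, the function names, and the assumption that each modality already yields a single number are all hypothetical and chosen only for illustration.

```python
# Illustrative late fusion of per-modality outputs (hypothetical example,
# not the system described in the project).

# Assumed modality weights; the project description does not specify any.
WEIGHTS = {"audio": 0.4, "video": 0.3, "text": 0.3}


def fuse(values, weights=WEIGHTS):
    """Weighted average of per-modality values (one float per modality)."""
    total = sum(weights[m] for m in values)
    return sum(weights[m] * v for m, v in values.items()) / total


def classify(probabilities, threshold=0.5):
    """Classification: fuse per-modality pathology probabilities in [0, 1]
    and report presence (True) or absence (False) of pathology."""
    return fuse(probabilities) >= threshold


def estimate_severity(severities):
    """Regression: fuse per-modality severity estimates on a shared scale."""
    return fuse(severities)


if __name__ == "__main__":
    # Made-up numbers standing in for the outputs of three unimodal models.
    probs = {"audio": 0.8, "video": 0.6, "text": 0.4}
    print(classify(probs))
    print(estimate_severity({"audio": 2.0, "video": 1.0, "text": 3.0}))
```

In this sketch the same fusion step serves both tasks; only the interpretation of the fused value differs (a thresholded probability versus a continuous severity score). A modality that is missing for a given recording can simply be left out of the input dictionary, since the weights are renormalized over the modalities present.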
The practical, scientific, and technical significance of the project's tasks is confirmed by the high demand for the technologies under development in the market for speech and multimodal expert technologies for healthcare and human well-being, as well as by the large number of international publications on this problem in leading scientific journals and conference proceedings. The proposed system will be unique in combining comprehensive multimodal detection of the cognitive disorders in question from speech, the use of new sets of analyzed features, and multi-level methods that take into account the interdependencies between these disorders.

Project head:
Number: No. 25-11-00319
Period: 2025-2027
Funding: Russian Science Foundation