Speech and Multimodal Interfaces Laboratory

Analysis of Voice and Facial Features of a Human in a Mask

With the unexpected emergence and rapid global spread of the COVID-19 coronavirus pandemic, monitoring the safety of individuals and of society as a whole has become an urgent task in the new world of social distancing and "mask" culture. In recent years, wearing protective face masks in public places from time to time has become entirely familiar and commonplace for many residents of densely populated Asian countries (Japan, Singapore, Malaysia, China, etc.). In this way they protected themselves from people with possible respiratory diseases, air pollution and allergens. This mask culture, together with strict observance of quarantine requirements by the population of these Asian countries, became the main guarantee of halting the spread of COVID-19. In recent months, masks have also become an element of European culture and even fashion, firmly entering our dress code. Now and in the coming years, there is an urgent need for automated verification that people who are in public places, or who are in contact with infected people or those at risk of infection, are wearing a protective mask. This RFBR project proposes to develop and study a new software system for automatic bimodal analysis of the voice and facial characteristics of a masked person.

A number of fundamentally new scientific and technical results will be obtained during the two-year research project: (1) new infoware, namely a bimodal Russian-language database (corpus) containing multi-angle images of people's faces in various types of protective masks, as well as audio recordings of dozens of native Russian speakers wearing masks, including disposable medical masks of various densities, reusable fabric masks of various colors with and without printed designs, special respirators and other means of protecting the mucous surfaces of the face; (2) new methods and models for automatically analyzing people's voice characteristics from speech, including detection of a protective mask while speaking, cough detection, estimation of the likelihood of a respiratory disease, etc.; (3) new methods and models for analyzing people's facial characteristics from video data, including detection of the presence or absence of a protective mask on the face and analysis of the biometric characteristics of the uncovered part of the face (the upper part of the head); (4) a prototype software system for automatic bimodal analysis of the voice and facial characteristics of a person in a mask.

The results of these studies, which are based on modern artificial intelligence technologies, can be directly used to combat the spread of viral epidemics (coronaviruses, including COVID-19, influenza viruses, and other highly pathogenic viruses that may appear in the future) both in Russia and around the world.

Project's head
No. 20-04-60529-viruses
Russian Foundation for Basic Research (RFBR)