Speech and Multimodal Interfaces Laboratory



Kipyatkova I. Improving Russian LVCSR Using Deep Neural Networks for Acoustic and Language Modeling. In Proc. 20th International Conference on Speech and Computer SPECOM-2018, Leipzig, Germany, Springer, LNAI vol. 11096, 2018, pp. 291-300.
Hlaváč M., Gruber I., Železný M., Karpov A. LipsID using 3D Convolutional Neural Network. In Proc. 20th International Conference on Speech and Computer SPECOM-2018, Leipzig, Germany, Springer, LNAI vol. 11096, 2018, pp. 209-214.
Velichko A., Budkov V., Kagirov I., Karpov A. Comparative Analysis of Classification Methods for Automatic Deception Detection in Speech. In Proc. 20th International Conference on Speech and Computer SPECOM-2018, Leipzig, Germany, Springer, LNAI vol. 11096, 2018, pp. 737-746.
Fedotov D., Kaya H., Karpov A. Context Modeling for Cross-Corpus Dimensional Acoustic Emotion Recognition: Challenges and Mixup. In Proc. 20th International Conference on Speech and Computer SPECOM-2018, Leipzig, Germany, Springer, LNAI vol. 11096, 2018, pp. 155-165.
Kaya H., Fedotov D., Yesilkanat A., Verkholyak O., Zhang Y., Karpov A. LSTM based Cross-corpus and Cross-task Acoustic Emotion Recognition. In Proc. 19th International Conference INTERSPEECH-2018, Hyderabad, India, ISCA, 2018, pp. 521-525.
Vatamaniuk I.V., Budkov V.Y., Kipyatkova I.S., Karpov A.A. Methods and Algorithms of Audio-Video Signal Processing for Analysis of Indoor Human Activity. In: Favorskaya M., Jain L. (eds.) Computer Vision in Control Systems-4. Intelligent Systems Reference Library, vol. 136. Springer, 2018, pp. 139-173.
Verkhodanova V.O., Shapranov V.V., Kipyatkova I.S., Karpov A.A. Automatic detection of vocalized hesitations in Russian speech. Voprosy Jazykoznanija, 2018, No. 6, pp. 104–118. (in Russian)
Ivanko D.V., Fedotov D.V., Karpov A.A. Accuracy increase for automatic visual Russian speech recognition: viseme classes optimization. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2018, vol. 18, no. 2, pp. 346–349.
Markovnikov N.M., Kipyatkova I.S. An Analytic Survey of End-to-End Speech Recognition Systems // SPIIRAS Proceedings. 2018. Issue 3(58). pp. 77-110.
Karpov A., Mporas I. Speech Communication Integrated with Other Modalities (Editorial) // Journal on Multimodal User Interfaces, Springer, Vol. 12, No. 4, 2018, pp. 271-272.
Karpov A.A., Yusupov R.M. Multimodal Interfaces of Human-Computer Interaction // Herald of the Russian Academy of Sciences, Springer, Vol. 88, No. 1, 2018, pp. 67-74.
Ivanko D., Karpov A., Fedotov D., Kipyatkova I., Ryumin D., Ivanko Dm., Minker W., Zelezny M. Multimodal Speech Recognition: Increasing Accuracy using High Speed Video Data // Journal on Multimodal User Interfaces, Springer, Vol. 12, No. 4, 2018, pp. 319-328.
Karpov A.A., Yusupov R.M. Multimodal Interfaces of Human-Computer Interaction // Herald of the Russian Academy of Sciences, Springer, Vol. 88, No. 2, 2018, pp. 146-155.
Kaya H., Karpov A. Efficient and Effective Feature Normalization Strategies for Cross-Corpus Acoustic Emotion Recognition // Neurocomputing. Elsevier, Vol. 275, 2018, pp. 1028-1034.


Kipyatkova I. Development and research of neural network hybrid acoustic models for the Russian speech recognition system. Materials of the XXII St. Petersburg Assembly of Young Scientists and Specialists, 2017, p. 201.