Speech and Multimodal Interfaces Laboratory

Publications

2016

Ivanko D., Karpov A. An Analysis of Perspectives for Using High-Speed Cameras in Processing Dynamic Video Information // SPIIRAS Proceedings, Vol. 44, No. 1, 2016, pp. 98-113. (VAK; RSCI impact factor – 0.359).
Kipyatkova I., Karpov A. Variants of Deep Artificial Neural Networks for Speech Recognition Systems // SPIIRAS Proceedings, Vol. 49, No. 6, 2016, pp. 80-103. (VAK; RSCI impact factor – 0.359).
Ronzhin Al., Vatamaniuk I., Zelezny M. Implementation of Face Recognition Methods as a First Step for Human Behaviour Analysis in Intelligent Room. In Proc. 24th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision WSCG-2016 (poster proc.), Pilsen, Czech Republic, CSRN 2603, 2016, pp. 61-64.
Saveliev A., Saitov S., Vatamaniuk I., Basov O., Shilov N. Neural Network System for Monitoring State of an Optical Telecommunication System. In Proc. International Conference on Next Generation Wired/Wireless Networking NEW2AN-2016. Springer LNCS, Vol. 9870, 2016, pp. 39-49.
Gruber I., Hlaváč M., Hrúz M., Železný M., Karpov A. An Analysis of Visual Faces Datasets. In Proc. 1st International Conference on Interactive Collaborative Robotics ICR-2016, Budapest, Hungary, Springer LNCS, Vol. 9812, 2016, pp. 18-26.
Verkhodanova V., Ronzhin Al., Kipyatkova I., Ivanko D., Karpov A., Železný M. HAVRUS Corpus: High-Speed Recordings of Audio-Visual Russian Speech. In Proc. SPECOM-2016, Budapest, Hungary, Springer LNCS, Vol. 9811, 2016, pp. 338-345.
Vatamaniuk I., Levonevskiy D., Saveliev A., Denisov A. Scenarios of Multimodal Information Navigation Services for Users in Cyberphysical Environment. In Proc. SPECOM-2016, Budapest, Hungary, Springer LNCS, Vol. 9811, 2016, pp. 588-595.
Kipyatkova I., Karpov A. DNN-Based Acoustic Modeling for Russian Speech Recognition Using Kaldi. In Proc. SPECOM-2016, Budapest, Hungary, Springer LNCS, Vol. 9811, 2016, pp. 246-253.
Verkhodanova V., Shapranov V. Detecting Filled Pauses and Lengthenings in Russian Spontaneous Speech Using SVM. In Proc. 18th International Conference on Speech and Computer SPECOM-2016, Budapest, Hungary, Springer LNCS, Vol. 9811, 2016, pp. 224-231.
Karpov A., Ronzhin Al., Kipyatkova I., Ronzhin A., Verkhodanova V., Saveliev A., Zelezny M. Bimodal Speech Recognition Fusing Audio-Visual Modalities. In Proc. 18th International Conference on Human-Computer Interaction HCII-2016, Toronto, Canada, Springer LNCS, Vol. 9732, 2016, pp. 170-179.
Kaya H., Karpov A., Salah A. Robust Acoustic Emotion Recognition based on Cascaded Normalization and Extreme Learning Machines. In Proc. 13th International Symposium on Neural Networks ISNN-2016, St. Petersburg, Russia, Springer LNCS, Vol. 9719, 2016, pp. 115-123.
Kipyatkova I., Karpov A. Language Models with RNNs for Rescoring Hypotheses of Russian ASR. In Proc. 13th International Symposium on Neural Networks ISNN-2016, St. Petersburg, Russia, Springer LNCS, Vol. 9719, 2016, pp. 418-425. (Scopus SJR – 0.252).
Kaya H., Karpov A. Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks. In Proc. INTERSPEECH-2016, San Francisco, USA, 2016, pp. 2046-2050. (Scopus SJR – 0.275).
Karpov A., Kipyatkova I., Zelezny M. Automatic Technologies for Processing Spoken Sign Languages. In Proc. SLTU-2016, Indonesia. Procedia Computer Science. Elsevier, Vol. 81, 2016, pp. 201-207. (Scopus SJR – 0.314).
Verkhodanova V., Shapranov V. Experiments on detection of voiced hesitations in Russian spontaneous speech // Journal of Electrical and Computer Engineering. Hindawi, USA, Vol. 2016, 2016, Article ID 2013658. (Scopus SJR – 0.225).