Title: Detecting audio adversarial examples for protecting speech-to-text transcription neural networks

Authors: Keiichi Tamura; Akitada Omagari; Hajime Ito; Shuichi Hashida

Addresses: Graduate School of Information Sciences, Hiroshima City University, 3-4-1, Ozuka-Higashi, Asa-Minami-Ku, Hiroshima 731-3194, Japan; Soliton Systems K.K., 2-4-3, Shinjuku, Shinjuku-ku, Tokyo 160-0022, Japan; Faculty of Information Sciences, Hiroshima City University, 3-4-1, Ozuka-Higashi, Asa-Minami-Ku, Hiroshima 731-3194, Japan; Graduate School of Information Sciences, Hiroshima City University, 3-4-1, Ozuka-Higashi, Asa-Minami-Ku, Hiroshima 731-3194, Japan

Abstract: With the increasing use of deep learning techniques in real-world applications, their vulnerabilities have received significant attention from deep learning researchers and practitioners. In particular, adversarial examples for deep neural networks, and protection methods against them, have been well studied in recent years because such examples are serious vulnerabilities that threaten safety in the real world. Audio adversarial examples, which are targeted attacks, are designed to make deep neural network-based speech-to-text systems mistranscribe input speech. In this study, we propose a new protection method against audio adversarial examples. The proposed method is based on a sandbox approach, in which an input voice sound is checked within the system to determine whether it is an audio adversarial example. To evaluate the proposed method, we used actual audio adversarial examples created on Deep Speech, a typical speech-to-text transcription neural network. The experimental results show that our protection method can detect audio adversarial examples with high accuracy.

Keywords: adversarial example; deep learning; computer security; data representation; speech-to-text; sandbox method.

DOI: 10.1504/IJCISTUDIES.2021.115427

International Journal of Computational Intelligence Studies, 2021 Vol. 10 No. 2/3, pp. 161-180

Received: 22 Mar 2020
Accepted: 25 Jun 2020

Published online: 19 May 2021
