Title: An open speech resource for Tibetan multi-dialect and multitask recognition

Authors: Yue Zhao; Xiaona Xu; Jianjian Yue; Wei Song; Xiali Li; Licheng Wu; Qiang Ji

Addresses: School of Information and Engineering, Minzu University of China, Beijing, China ' School of Information and Engineering, Minzu University of China, Beijing, China ' School of Information and Engineering, Minzu University of China, Beijing, China ' School of Information and Engineering, Minzu University of China, Beijing, China ' School of Information and Engineering, Minzu University of China, Beijing, China ' School of Information and Engineering, Minzu University of China, Beijing, China ' Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, USA

Abstract: This paper introduces a Tibetan multi-dialect speech data resource for multitask speech research. It can be used for Tibetan multi-dialect speech recognition, speaker recognition, dialect identification, and speech synthesis. The resource consists of 30 hours of Lhasa-Ü-Tsang dialect speech; 8.7 hours of Kham dialect speech, comprising 3.4 hours of Yushu dialect, 3.3 hours of Dege dialect, and 2 hours of Changdu dialect; and 10 hours of Amdo pastoral dialect speech. Additional resources are provided for the Lhasa-Ü-Tsang dialect, including a phoneme set, a pronunciation dictionary, and the code for building a Lhasa-Ü-Tsang speech recognition baseline system. For Tibetan multi-dialect and multitask speech recognition, the code and recognition results based on WaveNet with connectionist temporal classification (WaveNet-CTC) are also provided. All resources are publicly available and free for researchers, which helps address the shortage of public Tibetan multi-dialect speech resources and promotes the development of Tibetan multi-dialect speech processing technology.
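
As a quick consistency check on the figures in the abstract, the dialect hours can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary layout and variable names are assumptions made here and are not part of the released resource. The three Kham sub-dialects sum to the stated 8.7 hours, and the corpus as a whole comes to 48.7 hours, a total derived here from the abstract's figures rather than stated in it.

```python
from decimal import Decimal

# Hours per dialect as stated in the abstract. The structure and names below
# are illustrative assumptions, not taken from the released resource.
KHAM_SUBDIALECT_HOURS = {
    "Yushu": Decimal("3.4"),
    "Dege": Decimal("3.3"),
    "Changdu": Decimal("2"),
}

DIALECT_HOURS = {
    "Lhasa-Ü-Tsang": Decimal("30"),
    # Kham is the sum of its three sub-dialects.
    "Kham": sum(KHAM_SUBDIALECT_HOURS.values(), Decimal("0")),
    "Amdo pastoral": Decimal("10"),
}

# Decimal avoids binary floating-point rounding when adding tenths of hours.
total_hours = sum(DIALECT_HOURS.values(), Decimal("0"))

for name, hours in DIALECT_HOURS.items():
    share = hours / total_hours * 100
    print(f"{name}: {hours} h ({share:.1f}% of the corpus)")
print(f"Total: {total_hours} h")
```

The per-dialect shares make the imbalance explicit: Lhasa-Ü-Tsang accounts for well over half of the recorded speech, which is worth keeping in mind when splitting data for multi-dialect or dialect-identification experiments.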

Keywords: Tibetan language; multi-dialect speech recognition; multitask learning; speech corpus.

DOI: 10.1504/IJCSE.2020.107351

International Journal of Computational Science and Engineering, 2020 Vol.22 No.2/3, pp.297 - 304

Received: 09 Apr 2019
Accepted: 04 Jul 2019

Published online: 18 May 2020
