Title: Real-time sign language recognition based on video stream

Authors: Kai Zhao; Kejun Zhang; Yu Zhai; Daotong Wang; Jianbo Su

Addresses: Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Lingzhi High-Tech Corporation, Shanghai, 200240, China; Shanghai Lingzhi High-Tech Corporation, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China

Abstract: In this paper, a real-time Chinese sign language recognition system is investigated. The system recognises Chinese sign language used by deaf-mute people and outputs the recognition results as text in real time. A Chinese sign language dataset is first created with a normal RGB camera; the entire dataset contains 500,000 video samples. To improve the recognition accuracy of the system for real-time applications, a three-dimensional convolutional neural network (3D-CNN) is investigated, combined with optical flow processing based on total variation regularisation and the robust L1 norm (TV-L1). A two-step down-frame processing is employed to extract an equal number of key frames from each video stream, which are then fed into the 3D-CNN to extract feature vectors. Comparative studies against the hidden Markov model (HMM) and recurrent neural network (RNN) are conducted, with 92.6% recognition accuracy on a dataset containing 1,000 vocabulary items. A complete real-time sign language recognition system is finally developed and reported, composed of a human interaction interface, a motion detection module, a hand and head detection module, and a video acquisition mechanism. Experimental results verify the generalisation performance of the system in real time.
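
The abstract does not spell out how the two-step down-frame processing works, so the following is only a minimal sketch of one plausible reading: a fixed-stride pass to reduce the frame rate, followed by evenly spaced selection so that every clip yields the same number of frames. The function names, the stride of 2 and the target of 16 frames are illustrative assumptions, not values taken from the paper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def stride_downsample(frames: Sequence[T], stride: int) -> List[T]:
    """Step 1 (assumed): keep every `stride`-th frame to cut the frame rate."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return list(frames[::stride])


def pick_fixed_count(frames: Sequence[T], count: int) -> List[T]:
    """Step 2 (assumed): pick `count` frames at evenly spaced positions.

    Clips shorter than `count` repeat frames, so every clip yields
    exactly `count` frames regardless of its original length.
    """
    if count < 1:
        raise ValueError("count must be >= 1")
    if not frames:
        raise ValueError("frames must not be empty")
    n = len(frames)
    if count == 1:
        return [frames[n // 2]]
    # Map output slot i to an input index spread across [0, n - 1].
    return [frames[round(i * (n - 1) / (count - 1))] for i in range(count)]


def two_step_downframe(frames: Sequence[T], stride: int = 2, count: int = 16) -> List[T]:
    """Apply both assumed steps; `stride` and `count` are hypothetical defaults."""
    return pick_fixed_count(stride_downsample(frames, stride), count)
```

Calling `two_step_downframe(clip)` on clips of different lengths returns lists of identical length, which is the property a fixed-input 3D-CNN needs; the actual key-frame criterion used by the authors may differ.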

Keywords: sign language recognition; three-dimensional convolutional neural network; 3D-CNN; TV-L1 optical flow; motion detection; hand and head detection.

DOI: 10.1504/IJSCC.2021.114616

International Journal of Systems, Control and Communications, 2021 Vol.12 No.2, pp.158 - 174

Received: 21 Jul 2020
Accepted: 24 Nov 2020
Published online: 28 Apr 2021
