Authors: Nitin Paharia; Rajesh Gupta; R.S. Jadon; S.K. Gupta
Addresses: Prestige Institute of Management, Gwalior, Madhya Pradesh, India; Prestige Institute of Management, Gwalior, Madhya Pradesh, India; Department of CSE and IT, MITS, Gwalior, Madhya Pradesh, India; SOS Computer Science and Application, Jiwaji University, Gwalior, Madhya Pradesh, India
Abstract: Recognising human activity in video is a challenging task because a video contains a large amount of information along with complex variations. Yoga-asana recognition is an instance of human activity recognition that has gained attention worldwide over the last decade. In this paper, we develop an appearance-based recognition system for yoga-asana in video. The system is implemented as an end-to-end deep learning pipeline comprising a convolutional neural network (CNN) and a bidirectional long short-term memory (LSTM) network. First, each video is down-sampled to 20 frames. Spatial features are then extracted from each frame and passed to the bidirectional LSTM, which learns the sequential information. Finally, a Softmax classifier is applied to the spatio-temporal representation of the video to assign it one of seven yoga-asana labels. For this study, we also created a customised dataset of seven yoga-asanas (Bhujangasana, CatCow, Trikonasana, Vrikshasana, Padmasana, Shavasana, and Tadasana). The system achieved an average test accuracy of 96.67% on the customised dataset in 20-fold cross-validation, which is comparable to related work.
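The first pipeline step, down-sampling each video to 20 frames, can be illustrated with a minimal sketch. The abstract does not state how the 20 frames are chosen; evenly spaced sampling is assumed here purely for illustration, and the function name `sample_frame_indices` is hypothetical rather than taken from the paper.

```python
def sample_frame_indices(total_frames: int, num_samples: int = 20) -> list[int]:
    """Return `num_samples` frame indices spread evenly over a video.

    ASSUMPTION: uniform temporal sampling. The paper's abstract only says
    each video is down-sampled to 20 frames, not which frames are kept.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if num_samples == 1:
        return [0]
    # Spacing between consecutive samples, so that the first index is 0
    # and the last index is total_frames - 1.
    step = (total_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


# Example: choose 20 frame indices from a 100-frame clip.
indices = sample_frame_indices(100)
```

Under this assumption, a clip with fewer than 20 frames yields repeated indices, so every video still maps to a fixed-length sequence that can be fed to the CNN and then to the bidirectional LSTM.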
Keywords: computer vision; convolutional neural network; CNN; long short-term memory; LSTM; human activity recognition; HAR; yoga-asana.
International Journal of Arts and Technology, 2021 Vol.13 No.3, pp.215-227
Received: 23 Feb 2021
Accepted: 10 Oct 2021
Published online: 04 Feb 2022