Authors: Yali Zhang
Addresses: School of Music, Henan Polytechnic, Zhengzhou 450046, China
Abstract: Music emotion recognition suffers from problems such as a large root mean square error in the recognition results and a low Pearson correlation coefficient. The music signal is divided into frames by a window function, and noise in the signal is reduced by time-domain endpoint detection, which completes the preprocessing of the music signal. The characteristics of pitch change, gene rise and fall, speech speed and gene slope are extracted using Mel frequency cepstral coefficients. According to the extracted music emotion features, a multi-feature fusion kernel function is constructed. Based on the fusion results, a multi-level SVM emotion recognition model is built with the support vector machine to realise music emotion recognition. Experimental results show that the root mean square error of the proposed method always stays within 0.02, and the highest Pearson correlation coefficient is about 0.9.
Keywords: music signal; feature extraction; feature fusion; emotion recognition; support vector machine.
International Journal of Arts and Technology, 2022 Vol.14 No.1, pp.10 - 23
Received: 13 Apr 2021
Accepted: 10 Oct 2021
Published online: 21 Apr 2022
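
The abstract reports its results with two evaluation metrics, the root mean square error and the Pearson correlation coefficient. As an illustration only, the following minimal Python sketch shows how these two quantities are commonly computed for a list of predicted and reference emotion scores. The function names `rmse` and `pearson` and the sample values are assumptions made for this example; they are not taken from the paper, and the paper's own evaluation procedure may differ.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length score lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations of each list.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Hypothetical emotion scores, invented purely for illustration.
    reference = [0.10, 0.40, 0.35, 0.80]
    predicted = [0.12, 0.38, 0.36, 0.79]
    print("RMSE:", rmse(predicted, reference))
    print("Pearson r:", pearson(predicted, reference))
```

A lower RMSE means the predicted scores lie closer to the reference scores, while a Pearson coefficient near 1 means the predictions rise and fall together with the references.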