Title: Cross-corpus classification of affective speech

Authors: Imen Trabelsi; Med Salim Bouhlel

Addresses: Sciences and Technologies of Image and Telecommunications, SETIT, Sfax University, Sfax, Tunisia; Sciences and Technologies of Image and Telecommunications, SETIT, Sfax University, Sfax, Tunisia

Abstract: Automatic speech emotion recognition still has to overcome several obstacles before it can be employed in realistic situations. One of these barriers is the lack of suitable training data, in both quantity and quality. The aim of this study is to investigate the effect of cross-corpus data on the automatic classification of emotional speech. In this work, feature vectors composed of Mel frequency cepstral coefficients (MFCC) extracted from the speech signal are used to train support vector machines (SVM) and Gaussian mixture models (GMM). The study evaluates three emotional speech databases in three different languages (English, Polish and German) following three cross-corpus strategies. In the intra-corpus scenario, accuracies varied widely, between 70% and 87%. In the inter-corpus scenario, the average recall obtained is 70.87%. In the cross-corpus scenario, accuracies were below 50%.
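
As an illustration of the reported metric, the following minimal Python sketch computes an average recall and enumerates train/test pairings across corpora. It rests on assumptions not stated in the abstract: that "average recall" means per-class recall averaged with equal weight per emotion class, and that a cross-corpus run trains on one corpus and tests on a different one. The function names `average_recall` and `cross_corpus_pairs` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import permutations


def average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every emotion class counts equally."""
    hits = defaultdict(int)    # correctly recognised utterances per true class
    totals = defaultdict(int)  # number of utterances per true class
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def cross_corpus_pairs(corpora):
    """Every ordered (train, test) pair of distinct corpora (assumed protocol)."""
    return list(permutations(corpora, 2))


# Toy labels invented for illustration only; not data from the paper.
y_true = ["anger", "anger", "anger", "joy", "joy", "neutral"]
y_pred = ["anger", "anger", "joy", "joy", "neutral", "neutral"]

uar = average_recall(y_true, y_pred)
pairs = cross_corpus_pairs(["English", "Polish", "German"])
```

Averaging per class rather than per utterance keeps a frequent emotion from dominating the score, which matters when corpora differ in their class balance.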

Keywords: cross-corpus strategies; speech emotion recognition; Gaussian mixture models; GMM; support vector machines; SVM; Mel frequency cepstral coefficients; MFCC.

DOI: 10.1504/IJAIP.2022.124312

International Journal of Advanced Intelligence Paradigms, 2022 Vol.22 No.3/4, pp.229 - 239

Received: 27 Apr 2017
Accepted: 27 Apr 2017

Published online: 22 Jul 2022
