Title: Correlation maximisation-based discretisation for supervised classification

 

Authors: Qiusha Zhu; Lin Lin; Mei-Ling Shyu

 

Addresses:
Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33124, USA.
Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33124, USA.
Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33124, USA.

 

Journal: Int. J. of Business Intelligence and Data Mining, 2012 Vol.7, No.1/2, pp.40 - 59

 

Abstract: This paper proposes a novel supervised discretisation algorithm based on Correlation Maximisation (CM) using Multiple Correspondence Analysis (MCA), an effective technique for capturing the correlation between multiple variables. For each numeric feature, the proposed algorithm uses MCA to measure the correlations between feature intervals (items) and classes, and the set of cut-points yielding the maximum correlation is chosen as the discretisation scheme for that feature. The discretised feature therefore not only gives a concise summarisation of the original numeric feature but also provides the maximum correlation information for predicting class labels. The proposed algorithm is compared with seven state-of-the-art supervised discretisation algorithms using six well-known classifiers on 19 UCI data sets. Experimental results demonstrate that it automatically generates a set of features (feature intervals) that produce the best classification results on average.
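
The idea of choosing cut-points by maximising the correlation between feature intervals and classes can be illustrated with a minimal sketch. This is not the paper's algorithm: MCA is replaced here by Cramér's V computed on the interval-by-class contingency table, and only a single cut-point is searched rather than a set of cut-points. The function names `cramers_v` and `best_cut_point` and the toy data are hypothetical, introduced for illustration only.

```python
from collections import Counter
from math import sqrt


def cramers_v(intervals, labels):
    """Cramér's V between two categorical sequences.

    Stand-in association measure (an assumption of this sketch);
    the paper measures correlation with MCA instead.
    """
    n = len(labels)
    joint = Counter(zip(intervals, labels))
    row = Counter(intervals)
    col = Counter(labels)
    k = min(len(row), len(col)) - 1
    if k == 0:
        # Only one interval or one class: no association can be measured.
        return 0.0
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def best_cut_point(values, labels):
    """Return (cut, score) for the single cut-point with the highest score.

    Candidate cut-points are midpoints between consecutive distinct values.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = (None, 0.0)
    for cut in candidates:
        # Discretise into two intervals: value <= cut and value > cut.
        intervals = [v > cut for v in values]
        score = cramers_v(intervals, labels)
        if score > best[1]:
            best = (cut, score)
    return best


# Toy data (hypothetical): low values belong to class "a", high values to "b".
values = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 7.5]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
cut, score = best_cut_point(values, labels)
print(cut, score)
```

On this toy data the selected cut-point is the midpoint between the two class clusters, where the two intervals separate the classes perfectly. Extending the sketch to several cut-points per feature would require a search strategy that the abstract does not specify.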

 

Keywords: discretisation; supervised classification; MCA; multiple correspondence analysis; correlation maximisation; feature intervals.

 

DOI: http://dx.doi.org/10.1504/IJBIDM.2012.048727

 

Available online 23 Aug 2012

 

 
