Authors: Wenxue Huang; Yuanyi Pan; Jianhong Wu
Addresses: Department of Mathematics, Shantou University, Shantou, Guangdong 515063, China; Department of Mathematics and Statistics, York University, Toronto, Ontario, M3J 1P3, Canada; Department of Mathematics and Statistics, York University, Toronto, Ontario, M3J 1P3, Canada
Abstract: Motivated by the business interest in return on investment (ROI) in marketing, we develop a conceptual clustering algorithm for categorical data with a response variable, based on a variation of the Goodman-Kruskal measure. The key to this algorithm is an implicitly cost-effective dissimilarity measure derived from a probabilistic association rule between the response and the explanatory scenarios. Applications to a real dataset, FAMEX96, illustrate how useful information can be mined from marketing data using this dissimilarity measure.
Keywords: categorical data; supervised clustering; dissimilarity measures; decisive rules; Goodman-Kruskal measure; return on investment; ROI; scenario association; target variable; clustering algorithms; marketing data; data mining.
International Journal of Data Mining, Modelling and Management, 2012 Vol.4 No.4, pp.334 - 360
Received: 08 May 2021
Accepted: 12 May 2021
Published online: 18 Oct 2012