Authors: Michael K. Ng, Liping Jing
Addresses: Department of Mathematics, Centre for Mathematical Imaging and Vision, Hong Kong Baptist University, Kowloon Tong, Hong Kong; Department of Mathematics, Centre for Mathematical Imaging and Vision, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Abstract: This correspondence describes extensions to the fuzzy k-modes algorithm for clustering categorical data. We modify the simple matching dissimilarity measure for categorical objects so that the fuzzy k-modes paradigm yields clusters with strong intra-cluster similarity and can cluster large categorical data sets efficiently. We rigorously derive the updating formulas of the fuzzy k-modes clustering algorithm under the new dissimilarity measure and show the convergence of the algorithm under the optimisation framework. Experimental results illustrate that the new fuzzy k-modes algorithm is more effective than other existing k-modes algorithms.
Keywords: categorical data; clustering; data mining; fuzzy k-modes algorithm; dissimilarity measures.
International Journal of Granular Computing, Rough Sets and Intelligent Systems, 2009 Vol.1 No.1, pp.105 - 119
Published online: 24 Jun 2009
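For readers unfamiliar with the baseline that the abstract extends, the following is a minimal sketch of the unmodified simple matching dissimilarity together with the usual fuzzy c-means-style membership update. It is not the modified measure proposed in the article, which the abstract does not specify. The function names `simple_matching` and `fuzzy_memberships`, the parameter name `alpha`, and the tie-handling rule (full membership goes to the first coinciding mode) are assumptions made for illustration.

```python
def simple_matching(x, z):
    """Count the attributes on which two categorical objects differ."""
    if len(x) != len(z):
        raise ValueError("objects must have the same number of attributes")
    return sum(1 for a, b in zip(x, z) if a != b)


def fuzzy_memberships(x, modes, alpha=1.5):
    """Fuzzy membership of object x in each cluster, given the cluster modes.

    alpha > 1 is the fuzziness exponent (assumed name). Uses the plain
    simple matching dissimilarity, not the article's modified measure.
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    d = [simple_matching(x, z) for z in modes]

    # If x coincides with a mode, give it full membership there
    # (assumption: the first such mode wins when several coincide).
    zeros = [i for i, di in enumerate(d) if di == 0]
    if zeros:
        return [1.0 if i == zeros[0] else 0.0 for i in range(len(modes))]

    # Otherwise weight each cluster by inverse dissimilarity, then normalise.
    p = 1.0 / (alpha - 1.0)
    inv = [(1.0 / di) ** p for di in d]
    total = sum(inv)
    return [v / total for v in inv]


if __name__ == "__main__":
    obj = ("a", "b", "c")
    modes = [("a", "b", "d"), ("x", "y", "z")]
    print(fuzzy_memberships(obj, modes, alpha=2.0))
```

With `alpha=2.0` the object above differs from the first mode on one attribute and from the second on three, so it receives the larger membership in the first cluster. Larger values of `alpha` make the memberships more uniform.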