Authors: S. Aranganayagi, K. Thangavel
Addresses: J.K.K. Nataraja College of Arts and Science, Komarapalayam-638183, Namakkal District, Tamilnadu, India; Department of Computer Science, Periyar University, Salem-636011, Salem District, Tamilnadu, India
Abstract: K-modes is a scalable and efficient algorithm for clustering categorical data. The simple mismatching measure used in K-modes does not exploit the implicit relationships between attribute values. This paper presents new weighted measures based on the domain of attribute values. The proposed measures were evaluated on datasets obtained from the UCI data repository. External quality measures such as purity and F-measure are used to verify the quality of the clustering. The experimental results show that the proposed measures are superior to those of the original K-modes.
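For readers unfamiliar with the baseline, the following is a minimal sketch of the simple mismatching measure and the purity score mentioned in the abstract. It does not reproduce the paper's weighted measures, which are not specified here. The function names `simple_matching`, `cluster_mode` and `purity` are hypothetical, and the sample records are made up for illustration.

```python
from collections import Counter


def simple_matching(x, y):
    """Count the attributes on which two categorical records differ."""
    if len(x) != len(y):
        raise ValueError("records must have the same number of attributes")
    return sum(a != b for a, b in zip(x, y))


def cluster_mode(records):
    """Return the per-attribute most frequent value of a non-empty cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def purity(labels_true, labels_pred):
    """Fraction of records that belong to the majority class of their cluster."""
    clusters = {}
    for true_label, cluster_id in zip(labels_true, labels_pred):
        clusters.setdefault(cluster_id, []).append(true_label)
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in clusters.values()
    )
    return majority_total / len(labels_true)


# Illustrative (made-up) records: every mismatch costs 1, regardless of
# which attribute differs or how many values that attribute can take.
a = ("red", "small", "round")
b = ("red", "large", "round")
distance = simple_matching(a, b)
```

Because every mismatch contributes the same cost, this baseline ignores how many distinct values each attribute can take, which is the limitation the abstract attributes to the original K-modes.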
Keywords: clustering; categorical data; distance measures; scalability; attribute values domain; weighted measures; partitional; K-modes.
International Journal of Data Mining, Modelling and Management, 2010 Vol.2 No.3, pp.288 - 299
Published online: 04 Jun 2010