Title: Extended K-modes with new weighted measure based on the domains

Authors: S. Aranganayagi, K. Thangavel

Addresses: J.K.K. Nataraja College of Arts and Science, Komarapalayam-638183, Namakkal District, Tamilnadu, India; Department of Computer Science, Periyar University, Salem-636011, Salem District, Tamilnadu, India

Abstract: K-modes is a scalable and efficient algorithm for clustering categorical data. The simple mismatching measure used in K-modes does not exploit the implicit relationships between attribute values. This paper presents new weighted measures based on the domain of attribute values. The proposed measures were evaluated on datasets obtained from the UCI data repository. External quality measures such as purity and F-measure are used to verify the quality of the clustering. The experimental results show that the proposed measures outperform the original K-modes.
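For readers unfamiliar with the terms in the abstract, the following is a minimal Python sketch of the simple mismatching measure used by standard K-modes and of the purity score. It is an illustration under stated assumptions, not the authors' implementation: the function names `simple_matching` and `purity` are hypothetical, and the paper's proposed weighted measures are not reproduced here because the abstract does not define them.

```python
from collections import Counter


def simple_matching(x, y):
    """Simple mismatching dissimilarity: the number of attributes
    on which two categorical records take different values."""
    if len(x) != len(y):
        raise ValueError("records must have the same number of attributes")
    return sum(a != b for a, b in zip(x, y))


def purity(cluster_labels, class_labels):
    """Purity: the fraction of records that belong to the majority
    class of the cluster they were assigned to."""
    if len(cluster_labels) != len(class_labels):
        raise ValueError("label sequences must have the same length")
    # Count class labels separately inside each cluster.
    by_cluster = {}
    for cluster, klass in zip(cluster_labels, class_labels):
        by_cluster.setdefault(cluster, Counter())[klass] += 1
    # Sum the size of the majority class over all clusters.
    majority_total = sum(max(c.values()) for c in by_cluster.values())
    return majority_total / len(class_labels)


if __name__ == "__main__":
    # Two made-up records (illustrative data, not from the paper).
    a = ("red", "small", "round")
    b = ("red", "large", "round")
    print(simple_matching(a, b))

    # Made-up cluster assignments and ground-truth classes.
    print(purity([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"]))
```

Every mismatch contributes exactly 1 to the simple measure regardless of which values differ, which is the limitation the abstract points to.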

Keywords: clustering; categorical data; distance measures; scalability; attribute values domain; weighted measures; partitional; K-modes.

DOI: 10.1504/IJDMMM.2010.033538

International Journal of Data Mining, Modelling and Management, 2010 Vol.2 No.3, pp.288 - 299

Published online: 04 Jun 2010
