Title: Clustering mixed data using neighbourhood rough sets

Authors: Sharmila Banu Kather; B.K. Tripathy

Addresses: School of Computer Science and Engineering, VIT University, Vellore, Tamilnadu, India; School of Computer Science and Engineering, VIT University, Vellore, Tamilnadu, India

Abstract: Data of varied nature are generated in huge quantities every day. They range from tabulated and structured to semi-structured, and their attributes may be numerical or categorical. Data pre-processing presents data in a format suitable for applying analytics algorithms and deriving knowledge. Data analytics has revolutionised modern society by uncovering the knowledge and patterns mined from data. Clustering is an unsupervised learning technique whose popular algorithms are based on distance, density, dimensions and other functions. These algorithms operate on numerical attributes, and special algorithms for data involving categorical features have also been reported. In this paper we propose a straightforward way of clustering data involving both numerical and categorical features based on neighbourhood rough sets. It does not require the calculation of extra parameters such as entropy, saliency or dependency, nor does it call for discretisation of data. Hence its complexity is lower than that of algorithms proposed for categorical or mixed data, and it offers better efficiency.
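
Illustrative note: the abstract does not spell out the algorithm, so the following is only a minimal sketch of the general neighbourhood rough set idea for mixed data, not the authors' method. It assumes a common textbook definition in which two samples are neighbours when their categorical attributes match exactly and the Euclidean distance over their numerical attributes is at most a threshold. The function name `neighbourhood`, the threshold `delta` and the toy rows are all hypothetical.

```python
from math import sqrt

def neighbourhood(data, i, numeric_idx, categorical_idx, delta):
    """Return indices of rows in the delta-neighbourhood of row i.

    Assumed definition (not taken from the paper): a row is a neighbour if
    all categorical attributes are equal and the Euclidean distance over
    the numerical attributes is at most delta.
    """
    target = data[i]
    members = []
    for j, row in enumerate(data):
        same_category = all(row[k] == target[k] for k in categorical_idx)
        dist = sqrt(sum((row[k] - target[k]) ** 2 for k in numeric_idx))
        if same_category and dist <= delta:
            members.append(j)
    return members

# Hypothetical toy data: two numerical attributes and one categorical attribute.
rows = [
    (0.10, 0.20, "red"),
    (0.15, 0.25, "red"),
    (0.12, 0.22, "blue"),
    (0.90, 0.80, "red"),
]

# One neighbourhood granule per sample; no discretisation of the numerical
# attributes is needed to form them.
granules = [neighbourhood(rows, i, [0, 1], [2], 0.1) for i in range(len(rows))]
```

In this toy case the first two rows fall into each other's granules, while the "blue" row and the distant "red" row each form a granule of their own. How such granules are turned into final clusters is left open here, since the abstract does not describe that step.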

Keywords: clustering; mixed; categorical and numerical data; continuous data; rough sets; neighbourhood rough sets; granulation.

DOI: 10.1504/IJAIP.2020.104103

International Journal of Advanced Intelligence Paradigms, 2020 Vol.15 No.1, pp.1 - 16

Received: 18 Jun 2016
Accepted: 31 Aug 2016

Published online: 14 Dec 2019
