Title: IQRAM: a high dimensional data clustering technique

Authors: Dharmveer Singh Rajput; Pramod Kumar Singh; Mahua Bhattacharya

Addresses: ABV, Indian Institute of Information Technology and Management Gwalior, Morena Link Road, Gwalior – 474015, Madhya Pradesh, India (all authors)

Abstract: Clustering is the process of partitioning data objects into groups according to some similarity or dissimilarity measure, e.g., a distance criterion. In a high dimensional dataset, however, all objects are almost equidistant, so the distance criterion becomes meaningless and fails to group the objects. The literature presents numerous algorithms for clustering high dimensional datasets, which select the relevant dimensions and cluster the objects on the selected dimensions only. Because these algorithms produce different clustering results on the same dataset, it is unclear which algorithm to choose for a given high dimensional dataset. In this paper, we present a comparative study of conventional feature selection based clustering algorithms and propose a new feature selection based clustering method, IQRAM (inter quartile range and median based clustering of high dimensional dataset). We perform our experiments on two real datasets and analyse the clustering results using five well-known clustering quality measures and Student's t-test. The qualitative results show that IQRAM outperforms ten competitive clustering algorithms.
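
The claim that objects become almost equidistant in high dimensions can be illustrated with a small experiment. The sketch below is not part of the paper and does not implement IQRAM. It assumes points drawn uniformly from the unit hypercube and uses a hypothetical helper, `relative_contrast`, that reports how much the farthest point differs from the nearest one relative to the nearest distance.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points random points in the unit hypercube.

    Illustrative helper (an assumption, not taken from the paper).
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


# Compare a low dimensional and a high dimensional setting.
low = relative_contrast(n_points=200, n_dims=2)
high = relative_contrast(n_points=200, n_dims=1000)
print(f"relative contrast in    2 dimensions: {low:.3f}")
print(f"relative contrast in 1000 dimensions: {high:.3f}")
```

Under these assumptions the contrast is typically far smaller in 1000 dimensions than in 2, which is why clustering on a selected subset of relevant dimensions, as the surveyed algorithms do, can restore a meaningful distance criterion.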

Keywords: data clustering; high dimensional datasets; dimension reduction; feature extraction; feature selection; data mining; clustering algorithms.

DOI: 10.1504/IJKEDM.2012.051237

International Journal of Knowledge Engineering and Data Mining, 2012 Vol.2 No.2/3, pp.117 - 136

Published online: 13 Sep 2014