Title: A novel point density based validity index for clustering gene expression datasets

Authors: M. Arif Wani; Romana Riyaz

Addresses: Postgraduate Department of Computer Science, University of Kashmir, Jammu & Kashmir, India; Postgraduate Department of Computer Science, University of Kashmir, Jammu & Kashmir, India

Abstract: Elucidating the patterns hidden in gene expression data offers an opportunity to identify co-expressed genes and biologically relevant groupings of genes. However, the large number of genes and the complexity of biological networks make microarray data considerably harder to comprehend and interpret. A first step toward addressing this challenge is the use of clustering techniques. Validating the results obtained from a clustering algorithm is an important part of the clustering process. In this paper, we propose a new cluster validity index (the ARPoints index) for cluster validation. A new approach to determining the compactness and distinctness measures of clusters is presented. We revisit commonly known indices, compare them thoroughly with the proposed index, and provide a summary of the performance evaluation of the different indices. Experimental results show that the proposed index performs better than the commonly known cluster validity indices.

Keywords: clustering; cluster validation; compactness measure of clusters; distinctness measure of clusters; clustering gene data; gene expression analysis.

DOI: 10.1504/IJDMB.2017.084027

International Journal of Data Mining and Bioinformatics, 2017 Vol.17 No.1, pp.66-84

Received: 05 May 2016
Accepted: 08 Mar 2017

Published online: 03 May 2017
