Title: Concepts of relative sample outlier (RSO) and weighted sample similarity (WSS) for improving performance of clustering genes: co-function and co-regulation

Authors: Anindya Bhattacharya; Nirmalya Chowdhury; Rajat K. De

Addresses: Department of Microbiology, Immunology and Biochemistry, Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, West Bengal, India; Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700108, West Bengal, India

Abstract: The performance of clustering algorithms depends largely on the similarity measure selected. Efficiency in handling outliers is a major contributor to the success of a similarity measure: the better a similarity measure captures similarity between genes in the presence of outliers, the better the clustering algorithm performs in forming biologically relevant groups of genes. In the present article, we discuss the problem of handling outliers with different existing similarity measures and introduce the concept of Relative Sample Outlier (RSO). We formulate a new similarity measure, called Weighted Sample Similarity (WSS), incorporate it into Euclidean distance and Pearson correlation coefficient, and then use these in various clustering and biclustering algorithms to group different gene expression profiles. Our results suggest that WSS improves the performance of all the considered clustering algorithms in terms of finding biologically relevant groups of genes.

Keywords: similarity measures; z-score; P-value; functional enrichment; transcription factors; relative sample outlier; RSO; weighted sample similarity; WSS; clustering genes; co-function; co-regulation; gene expression profiles; clustering algorithms; bioinformatics.

DOI: 10.1504/IJDMB.2015.067322

International Journal of Data Mining and Bioinformatics, 2015 Vol.11 No.3, pp.314 - 330

Received: 15 Mar 2012
Accepted: 21 Feb 2013

Published online: 05 Feb 2015
