Authors: Tahleen A. Rahman; Dhruba K. Bhattacharyya
Addresses: Department of Computer Science & Engineering, Tezpur University, Assam, India; Department of Computer Science & Engineering, Tezpur University, Assam, India
Abstract: This paper studies and reports on a number of clustering methods introduced for analysing gene expression data to extract potential relationships among genes. An effective unsupervised method (TDAC) is proposed for the simultaneous detection of outliers and biologically relevant co-expressed patterns. The effectiveness of TDAC is established by comparison with competing algorithms over six publicly available benchmark gene expression datasets, in terms of both internal and external validity measures. The main attractions of TDAC are: (a) it does not require discretisation, (b) it is capable of identifying biologically relevant co-expressed gene patterns as well as outlier gene(s), (c) it is cost-effective in terms of time and space, (d) it does not require the number of clusters a priori, and (e) it is free from the restrictions of using any proximity measure.
Keywords: gene expression data; outlier detection; core genes; neighbour genes; connected genes; co-expressed gene patterns; attribute clustering; bioinformatics; outlier genes; discretisation.
International Journal of Bioinformatics Research and Applications, 2015 Vol.11 No.1, pp.45 - 71
Available online: 06 Feb 2015