Authors: S. Selva Kumar; H. Hannah Inbarani
Addresses: Department of Computer Science, P.M.P. College of Arts and Science, Periyar University, Dharmapuri – 636705, India; Department of Computer Science, Periyar University, Salem-11, India
Abstract: Data mining has become an important topic in the effective analysis of gene expression data due to its wide application in the biomedical industry. A gene cluster is a set of two or more genes that encode the same or similar products. Gene clustering, the process of grouping related genes into the same cluster, is at the foundation of genomic studies that aim to analyse the function of genes. Several advanced techniques have been proposed for data clustering, and many of them have been applied to gene expression data with partial success. The goal of gene clustering is to identify important genes and perform cluster discovery on samples. This paper reviews three of the most representative off-line clustering techniques: fuzzy C-means clustering, hierarchical clustering, and mixed C-means clustering. These techniques are implemented and tested against a brain tumour gene expression dataset. The performance of the three techniques is compared based on 'goodness of clustering' evaluation measures, and mixed C-means shows better performance than the other two clustering techniques on the brain tumour gene expression data.
Keywords: fuzzy C-means clustering; FCM; hierarchical clustering; mixed C-means clustering; brain tumours; gene clustering; gene expression data; data mining; biomedicine; data clustering.
International Journal of Data Analysis Techniques and Strategies, 2013 Vol.5 No.2, pp.214 - 228
Received: 08 May 2021
Accepted: 12 May 2021
Published online: 04 May 2013