Authors: J. Bagyamani; K. Thangavel; R. Rathipriya
Addresses: Department of Computer Science, Government Arts College 'Zion Villa', 181, Valluvar Nagar, Collectorate Post, Dharmapuri, Tamilnadu, 636705, India; Department of Computer Science, Periyar University, Salem, Tamilnadu, 636011, India; Department of Computer Science, Periyar University, Salem, Tamilnadu, 636011, India
Abstract: With the advent of the 'Age of Genomics', the generation, accumulation and analysis of gene expression datasets are growing. Biclustering, a relatively new unsupervised learning technique, allows the assignment of individual genes to multiple clusters. Our approach aims to detect significant biclusters from gene expression datasets using a heuristic algorithm called the hybrid genetic biclustering algorithm (HGBA). Most biclustering algorithms use a fitness function based on mean squared residue (MSR), which can extract only biclusters with a shifting pattern. In this paper, a novel correlation-based fitness function is defined in order to extract highly correlated biclusters. The proposed hybrid method, HGBA, fuses a genetic algorithm with simulated annealing in order to extract highly correlated biclusters with larger volume. Experiments conducted on benchmark datasets show that HGBA outperforms a pure GA and greedy algorithms in terms of volume and homogeneity.
Keywords: data mining; bioinformatics; biclustering; gene expression data analysis; greedy approach; genetic algorithms; GAs; simulated annealing; hybrid approach; metaheuristics optimisation; correlation based fitness function; unsupervised learning.
International Journal of Data Mining, Modelling and Management, 2013 Vol.5 No.4, pp.333 - 350
Received: 08 May 2021
Accepted: 12 May 2021
Published online: 18 Nov 2013
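
Illustrative note: the abstract contrasts MSR-based fitness, which captures only shifting patterns, with correlation-based fitness. The sketch below is a minimal illustration of that contrast and is not the authors' fitness function, which the abstract does not specify. It assumes the standard Cheng and Church definition of MSR and uses the mean absolute pairwise Pearson correlation between rows as a stand-in correlation score; the function names and the toy matrices are hypothetical.

```python
import numpy as np


def mean_squared_residue(bicluster):
    """MSR (Cheng and Church form): mean of squared residues
    b_ij - rowmean_i - colmean_j + overall_mean."""
    b = np.asarray(bicluster, dtype=float)
    row_means = b.mean(axis=1, keepdims=True)
    col_means = b.mean(axis=0, keepdims=True)
    residue = b - row_means - col_means + b.mean()
    return float((residue ** 2).mean())


def mean_abs_row_correlation(bicluster):
    """Assumed stand-in score: mean absolute Pearson correlation
    over all distinct pairs of rows (genes)."""
    b = np.asarray(bicluster, dtype=float)
    corr = np.corrcoef(b)  # rows are treated as variables
    upper = np.triu_indices(b.shape[0], k=1)  # distinct row pairs only
    return float(np.abs(corr[upper]).mean())


# Toy expression profile for one gene across four conditions (made-up values).
base = np.array([1.0, 2.0, 4.0, 3.0])

# Shifting pattern: each row is the base profile plus a constant.
shifting = np.vstack([base, base + 2.0, base + 5.0])

# Scaling pattern: each row is the base profile times a constant.
scaling = np.vstack([base, base * 2.0, base * 3.0])

for name, matrix in [("shifting", shifting), ("scaling", scaling)]:
    print(name,
          "MSR:", mean_squared_residue(matrix),
          "correlation:", mean_abs_row_correlation(matrix))
```

On these toy matrices the shifting bicluster has an MSR of zero (up to rounding), while the scaling bicluster has a clearly nonzero MSR even though its rows are perfectly correlated. This is the kind of pattern an MSR-only fitness function would penalise and a correlation-based one would accept.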