Title: Mining hub genes from RNA-Seq gene expression data using biclustering algorithm

Authors: Ankush Maind; Shital Raut

Addresses: Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, Maharashtra, India; Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, Maharashtra, India

Abstract: Biclustering is a popular data mining technique for the analysis of gene expression data. Recently, multiple biclustering algorithms have been designed for finding co-expressed genes in microarray gene expression data. Microarray data has some drawbacks, and RNA-Seq, an advanced high-throughput technology, was introduced to overcome them. In this paper, we introduce a new approach for identifying hub genes from RNA-Seq data using a biclustering algorithm. For mining biclusters, the efficient 'runibic' biclustering algorithm is used. The 'runibic' algorithm performs well with respect to overlapping biclusters, noise, output stability, accuracy, large-scale data, and biological significance. For each significant bicluster, we construct a gene co-expression network (GCN). Each constructed GCN is then used to identify hub genes. The identified hub genes are specific to subsets of experimental conditions. The extracted hub genes can be useful in several clinical applications as prognostic or diagnostic markers of disease.
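
The last two steps of the pipeline described above (one bicluster, then a GCN, then hub genes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' method: the abstract does not state how GCN edges are defined or how hubs are ranked, so this sketch connects genes whose absolute Pearson correlation meets a hypothetical `corr_threshold` and ranks genes by node degree. The function name `hub_genes`, the toy matrix, and the gene labels are likewise hypothetical, and the biclustering step itself is assumed to have been run already.

```python
import numpy as np


def hub_genes(expr, gene_names, corr_threshold=0.8, top_k=3):
    """Rank the genes of one bicluster by degree in a thresholded GCN.

    expr: 2-D array, rows = genes of the bicluster, columns = its conditions.
    corr_threshold and degree-based ranking are illustrative assumptions.
    """
    expr = np.asarray(expr, dtype=float)
    corr = np.corrcoef(expr)                 # gene-by-gene Pearson correlation
    adj = np.abs(corr) >= corr_threshold     # edge if strongly co-expressed
    np.fill_diagonal(adj, False)             # ignore self-correlation
    degree = adj.sum(axis=1)                 # number of neighbours per gene
    order = np.argsort(-degree, kind="stable")[:top_k]
    return [(gene_names[i], int(degree[i])) for i in order]


# Toy bicluster submatrix (made-up values): 4 genes x 5 conditions.
toy_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],   # geneA: co-expressed with geneB and geneC
    [2.0, 1.5, 2.0, 3.5, 6.0],   # geneB
    [0.0, 2.5, 4.0, 4.5, 4.0],   # geneC
    [1.0, 3.0, 1.0, 3.0, 1.0],   # geneD: unrelated pattern
]
toy_names = ["geneA", "geneB", "geneC", "geneD"]

top_hub = hub_genes(toy_expr, toy_names, top_k=1)
print(top_hub)
```

In this toy input, geneA is the only gene linked to two others at the chosen threshold, so it ranks first. Because each bicluster covers only a subset of conditions, a ranking of this kind is specific to those conditions, which matches the condition-specific hub genes the abstract describes.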

Keywords: biclustering; RNA-Seq data; data mining; bioinformatics; gene co-expression network; hub gene; biomarker.

DOI: 10.1504/IJDMB.2019.099728

International Journal of Data Mining and Bioinformatics, 2019 Vol.22 No.2, pp.171 - 193

Available online: 18 May 2019
