Title: Gene selection and classification combining information gain ratio with fruit fly optimisation algorithm for single-cell RNA-seq data

Authors: Jie Zhang; Junhong Feng; Xiani Yang; Jianming Liu

Addresses: School of Computer Science and Engineering, Guangxi Colleges and Universities Key Lab of Complex System Optimization and Big Data Processing, Yulin Normal University, Yulin 537000, Guangxi, China; School of Computer Science and Engineering, Guangxi Colleges and Universities Key Lab of Complex System Optimization and Big Data Processing, Yulin Normal University, Yulin 537000, Guangxi, China; School of Computer Science and Engineering, Guangxi Colleges and Universities Key Lab of Complex System Optimization and Big Data Processing, Yulin Normal University, Yulin 537000, Guangxi, China; Business School, Yulin Normal University, Yulin 537000, Guangxi, China

Abstract: Single-cell data contain a large number of genes, but not all of them benefit classification. To eliminate these redundant genes and retain the informative ones, this study first uses information gain (IG) to select genes coarsely, then applies a modified fruit fly optimisation algorithm (FOA) to refine the selection of relevant genes from the subsets obtained by IG. The proposed algorithm, abbreviated as IGFOA, makes full use of the respective advantages of IG and FOA. It is evaluated on multiple scRNA-seq datasets with various numbers of cells and genes, and the results show that IGFOA can effectively select superior genes and achieve good classification performance.
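
The coarse IG step described in the abstract can be illustrated with a minimal sketch. It assumes that expression values have already been discretised (here, binarised) and that genes are ranked by the standard definition of information gain, H(labels) minus H(labels given the gene value). The function names, the toy matrix and the cut-off `k` are hypothetical and are not taken from the paper. The modified FOA refinement is not sketched, since the abstract does not give its details.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """IG of one discretised gene: H(labels) - H(labels | gene value)."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_genes(matrix, labels, k):
    """Rank genes (columns) by IG; return the k best indices and all scores."""
    n_genes = len(matrix[0])
    scores = [
        information_gain([row[j] for row in matrix], labels)
        for j in range(n_genes)
    ]
    ranked = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data (hypothetical): 4 cells x 3 genes, expression already binarised.
cells = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
cell_types = ["A", "A", "B", "B"]

selected, ig_scores = top_k_genes(cells, cell_types, k=1)
```

In this toy example only the first gene separates the two cell types, so it is the one retained. In a two-stage scheme of the kind the abstract describes, the retained subset would then be passed to the optimisation step for refinement.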

Keywords: single-cell; scRNA-seq; gene selection; fruit fly optimisation; information gain.

DOI: 10.1504/IJCSE.2021.118098

International Journal of Computational Science and Engineering, 2021 Vol.24 No.5, pp.495 - 504

Received: 14 Aug 2020
Accepted: 09 Jan 2021

Published online: 12 Oct 2021
