Title: Sparse superlayered neural network-based multi-omics cancer subtype classification

Authors: Prasoon Joshi; Seokho Jeong; Taesung Park

Addresses: Department of Biotechnology and Biochemical Engineering, Indian Institute of Technology, Kharagpur, India; Department of Statistics, Seoul National University, Gwanak-gu, Seoul, South Korea; Department of Statistics, Seoul National University, Gwanak-gu, Seoul, South Korea

Abstract: Targeted treatment of different cancer subtypes has recently attracted growing interest. To that end, we present a new deep neural network model, Sparse CRossmodal Superlayered Neural Network (SCR-SNN), for integrating high-dimensional RNA sequencing data with DNA methylation data. Our model consists of the following steps: (1) biomarker filtration; (2) biomarker selection, using a cross-modal, superlayered neural network with an L1 penalty; (3) integration of selected biomarkers from gene expression and DNA methylation data; and (4) prediction model building. For comparison, machine learning methods were used, alone and in combination. In these analyses, SCR-SNN was applied to gene expression and methylation data of lung adenocarcinoma and squamous cell lung carcinoma from The Cancer Genome Atlas. The SCR-SNN model classified lung cancer subtypes well, using only a small number of markers. This approach represents a promising methodology for disease categorisation and diagnosis.
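
The four steps listed in the abstract can be illustrated with a minimal sketch. This is not the authors' SCR-SNN: the cross-modal superlayered neural network is replaced here by a plain L1-penalised logistic regression fitted by proximal gradient descent, the filtration criterion (variance ranking) is an assumption, and the data are synthetic. All function names and parameter values (`variance_filter`, `l1_logistic`, `select_features`, `demo`, `keep`, `lam`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def variance_filter(X, keep):
    """Step 1 (assumed criterion): indices of the `keep` highest-variance columns."""
    idx = np.argsort(X.var(axis=0))[::-1][:keep]
    return np.sort(idx)

def standardize(X):
    """Centre each column and scale it to unit standard deviation."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def l1_logistic(X, y, lam=0.05, lr=0.1, n_iter=500):
    """L1-penalised logistic regression via proximal gradient descent (ISTA).

    Stand-in for the sparse neural network; the intercept is not penalised.
    """
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w = w - lr * (X.T @ (prob - y) / n)
        b = b - lr * np.mean(prob - y)
        # Soft-thresholding: proximal operator of lam * ||w||_1.
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w, b

def select_features(X, y, keep, lam):
    """Steps 1-2 for one modality: filter, then keep features with non-zero weight."""
    filt = variance_filter(X, keep)
    w, _ = l1_logistic(standardize(X[:, filt]), y, lam=lam)
    return filt[np.abs(w) > 0]

def demo(seed=0):
    """Run the sketch on synthetic 'expression' and 'methylation' matrices."""
    rng = np.random.default_rng(seed)
    n = 200
    y = rng.integers(0, 2, size=n).astype(float)   # two synthetic subtypes
    expr = rng.normal(size=(n, 50))
    meth = rng.normal(size=(n, 40))
    expr[:, 3] += 2.0 * y                          # planted informative marker
    meth[:, 7] += 2.0 * y                          # planted informative marker
    sel_e = select_features(expr, y, keep=30, lam=0.05)
    sel_m = select_features(meth, y, keep=30, lam=0.05)
    # Step 3: integrate the selected markers from both modalities.
    Z = standardize(np.hstack([expr[:, sel_e], meth[:, sel_m]]))
    # Step 4: fit an unpenalised classifier on the integrated markers.
    w, b = l1_logistic(Z, y, lam=0.0)
    accuracy = float(np.mean((Z @ w + b > 0).astype(float) == y))
    return sel_e, sel_m, accuracy

if __name__ == "__main__":
    sel_e, sel_m, acc = demo()
    print("expression markers:", sel_e)
    print("methylation markers:", sel_m)
    print("training accuracy:", acc)
```

The accuracy returned by `demo` is measured on the training samples only; a real evaluation would use held-out data or cross-validation.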

Keywords: classification; machine learning; multi-omics data; RNA-sequencing data.

DOI: 10.1504/IJDMB.2020.109500

International Journal of Data Mining and Bioinformatics, 2020 Vol.24 No.1, pp.58 - 73

Received: 27 Mar 2020
Accepted: 30 Mar 2020

Published online: 10 Sep 2020
