Title: Multi-kernel LS-SVM based integration bio-clinical data analysis and application to ovarian cancer

Authors: Jaya Thomas; Lee Sael

Addresses: Department of Computer Science, SUNY Korea, Incheon, South Korea ' Department of Computer Science, SUNY Korea, Incheon, South Korea; Department of Computer Science, Stony Brook University, NY 11794, USA

Abstract: Medical research now makes it possible to acquire diverse data types from the same individual for a particular cancer. A major challenge is how to analyse these multiple data types integratively. In this paper, we introduce a multiple-kernel-based pipeline for the integrative analysis of four genomic data types and a set of clinical data. In the pipeline, a multiple kernel is generated from the weighted sum of individual kernels and is used to stratify patients and predict clinical outcomes. We apply the pipeline to ovarian cancer data from TCGA, examine the intra-subtype similarity of clinical factors, and calculate log-rank statistics to verify how well the subtypes cluster. We also examine the power of molecular and clinical data in predicting dichotomised overall survival and tumour grade. Integrating the various data types yields better stratification and higher prediction accuracy than using individual data types.
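
Illustration: the core idea in the abstract, a combined kernel formed as a weighted sum of per-data-type kernels and passed to an LS-SVM, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the RBF kernel, the `gamma` and `reg` values, the fixed weights, and all function names are hypothetical choices made for illustration, and the LS-SVM is written in its regression-style linear-system form on labels coded as -1/+1.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """RBF kernel matrix for one data type (kernel choice is an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices; weights are normalised to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

def lssvm_fit(K, y, reg=10.0):
    """Solve the LS-SVM linear system [[0, 1^T], [1, K + I/reg]] [b, alpha] = [0, y]."""
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / reg
    rhs = np.concatenate(([0.0], np.asarray(y, dtype=float)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def lssvm_predict(K_new_train, alpha, b):
    """Predict -1/+1 labels from kernel values between new and training samples."""
    return np.sign(K_new_train @ alpha + b)
```

A typical use would build one kernel per data type over the same patients, combine them with `combine_kernels`, and pass the result to `lssvm_fit`. The same combined kernel could also be handed to a kernel k-means routine for stratification. How the weights are chosen is not specified in the abstract, so they are left here as plain inputs.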

Keywords: integrative analysis; least squares multi-kernel; bio-clinical data; ovarian cancer; LS-SVM; kernel k-means; heterogeneous data; cancer stratification; prognostic prediction.

DOI: 10.1504/IJDMB.2017.089281

International Journal of Data Mining and Bioinformatics, 2017 Vol.19 No.2, pp.150 - 167

Received: 30 Aug 2017
Accepted: 05 Sep 2017

Published online: 11 Jan 2018
