Title: Distance based knowledge retrieval through rule mining for complex biomarker recognition from tri-omics profiles

Authors: Saurav Mallik; Zhongming Zhao

Addresses: Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA

Abstract: Over the past two decades, biomarker discovery from complex biomedical data has become an important way to uncover significant new disease signals for diagnosis and treatment. Earlier methods were designed for a single genomic profile, and most of them use a single minimum support/confidence/lift cutoff. To overcome these shortcomings, we developed a framework for identifying complex markers using shortest-distance-based rule mining from tri-omics profiles (gene expression, methylation and protein-protein interaction). We applied our method to a multi-omics dataset of high-grade soft-tissue sarcomas. The novel markers were {GRB2-, STAT3-}, {STAT3+, TP53-, MAPK3+} and {STAT3+, FYN+, MAPK3+}, where '-' and '+' denote decreased and increased gene activity, respectively. Our method outperformed the other methods compared: it generated fewer rules and a lower mean shortest distance. Our method is therefore useful for extracting complex markers from tri-omics profiles of complex diseases.

Keywords: tri-omics data; multiple minimum supports/confidences/lifts; empirical Bayes test; weighted shortest distance; complex marker.

DOI: 10.1504/IJCBDD.2019.099758

International Journal of Computational Biology and Drug Design, 2019 Vol.12 No.2, pp.105 - 127

Received: 15 Mar 2018
Accepted: 27 May 2018

Published online: 21 May 2019
