Title: Distance based knowledge retrieval through rule mining for complex biomarker recognition from tri-omics profiles
Authors: Saurav Mallik; Zhongming Zhao
Addresses: Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
Abstract: Over the past two decades, biomarker discovery from complex biomedical data has become an important means of uncovering new disease signals for diagnosis and treatment. Earlier methods were designed for a single genomic profile, and most of them use a single minimum support/confidence/lift cutoff. To overcome these shortcomings, we developed a framework for identifying complex markers using shortest-distance-based rule mining from tri-omics profiles (gene expression, methylation and protein-protein interaction). We applied our method to a high-grade soft-tissue sarcoma multi-omics dataset. The novel markers were {GRB2-, STAT3-}, {STAT3+, TP53-, MAPK3+} and {STAT3+, FYN+, MAPK3+}, where '-' and '+' denote decreased and increased gene activities, respectively. Our method outperformed existing methods in that it generated fewer rules and a lower mean shortest distance. It is therefore useful for extracting complex markers from tri-omics profiles of complex diseases.
Keywords: tri-omics data; multiple minimum supports/confidences/lifts; empirical Bayes test; weighted shortest distance; complex marker.
DOI: 10.1504/IJCBDD.2019.099758
International Journal of Computational Biology and Drug Design, 2019 Vol.12 No.2, pp.105 - 127
Received: 15 Mar 2018
Accepted: 27 May 2018
Published online: 21 May 2019