Title: Discovering implicit associations among critical biological entities

Authors: Kazuhiro Seki, Javed Mostafa

Addresses: Organization of Advanced Science and Technology, Kobe University, 1-1 Rokkodai, Nada, Kobe 657-8501, Japan; Laboratory of Applied Informatics Research, University of North Carolina at Chapel Hill, 216 Lenoir Drive, CB#3360, 100 Manning Hall, Chapel Hill, NC 27599-3360, USA

Abstract: We propose an approach to predicting implicit gene-disease associations based on an inference network in which genes and diseases are represented as nodes connected via two types of intermediate nodes: gene functions and phenotypes. To estimate the probabilities in the model, we compare two learning schemes: a baseline that uses co-annotations of keywords, and an alternative that takes advantage of free text. We also explore the use of domain ontologies to compensate for data sparseness and examine the impact of full-text documents. The validity of the proposed framework is demonstrated on a benchmark data set created from real-world data.
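
The layered structure described in the abstract (gene, gene function, phenotype, disease) can be illustrated with a minimal sketch. It assumes, purely for illustration, that a gene-disease score is obtained by summing products of conditional probabilities along every gene-to-disease path; the abstract does not state the exact scoring or estimation formulas, so this is not the authors' implementation. The function name, the dictionary layout, and all toy probabilities below are hypothetical.

```python
# Hypothetical sketch of a layered inference network:
# gene -> gene function -> phenotype -> disease.
# All identifiers and probabilities are invented for illustration.

def association_score(gene, disease, p_func_given_gene,
                      p_pheno_given_func, p_disease_given_pheno):
    """Sum probability products over all gene -> function -> phenotype -> disease paths."""
    score = 0.0
    for func, p_f in p_func_given_gene.get(gene, {}).items():
        for pheno, p_p in p_pheno_given_func.get(func, {}).items():
            # A missing phenotype-disease edge contributes zero.
            p_d = p_disease_given_pheno.get(pheno, {}).get(disease, 0.0)
            score += p_f * p_p * p_d
    return score

# Toy edge weights (assumed values, not taken from the paper).
p_func_given_gene = {"geneA": {"f1": 0.5, "f2": 0.5}}
p_pheno_given_func = {"f1": {"p1": 1.0}, "f2": {"p1": 0.5, "p2": 0.5}}
p_disease_given_pheno = {"p1": {"d1": 0.8}, "p2": {"d1": 0.2}}

score = association_score("geneA", "d1", p_func_given_gene,
                          p_pheno_given_func, p_disease_given_pheno)
print(round(score, 4))
```

In this toy setup a gene with no direct link to a disease still receives a nonzero score whenever some function-phenotype path connects them, which is the sense in which the association is implicit. How the edge probabilities are estimated (keyword co-annotations versus free text) is outside the scope of the sketch.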

Keywords: TDM; text data mining; literature-based discovery; information retrieval; inference networks; free text; full text; domain ontology; gene ontology; MeSH; bioinformatics; critical entities; implicit associations; biological entities; disease associations; gene-disease associations; gene functions; phenotypes.

DOI: 10.1504/IJDMB.2009.024846

International Journal of Data Mining and Bioinformatics, 2009 Vol.3 No.2, pp.105 - 123

Published online: 01 May 2009
