Title: Challenges in biological literature mining for online discovery of molecular interaction pathways

Authors: See-Kiong Ng, Soon-Heng Tan

Addresses: Knowledge Discovery Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore; Knowledge Discovery Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore

Abstract: The new challenge in post-genome research is to unravel the underlying interplay of bio-molecules as informative molecular interaction pathways. However, much of the molecular interaction information is currently contained in scientific journals. Despite previous accomplishments from the text mining community and increasing research activity in biological text mining, biologists still expend great effort on laborious hand-curation of the scientific literature to create quality online databases of bio-molecules and their interactions. In this paper, we examine why this is the case by reviewing the various challenges in mining biological literature for bio-molecular interaction pathways. We propose a methodology for training and evaluating biological literature-based data mining applications with annotated biological review papers. By laying out the various computational challenges, we hope to furnish a road map for the text-based data mining community to collectively solve this complex but increasingly important data mining task in bio-informatics.

Keywords: biological literature mining; bio-molecular interaction; pathway discovery; data mining; bio-informatics.

DOI: 10.1504/IJCAT.2006.011997

International Journal of Computer Applications in Technology, 2006 Vol.27 No.4, pp.259 - 269

Published online: 08 Jan 2007
