Title: Enhanced semantic access to the protein engineering literature using ontologies populated by text mining

Authors: René Witte, Thomas Kappler, Christopher J.O. Baker

Addresses: Institute for Program Structures and Data Organization (IPD), Faculty of Informatics, Universität Karlsruhe (TH), Germany; Institute for Program Structures and Data Organization (IPD), Faculty of Informatics, Universität Karlsruhe (TH), Germany; Data Mining Department, Computing Division, Institute for Infocomm Research (I2R), Singapore

Abstract: The biomedical literature is growing at an ever-increasing rate, which underscores the need to support scientists with advanced, automated means of accessing knowledge. We investigate a novel approach that poses description logic (DL) queries to formal ontologies created from the results of text mining full-text research papers. In this paradigm, an OWL-DL ontology is populated with instances detected through natural language processing (NLP). The populated ontology can be queried by biologists using DL reasoners or integrated into bioinformatics workflows for further automated analyses. We demonstrate the feasibility of this approach with a system targeting the protein mutation literature.

Keywords: text mining; semantic web; ontological NLP; protein mutations; automated reasoning; bioinformatics; OWL-DL ontologies; description logics; protein engineering; biomedical literature; full-text papers; research papers; natural language processing; protein mutation; information retrieval.

DOI: 10.1504/IJBRA.2007.015009

International Journal of Bioinformatics Research and Applications, 2007 Vol.3 No.3, pp.389 - 413

Published online: 04 Sep 2007
