Title: Biomedical Relationship Extraction from literature based on bio-semantic token subsequences

Authors: Jayasimha R. Katukuri, Ying Xie, Vijay V. Raghavan

Addresses: EBay Inc., San Jose, CA 95125; The Department of Computer Science and Information Systems, Kennesaw State University, GA 30144, USA; The Center for Advanced Computer Studies, University of Louisiana at Lafayette, LA 70503, USA

Abstract: Relationship Extraction (RE) from biomedical literature is an important and challenging problem in both text mining and bioinformatics. Although various approaches have been proposed to extract protein–protein interaction types, their accuracy leaves considerable room for improvement. In this paper, two supervised learning algorithms based on the newly defined bio-semantic token subsequence are proposed for multi-class biomedical relationship classification. The first approach calculates a bio-semantic token subsequence kernel, whereas the second explicitly extracts weighted features from bio-semantic token subsequences. The two proposed approaches outperform several alternatives reported in the literature on multi-class protein–protein interaction classification.

Keywords: biomedical relationship extraction; bio-semantic token subsequences; bio-semantic token subsequence kernel; discriminative bio-semantic token subsequence classifier; biomedical text mining; functional informatics; bioinformatics; protein–protein interaction; classification.

DOI: 10.1504/IJFIPM.2010.033243

International Journal of Functional Informatics and Personalised Medicine, 2010 Vol.3 No.1, pp.16 - 28

Published online: 14 May 2010
