Title: Biomedical named entity recognition based on recurrent neural networks with different extended methods

Authors: Dingxin Song; Lishuang Li; Liuke Jin; Degen Huang

Addresses: School of Computer Science and Technology, Dalian University of Technology, No. 2, Linggong Road, Hi-Tech Zone, Dalian 116024, China (all four authors)

Abstract: Biomedical Named Entity Recognition (bio-NER) has become essential to text mining and knowledge discovery tasks in the biomedical field. However, the performance of traditional NER systems is limited by their reliance on complex hand-designed features, which are derived from various linguistic analyses and may only be adapted to a specific domain. In this paper, we focus on building a simple and efficient system for bio-NER based on a Recurrent Neural Network (RNN), in which complex hand-designed features are replaced with word embeddings. Furthermore, the system is extended with the predicted information from the prior node and with external context information (topical information and clustering information). During the training process, the word embeddings are fine-tuned by the neural network. Experiments conducted on the BioCreative II GM data set demonstrate that the RNN models outperform the CRF model and Deep Neural Networks (DNNs), and that the extended RNN model performs better than the original RNN, achieving an F-score of 82.47%.
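
The general idea in the abstract (word embeddings fed to a recurrent layer, with the previous node's prediction fed back as an extra input) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TinyRnnTagger`, the BIO label set, the layer sizes, and the random untrained weights are all assumptions made for illustration, and the topical and clustering features as well as training and embedding fine-tuning are omitted.

```python
import math
import random

LABELS = ["B", "I", "O"]  # assumed BIO tagging scheme, not taken from the paper


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyRnnTagger:
    """Hypothetical Elman-style RNN tagger with previous-prediction feedback."""

    def __init__(self, vocab, emb_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        # One embedding row per known word, plus a final row for unknown words.
        self.emb = make_matrix(len(vocab) + 1, emb_dim, rng)
        self.w_in = make_matrix(hidden_dim, emb_dim, rng)        # embedding -> hidden
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)    # hidden -> hidden
        self.w_prev = make_matrix(hidden_dim, len(LABELS), rng)  # previous prediction -> hidden
        self.w_out = make_matrix(len(LABELS), hidden_dim, rng)   # hidden -> label scores
        self.hidden_dim = hidden_dim

    def tag(self, tokens):
        hidden = [0.0] * self.hidden_dim
        prev_probs = [0.0] * len(LABELS)  # no prediction before the first token
        tags = []
        for token in tokens:
            x = self.emb[self.vocab.get(token, len(self.vocab))]
            pre = [
                a + b + c
                for a, b, c in zip(
                    matvec(self.w_in, x),
                    matvec(self.w_rec, hidden),
                    matvec(self.w_prev, prev_probs),
                )
            ]
            hidden = [math.tanh(p) for p in pre]
            probs = softmax(matvec(self.w_out, hidden))
            tags.append(LABELS[probs.index(max(probs))])
            prev_probs = probs  # feed this prediction to the next step
        return tags


tagger = TinyRnnTagger(["the", "BRCA1", "gene"])
print(tagger.tag(["the", "BRCA1", "gene", "mutation"]))
```

Because the weights are untrained, the printed tags carry no meaning; the sketch only shows how the embedding lookup, the recurrent state, and the fed-back prediction combine at each token.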

Keywords: bio-NER; recurrent neural networks; hand-designed features; word embeddings; context information; biomedical NER; named entity recognition; data mining; text mining; knowledge discovery; bioinformatics.

DOI: 10.1504/IJDMB.2016.079799

International Journal of Data Mining and Bioinformatics, 2016 Vol.16 No.1, pp.17 - 31

Received: 29 Mar 2016
Accepted: 07 May 2016

Published online: 14 Oct 2016
