Title: A semi-supervised, weighted pattern-learning approach for extraction of gene regulation relationships from scientific literature

Authors: Yi-Tsung Tang; Hung-Yu Kao; Shaw-Jenq Tsai; Hei-Chia Wang

Addresses: Department of Computer Science and Information Engineering, National Cheng Kung University, 701, Taiwan; Department of Computer Science and Information Engineering, National Cheng Kung University, 701, Taiwan; Department of Physiology, College of Medicine, National Cheng Kung University, 701, Taiwan; Institute of Information Management, National Cheng Kung University, 701, Taiwan

Abstract: The volume of textual knowledge in the existing biomedical literature is growing rapidly, and manually creating patterns from the available literature is becoming more difficult. There is therefore an increasing demand to extract potential generic regulatory relationships from unlabelled data sets. In this paper, we describe a Semi-Supervised, Weighted Pattern Learning method (SSWPL) to extract such generic regulatory information from the literature. Starting from predefined initial patterns, SSWPL builds new regulatory patterns from unlabelled data in the literature. These constructed regulatory patterns are then used to extract generic regulatory information from PubMed abstracts. The results demonstrate that the method can effectively extract generic regulatory relationships from the literature using learned, weighted patterns obtained through semi-supervised pattern learning.

Keywords: text mining; gene regulation relationships; semi-supervised learning; pattern learning; biomedical literature; bioinformatics; unlabelled data sets; PubMed.

DOI: 10.1504/IJDMB.2014.062147

International Journal of Data Mining and Bioinformatics, 2014 Vol.9 No.4, pp.401 - 416

Accepted: 29 Apr 2011
Published online: 21 Oct 2014
