Title: Multi-task transfer learning for biomedical machine reading comprehension

Authors: Wenyang Guo; Yongping Du; Yiliang Zhao; Keyan Ren

Addresses: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China (all authors)

Abstract: Biomedical machine reading comprehension aims to extract the answer to a given question from complex biomedical passages, which requires the machine to have strong natural language comprehension ability. Recent progress has been made on this task, but it is still severely restricted by insufficient training data owing to the domain-specific nature of the field. To solve this problem, we propose a hierarchical question-aware context learning model trained by a multi-task transfer learning algorithm, which captures the interaction between the question and the passage layer by layer, with multi-level embeddings to strengthen the language representation ability. The multi-task transfer learning algorithm leverages the advantages of different machine reading comprehension tasks to improve model generalisation and robustness, pre-training on multiple large-scale open-domain data sets and fine-tuning on the target-domain training set. Moreover, data augmentation is adopted to create new training samples with varied expressions. The public biomedical data set collected from PubMed and provided by BioASQ is used to evaluate model performance. The results show that our method is superior to the best recent solution and achieves a new state of the art.
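As a rough illustration of the two-stage schedule described in the abstract (pre-training on several open-domain machine reading comprehension sets, then fine-tuning on the biomedical target set), the sketch below shows a minimal PyTorch-style span-extraction reader and training loop. The model, dataset placeholders and hyper-parameters are assumptions for illustration only, not the authors' released implementation.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class SpanReader(nn.Module):
    """Toy question-aware reader: encodes token ids and predicts start/end logits."""
    def __init__(self, vocab_size=30522, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.span_head = nn.Linear(2 * hidden, 2)   # start / end logits per token

    def forward(self, token_ids):
        states, _ = self.encoder(self.embed(token_ids))
        return self.span_head(states)               # (batch, seq_len, 2)

def run_epochs(model, loader, optimizer, epochs):
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for token_ids, starts, ends in loader:
            logits = model(token_ids)
            # Answer-span loss: cross-entropy over start and end positions.
            loss = loss_fn(logits[:, :, 0], starts) + loss_fn(logits[:, :, 1], ends)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

def fake_mrc_dataset(n=64, seq_len=48):
    # Placeholder tensors standing in for tokenised (question, passage) pairs
    # with gold answer-span start/end positions.
    ids = torch.randint(0, 30522, (n, seq_len))
    starts = torch.randint(0, seq_len, (n,))
    ends = torch.randint(0, seq_len, (n,))
    return TensorDataset(ids, starts, ends)

model = SpanReader()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stage 1: pre-train on multiple open-domain MRC sets (SQuAD-style data would
# stand in for the placeholders here).
for source_set in [fake_mrc_dataset(), fake_mrc_dataset()]:
    run_epochs(model, DataLoader(source_set, batch_size=16), optimizer, epochs=1)

# Stage 2: fine-tune on the target biomedical set (BioASQ-style data),
# typically with a reduced learning rate.
for group in optimizer.param_groups:
    group["lr"] = 3e-4
run_epochs(model, DataLoader(fake_mrc_dataset(n=32), batch_size=8), optimizer, epochs=2)
```

The same pattern extends to the multi-task case by interleaving batches from the different source tasks during stage 1, so the shared encoder benefits from all of them before fine-tuning on the small biomedical set.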

Keywords: biomedical machine reading comprehension; multi-task learning; transfer learning; attention; data augmentation.

DOI: 10.1504/IJDMB.2020.107878

International Journal of Data Mining and Bioinformatics, 2020 Vol.23 No.3, pp.234 - 250

Received: 28 Mar 2020
Accepted: 02 Apr 2020

Published online: 26 Jun 2020
