Article: A short text conversation generation model combining BERT and context attention mechanism
Journal: International Journal of Computational Science and Engineering (IJCSE) 2020 Vol.23 No.2 pp.136 - 144

Title: A short text conversation generation model combining BERT and context attention mechanism

Authors: Huan Zhao; Jian Lu; Jie Cao

Addresses: School of Information Science and Engineering, Hunan University, Changsha, China (all authors)

Abstract: In short text conversation generation, the standard Seq2Seq neural network model tends to produce generic, safe responses (e.g., "I don't know") regardless of the input. To address this problem, we propose a novel model that combines the standard Seq2Seq model with a BERT module (a pre-trained model) to improve response quality. Specifically, the encoder is divided into two parts: a standard Seq2Seq encoder, which generates a context attention vector, and an improved BERT module, which encodes the input sentence into a semantic vector. A fusion unit then fuses the two vectors into a new attention vector, which is passed to the decoder. We describe two ways of computing the new attention vector in the fusion unit. Empirical results from automatic and human evaluations demonstrate that our model significantly improves the quality and diversity of the responses.
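
The data flow the abstract describes (a context attention vector and a BERT semantic vector merged by a fusion unit) can be illustrated with a minimal sketch. The abstract does not say what the two fusion methods are, so the two variants below (concatenate-and-project, and a gated sum) are assumptions chosen only for illustration, not the authors' method. All function names, dimensions, and random weights are hypothetical, and the `semantic` vector is a random stand-in for the output of a BERT encoder.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_attention(decoder_state, encoder_states):
    """Dot-product attention: a weighted sum of the encoder states."""
    weights = softmax([dot(decoder_state, h) for h in encoder_states])
    dim = len(decoder_state)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return context, weights


def fuse_concat(context, semantic, proj):
    """Assumed variant A: concatenate both vectors, project, apply tanh."""
    return [math.tanh(v) for v in matvec(proj, context + semantic)]


def fuse_gated(context, semantic, gate_w):
    """Assumed variant B: a sigmoid gate mixes the two vectors element-wise."""
    gate = [1.0 / (1.0 + math.exp(-v))
            for v in matvec(gate_w, context + semantic)]
    return [g * c + (1.0 - g) * s for g, c, s in zip(gate, context, semantic)]


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


random.seed(0)
DIM = 4  # toy hidden size; real models use hundreds of dimensions

encoder_states = [rand_vec(DIM) for _ in range(3)]  # Seq2Seq encoder outputs
decoder_state = rand_vec(DIM)                       # current decoder state
semantic = rand_vec(DIM)                            # stand-in for a BERT sentence vector
proj = [rand_vec(2 * DIM) for _ in range(DIM)]      # untrained projection weights
gate_w = [rand_vec(2 * DIM) for _ in range(DIM)]    # untrained gate weights

context, weights = context_attention(decoder_state, encoder_states)
fused_a = fuse_concat(context, semantic, proj)
fused_b = fuse_gated(context, semantic, gate_w)

print("attention weights:", weights)
print("fused (concat + projection):", fused_a)
print("fused (gated sum):", fused_b)
```

In both variants the fused vector has the same size as the original context attention vector, so under these assumptions it could be handed to a standard attention-based decoder in place of that vector.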

Keywords: Seq2Seq; short text conversation generation; BERT; attention mechanism; fusion unit.

DOI: 10.1504/IJCSE.2020.110536

International Journal of Computational Science and Engineering, 2020 Vol.23 No.2, pp.136 - 144

Received: 13 Jan 2020
Accepted: 19 Jan 2020
Published online: 23 Oct 2020
