Authors: Huan Zhao; Jian Lu; Jie Cao
Addresses: School of Information Science and Engineering, Hunan University, Changsha, China (all authors)
Abstract: In short text conversation generation, the standard Seq2Seq neural network model tends to generate generic, safe responses (e.g., 'I don't know') regardless of the input. To address this problem, we propose a novel model that combines the standard Seq2Seq model with a BERT module (a pre-trained model) to improve response quality. Specifically, the encoder of the model is divided into two parts: one is the standard Seq2Seq encoder, which generates a context attention vector; the other is an improved BERT module, which encodes the input sentence into a semantic vector. A fusion unit then combines the vectors generated by the two parts into a new attention vector, which is transmitted to the decoder. In particular, we describe two ways of producing the new attention vector in the fusion unit. Empirical results from both automatic and human evaluations demonstrate that our model significantly improves the quality and diversity of the generated responses.
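The abstract does not specify the two fusion strategies, but the general idea of merging a context attention vector with a BERT semantic vector can be sketched as follows. Both fusion functions here (concatenation-plus-projection and element-wise gating) are hypothetical illustrations, not the paper's actual formulations; the names and dimensions are assumptions for the example.

```python
import numpy as np

def fuse_concat(context_vec, bert_vec, W):
    """Hypothetical fusion 1: concatenate both vectors, then
    project back to the original dimension with a learned matrix W."""
    return W @ np.concatenate([context_vec, bert_vec])

def fuse_gated(context_vec, bert_vec, gate):
    """Hypothetical fusion 2: element-wise gated interpolation
    between the two vectors (gate values lie in (0, 1))."""
    return gate * context_vec + (1.0 - gate) * bert_vec

rng = np.random.default_rng(0)
d = 4                                   # illustrative hidden size
ctx = rng.standard_normal(d)            # stands in for the Seq2Seq context attention vector
sem = rng.standard_normal(d)            # stands in for the BERT semantic vector
W = rng.standard_normal((d, 2 * d))     # projection for the concatenation variant
gate = 1.0 / (1.0 + np.exp(-rng.standard_normal(d)))  # sigmoid gate

fused_a = fuse_concat(ctx, sem, W)      # new attention vector, variant 1
fused_b = fuse_gated(ctx, sem, gate)    # new attention vector, variant 2
print(fused_a.shape, fused_b.shape)     # both have shape (d,)
```

Either fused vector would then replace the standard attention vector fed to the decoder at each step; in a real model, W and the gate would be trained jointly with the rest of the network.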
Keywords: Seq2Seq; short text conversation generation; BERT; attention mechanism; fusion unit.
International Journal of Computational Science and Engineering, 2020, Vol. 23, No. 2, pp. 136-144
Received: 13 Jan 2020
Accepted: 19 Jan 2020
Published online: 23 Oct 2020