Title: Deep learning-based techniques to enhance the precision of phrase-based statistical machine translation system for Indian languages

Authors: J.P. Sanjanasri; M. Anand Kumar; K.P. Soman

Addresses: Centre for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita University, Coimbatore, Amrita Vishwa Vidyapeetham, India (all three authors)

Abstract: This paper focuses on improving an existing phrase-based statistical machine translation (PB-SMT) system by integrating deep learning into it. A deep learning-based PB-SMT system for Indian languages is developed in order to improve the conditional probabilities of the phrase table, and the existing back-off n-gram language model is replaced with a neural probabilistic language model (NPLM) to improve language model performance. The deep feature-based PB-SMT system is shown to perform better than the standard PB-SMT system. It is also shown that integrating manually created dictionaries, trained as a separate translation model, can enhance the results of the statistical machine translation system during decoding. For automatic evaluation, RIBES is shown to be a better evaluation metric for Indian languages than BLEU, the standard metric.

Keywords: Indian languages; PB-SMT; NPLM; deep belief network; DBN; pruning; MERT; BLEU; RIBES; deep learning.

DOI: 10.1504/IJCAET.2020.108106

International Journal of Computer Aided Engineering and Technology, 2020 Vol.13 No.1/2, pp.239 - 257

Received: 20 Sep 2017
Accepted: 20 Nov 2017

Published online: 03 Jul 2020
