Title: An empirical study of statistical language models: n-gram language models vs. neural network language models

Authors: Freha Mezzoudj; Abdelkader Benyettou

Addresses: Université des Sciences et de la Technologie d'Oran Mohamed Boudiaf, BP 1505 El Mnaouer, 31000, Oran, Algeria; Université Hassiba Benbouali Chlef, Ouled Fares, 02000, Chlef, Algeria; Signal Image et Parole (SIMPA) Laboratory, Université des Sciences et de la Technologie d'Oran Mohamed Boudiaf, BP 1505 El Mnaouer, 31000, Oran, Algeria

Abstract: Statistical language models are an important module in many successful applications such as speech recognition and machine translation, and n-gram models have long been the state of the art. However, because of data sparsity, the modelled language cannot be completely represented by an n-gram language model: if new words appear at the recognition or translation stage, a smoothing method is needed to redistribute probability mass to unseen events. More recently, neural networks have been used to model language by projecting words onto a continuous space and performing the probability estimation in that space. In this experimental work, we compare the behaviour of the most popular smoothing methods for statistical n-gram language models against neural network language models, in different situations and with different parameters. The language models are trained on two corpora of French and English texts. The recurrent neural network language models achieve good empirical results.
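The smoothing methods studied in the paper (Kneser-Ney, modified Kneser-Ney, Good-Turing) are more sophisticated than what fits in a few lines, but the core idea the abstract describes, reserving probability mass for unseen n-grams, can be sketched with a toy bigram model using linear interpolation. All names and the interpolation weight `lam` below are illustrative assumptions, not taken from the paper:

```python
from collections import Counter

def train_bigram_lm(tokens, lam=0.7):
    """Return a conditional probability function p(word | prev) for a toy
    bigram LM, interpolated with a smoothed unigram model so that unseen
    bigrams (and unknown words) still receive non-zero probability."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)
    vocab_size = len(unigrams)

    def prob(prev, word):
        # Maximum-likelihood bigram estimate (zero for unseen bigrams).
        p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
        # Add-one-smoothed unigram estimate; the extra +1 in the denominator
        # reserves mass for out-of-vocabulary words.
        p_uni = (unigrams[word] + 1) / (total + vocab_size + 1)
        # Linear interpolation: unseen bigrams back off to the unigram term.
        return lam * p_bi + (1 - lam) * p_uni

    return prob

p = train_bigram_lm("the cat sat on the mat".split())
```

Here an observed bigram such as ("the", "cat") scores higher than an unseen one such as ("the", "dog"), yet the unseen bigram still gets a small positive probability, which is the behaviour smoothing methods are designed to provide.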

Keywords: language models; n-grams; Kneser-Ney smoothing; modified Kneser-Ney smoothing; Good-Turing smoothing; interpolation; back-off; feed-forward neural networks; continuous space language models; CSLM; recurrent neural networks; RNN; speech recognition; machine translation.

DOI: 10.1504/IJICA.2018.095762

International Journal of Innovative Computing and Applications, 2018 Vol.9 No.4, pp.189 - 202

Received: 22 Oct 2016
Accepted: 11 Jul 2017

Published online: 22 Oct 2018
