Authors: Vimal Mishra; R.B. Mishra
Addresses: Department of Computer Engineering, Institute of Technology, Banaras Hindu University, IT-BHU, Varanasi-221005, UP, India
Abstract: The performance evaluation of machine translation (MT) has proven to be a difficult problem. Automatic MT evaluation methods have become very popular, viz. bilingual evaluation understudy (BLEU), unigram precision, unigram recall, F-measure and the METEOR score. BLEU is a metric based on n-gram co-occurrence; precision, recall and F-measure are based on unigram matches; the METEOR score is based on explicit word-to-word matches between the translation and a reference translation (human judgement). In this paper, we evaluate our English to Sanskrit MT (EST) system with these evaluation methods and propose weighted BLEU, weighted unigram precision, weighted unigram recall, weighted F-measure and weighted METEOR scores, which assign different weights to the parts of speech (POS) in the translation and the reference translation. The proposed weighted methods improve the scores of the corresponding evaluation methods. The performance results of our EST system, with and without POS weighting, are presented in tables.
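The POS-weighting idea in the abstract can be illustrated with a minimal sketch of weighted unigram precision, recall and F-measure. The POS weight values and the clipped-matching scheme below are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch of POS-weighted unigram precision/recall/F-measure.
# The weight table and matching rule are assumptions for illustration only.
from collections import Counter

# Assumed weights: content words (nouns, verbs) count more than function words.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 1.0, "ADJ": 0.8, "ADV": 0.8, "OTHER": 0.5}

def weighted_scores(candidate, reference):
    """candidate/reference: lists of (token, pos_tag) pairs.

    Each matched unigram contributes its POS weight instead of 1,
    so mismatches on content words are penalised more heavily.
    Returns (weighted precision, weighted recall, weighted F-measure).
    """
    def total_weight(pairs):
        return sum(POS_WEIGHTS.get(pos, POS_WEIGHTS["OTHER"]) for _, pos in pairs)

    ref_counts = Counter(reference)  # clip matches to reference counts
    matched = 0.0
    for pair in candidate:
        if ref_counts[pair] > 0:
            ref_counts[pair] -= 1
            matched += POS_WEIGHTS.get(pair[1], POS_WEIGHTS["OTHER"])

    precision = matched / total_weight(candidate) if candidate else 0.0
    recall = matched / total_weight(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f

# Toy example: one content-word mismatch ("speaks" vs. "spoke").
cand = [("the", "OTHER"), ("king", "NOUN"), ("speaks", "VERB")]
ref = [("the", "OTHER"), ("king", "NOUN"), ("spoke", "VERB")]
p, r, f = weighted_scores(cand, ref)  # each score is 1.5 / 2.5 = 0.6
```

With uniform weights this reduces to ordinary unigram precision and recall; the assumed non-uniform weights make the verb mismatch cost more than a function-word mismatch would.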
Keywords: weighted BLEU; bilingual evaluation understudy; weighted unigram precision; weighted unigram recall; weighted F-measure; weighted METEOR score; performance evaluation; English to Sanskrit translation; machine translation.
International Journal of Computer Aided Engineering and Technology, 2012 Vol.4 No.4, pp.340 - 359
Available online: 12 Jul 2012