Title: Word sense-based approach for Hindi to Tamil machine translation using English as pivot language

Authors: K. Vimal Kumar; Divakar Yadav

Addresses: Department of Computer Science Engineering, Jaypee Institute of Information Technology, Noida-201 307, Uttar Pradesh, India (both authors)

Abstract: Machine translation is the automatic translation of text in a source language into a desired target language. Because resources on the internet are available in many different languages, there is a need to share that knowledge with audiences who know only their native language. The proposed system aims to provide word sense-based statistical machine translation from Hindi to Tamil. Since resources for these languages are scarce, an intermediate pivot language with high resource availability is needed, and English is chosen for this role. First, the Hindi text goes through a pre-processing phase in which it is morphologically and syntactically analysed. Next, the senses of the words are identified using latent semantic analysis (LSA) so that the translation is meaningful. Once these analyses are complete, the sentence is translated statistically from the source language to the target language through the intermediate pivot language. The system shows improved efficiency compared with a system that has neither sense identification nor a pivot language.
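The pivot step can be illustrated with a minimal sketch. It assumes a triangulation scheme, in which Hindi-to-English and English-to-Tamil word translation probabilities are multiplied and summed over the shared English word. The abstract does not say whether the system uses triangulation or a cascade of two translators, so this is an illustration and not the authors' method. The function names and all probabilities below are invented, and the words are romanised for readability.

```python
from collections import defaultdict

def triangulate(src_to_pivot, pivot_to_tgt):
    """Estimate p(target | source) as the sum over pivot words of
    p(pivot | source) * p(target | pivot)."""
    src_to_tgt = {}
    for src, pivots in src_to_pivot.items():
        scores = defaultdict(float)
        for pivot, p_pivot in pivots.items():
            # Pivot words with no target entry contribute nothing.
            for tgt, p_tgt in pivot_to_tgt.get(pivot, {}).items():
                scores[tgt] += p_pivot * p_tgt
        src_to_tgt[src] = dict(scores)
    return src_to_tgt

def best_translation(src_to_tgt, src):
    """Return the highest-scoring target word, or None if there is none."""
    candidates = src_to_tgt.get(src, {})
    return max(candidates, key=candidates.get) if candidates else None

# Toy tables with invented probabilities (assumption, not data from the paper).
# Hindi "kal" is ambiguous between "yesterday" and "tomorrow".
hindi_to_english = {"kal": {"yesterday": 0.6, "tomorrow": 0.4}}
english_to_tamil = {
    "yesterday": {"netru": 1.0},
    "tomorrow": {"naalai": 1.0},
}

hindi_to_tamil = triangulate(hindi_to_english, english_to_tamil)
choice = best_translation(hindi_to_tamil, "kal")
```

In this toy example the ambiguity of "kal" carries straight through the pivot, which is the kind of case where a sense identification step before translation could change the source-to-pivot weights in favour of the sense that fits the context.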

Keywords: statistical machine translation; word sense disambiguation; latent semantic analysis; LSA; pivot-based machine translation.

DOI: 10.1504/IJAIP.2018.095468

International Journal of Advanced Intelligence Paradigms, 2018 Vol.11 No.3/4, pp.284 - 298

Received: 14 Oct 2015
Accepted: 26 Feb 2016

Published online: 08 Oct 2018
