Title: Unsupervised word translation for English-Hindi with different retrieval techniques

Authors: Umesh Pant; Shweta Chauhan; Pankaj Pant; Philemon Daniel

Addresses: Department of Electronics and Communication Engineering, National Institute of Technology Hamirpur, Himachal Pradesh, PinCode-177005, India (all four authors)

Abstract: Word translation, or the incorporation of bilingual dictionaries, is an important capability that affects many multilingual language processing tasks. Translation from one language to another has so far relied on either parallel data or bilingual dictionaries. In this paper, we address this problem and generate the best possible cross-lingual word embeddings for the English-Hindi language pair. We use neither a document-aligned or sentence-aligned corpus nor any bilingual dictionary. We follow the assumption of intra-lingual similarity distribution: for the most frequent words, the similarity distribution graph is similar between the Hindi and English corpora, and the embeddings are isometric. These cross-lingual word embeddings can be used for unsupervised neural machine translation and cross-lingual transfer learning. Different retrieval techniques (nearest neighbour, inverted nearest neighbour retrieval, inverted softmax, and cross-lingual word scaling) are applied and compared for the bilingual English-Hindi embedding, which is trained both in an unsupervised way and in a semi-supervised way by passing a seed dictionary. The bilingual word embedding is tested on a generated English-Hindi dictionary.

Keywords: machine translation; MT; word embedding; cross-lingual word embedding; nearest neighbour; unsupervised learning.

DOI: 10.1504/IJSI.2022.121095

International Journal of Swarm Intelligence, 2022 Vol.7 No.1, pp.53 - 65

Received: 19 Jun 2020
Accepted: 16 Dec 2020

Published online: 24 Feb 2022
