Authors: Soma Chatterjee; Kamal Sarkar
Addresses: Computer Science and Engineering Department, Jadavpur University, Kolkata – 700 032, India; Computer Science and Engineering Department, Jadavpur University, Kolkata – 700 032, India
Abstract: Name transliteration plays an important role in automatic machine translation and cross-lingual information retrieval systems, because these systems cannot directly translate out-of-vocabulary (OOV) words. This article presents an SVM-based approach to name transliteration. The approach treats transliteration as a multi-class pattern classification problem, where the input is a source transliteration unit (a chunk of source graphemes) and the classes are the distinct transliteration units (chunks of target graphemes) in the target language. The article also presents a study of the hidden Markov model (HMM) applied to machine transliteration viewed as a sequence learning problem. Both forward (Bengali-to-English) and backward name transliteration are considered. The proposed methods are compared with an existing transliteration method that uses a modified version of the joint source-channel model. The evaluation shows that the proposed SVM-based model gives the best results among the systems compared. The experiments also show that the performance of the HMM-based system is comparable to that of the SVM-based system.
Keywords: name transliteration; support vector machines; hidden Markov model; HMM; modified joint-source channel model; machine transliteration; machine translation.
International Journal of Advanced Intelligence Paradigms, 2021 Vol.19 No.1, pp.3 - 27
Received: 23 Aug 2017
Accepted: 20 Dec 2017
Published online: 28 Apr 2021
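
The classification framing described in the abstract, where each source transliteration unit is the input and the target transliteration unit is the class label, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy aligned names, the context features (previous and next source unit), and the function names are hypothetical and are not taken from the article. A simple count-based lookup with back-off stands in for the SVM, so this shows only how the problem is set up, not the authors' model.

```python
from collections import Counter, defaultdict

# Toy aligned training data (hypothetical, not the article's corpus).
# Each name is a list of (source_unit, target_unit) pairs.
TRAIN = [
    [("ক", "ka"), ("ম", "ma"), ("ল", "l")],               # Kamal
    [("স", "sa"), ("র", "r"), ("কা", "ka"), ("র", "r")],  # Sarkar
    [("র", "ra"), ("মা", "ma")],                          # Rama
]


def features(units, i):
    """Context features for position i: (previous unit, unit, next unit)."""
    prev_unit = units[i - 1] if i > 0 else "<s>"
    next_unit = units[i + 1] if i + 1 < len(units) else "</s>"
    return (prev_unit, units[i], next_unit)


def train(pairs):
    """Count target units per full context and per source unit alone."""
    by_context = defaultdict(Counter)
    by_unit = defaultdict(Counter)
    for name in pairs:
        units = [src for src, _ in name]
        for i, (src, tgt) in enumerate(name):
            by_context[features(units, i)][tgt] += 1
            by_unit[src][tgt] += 1
    return by_context, by_unit


def transliterate(model, units):
    """Classify each source unit independently and join the predicted labels."""
    by_context, by_unit = model
    out = []
    for i, unit in enumerate(units):
        feat = features(units, i)
        if feat in by_context:        # exact context seen in training
            out.append(by_context[feat].most_common(1)[0][0])
        elif unit in by_unit:         # back off to the unit alone
            out.append(by_unit[unit].most_common(1)[0][0])
        else:                         # unknown source unit
            out.append("?")
    return "".join(out)


MODEL = train(TRAIN)
print(transliterate(MODEL, ["ক", "ম", "ল"]))  # kamal
print(transliterate(MODEL, ["র", "মা"]))      # rama
```

In this toy data the same source unit maps to different target units depending on its position ("ra" at the start of one name, "r" elsewhere), which is why the sketch includes neighbouring units as features. A real system would replace the lookup with a trained multi-class classifier and would need a separate step to segment and align names into transliteration units.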