Title: An ensemble of grapheme and phoneme-based models for automatic English to Kannada back-transliteration
Authors: B.S. Sowmya Lakshmi; B.R. Shambhavi
Addresses: Department of Information Science and Engineering, BMS College of Engineering, Bangalore, India; affiliated to Visvesvaraya Technological University, Belagavi, Karnataka, India (both authors)
Abstract: Machine transliteration is the task of mapping the graphemes or phonemes of one language into phoneme approximations of another language. This paper proposes three machine transliteration approaches, a grapheme-based model, a phoneme-based model and a hybrid model, for back-transliteration of Romanised Kannada words into native Kannada script. Kannada is a resource-poor language. A bilingual corpus of around 3 lakh (300,000) words is built, comprising pairs of Romanised Kannada words and their corresponding words in Kannada script. The models are evaluated on 3,000 Romanised Kannada test words. The hybrid model achieved the highest accuracy of the three, 85.93%.
Keywords: transliteration; bilingual corpus; rule-based approach; statistical approach.
DOI: 10.1504/IJISC.2021.113314
International Journal of Intelligence and Sustainable Computing, 2021 Vol.1 No.2, pp.138 - 150
Received: 04 Dec 2018
Accepted: 17 Jan 2019
Published online: 26 Feb 2021