Title: Projecting dependency syntax labels from English into Vietnamese in English-Vietnamese bilingual corpus

Authors: Phuoc Tran; Van-Deo Duong; Dien Dinh; Bay Vo; Huu Nguyen; Long H.B. Nguyen

Addresses: NLP-KD Lab, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam; Faculty of Information Technology, VNU-HCM University of Science, Ho Chi Minh City, Vietnam; Faculty of Information Technology, VNU-HCM University of Science, Ho Chi Minh City, Vietnam; Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Vietnam; Faculty of Information Technology, Ho Chi Minh City University of Food Industry, Ho Chi Minh City, Vietnam; Faculty of Information Technology, VNU-HCM University of Science, Ho Chi Minh City, Vietnam

Abstract: In natural language processing, corpora play an important role, particularly labelled corpora such as corpora labelled with parts of speech, component syntax, or dependency syntax. These labelled corpora are used for corpus-based research and give higher-quality results than unlabelled ones. In this paper, we build a Vietnamese dependency label tagger based on an English-Vietnamese bilingual corpus in which the English side was tagged with dependency labels. The experimental results show that our method achieves a labelled attachment score (LAS) of 73.5% and an unlabelled attachment score (UAS) of 81.7%.
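
The LAS and UAS figures above are attachment scores: UAS counts tokens whose predicted head is correct, and LAS counts tokens whose predicted head and dependency label are both correct. A minimal sketch of how such scores are commonly computed is given below. The function name `attachment_scores`, the token representation, and the toy sentence are illustrative assumptions, not taken from the paper, and the paper's own evaluation setup (for example its treatment of punctuation) may differ.

```python
from typing import List, Tuple

# Assumed representation: one (head_index, dependency_label) pair per token,
# with head index 0 standing for the artificial root.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], predicted: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over one aligned token sequence."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS: the predicted head matches the gold head.
    head_hits = sum(1 for g, p in zip(gold, predicted) if g[0] == p[0])
    # LAS: both the head and the dependency label match.
    labelled_hits = sum(1 for g, p in zip(gold, predicted) if g == p)
    total = len(gold)
    return 100.0 * head_hits / total, 100.0 * labelled_hits / total


# Hypothetical four-token sentence, for illustration only.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (3, "nmod")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)
```

In this toy case three of the four heads are correct and two of the four head-label pairs are correct, so LAS can never exceed UAS, which is consistent with the ordering of the two figures reported in the abstract.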

Keywords: natural language processing; projecting dependency syntax; English-Vietnamese bilingual corpus.

DOI: 10.1504/IJIIDS.2020.108212

International Journal of Intelligent Information and Database Systems, 2020 Vol.13 No.1, pp.17 - 32

Received: 09 May 2019
Accepted: 28 Aug 2019

Published online: 06 Jul 2020
