Title: Naxi sentence similarity calculation based on improved chunking edit-distance

Authors: Huihui Zhang; Zhengtao Yu; Longhua Shen; Jianyi Guo; Xudong Hong

Addresses: School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming, China ' School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming, China ' China Research and Development Academy of Machinery Equipment, Beijing, China ' The School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, China ' The School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, China

Abstract: This paper proposes a method for calculating Naxi sentence similarity that is tailored to the characteristics of the Naxi language. First, because verbs in Naxi are postposed and nouns and verbs appear in chunks, Naxi NP and VP chunks are defined and chunking rules are extracted. These rules are then applied to Naxi sentences to extract NP chunks, VP chunks, and so on. Next, each Naxi word is mapped to a Chinese word using a Naxi-Chinese dictionary, and the semantic similarity of Naxi words is calculated from Chinese word similarity. Chunk similarity is calculated by combining the Chinese word similarities. Chunk similarity is then used to define the replacement cost of the chunk edit operation, and Naxi sentence similarity is computed from this replacement cost. Finally, experiments on Naxi sentence similarity are reported. The results show that the proposed method outperforms other methods, and that the chunk exchange method can effectively improve the accuracy of Naxi sentence similarity calculation.
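
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the rule for combining word similarities into a chunk similarity (a symmetric best-match average), the unit insertion and deletion costs, the normalization by the longer sentence, and all function names below are assumptions made for illustration. The `word_sim` argument stands in for a Chinese word similarity measure applied after dictionary mapping; `exact_match` is only a toy placeholder for it, and the chunk exchange operation is not modeled.

```python
from typing import Callable, List, Sequence

Chunk = Sequence[str]  # one NP/VP chunk: words already mapped through the dictionary
WordSim = Callable[[str, str], float]  # hypothetical word similarity in [0, 1]


def exact_match(a: str, b: str) -> float:
    """Toy stand-in for a real Chinese word similarity measure."""
    return 1.0 if a == b else 0.0


def chunk_similarity(a: Chunk, b: Chunk, word_sim: WordSim) -> float:
    """Assumed combination rule: symmetric average of best-match word similarities."""
    if not a or not b:
        return 0.0
    a_to_b = sum(max(word_sim(x, y) for y in b) for x in a) / len(a)
    b_to_a = sum(max(word_sim(y, x) for x in a) for y in b) / len(b)
    return (a_to_b + b_to_a) / 2.0


def chunk_edit_distance(s: List[Chunk], t: List[Chunk], word_sim: WordSim) -> float:
    """Edit distance over chunks; replacement cost is 1 - chunk similarity."""
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)  # delete i chunks (assumed unit cost)
    for j in range(1, n + 1):
        d[0][j] = float(j)  # insert j chunks (assumed unit cost)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            replace = 1.0 - chunk_similarity(s[i - 1], t[j - 1], word_sim)
            d[i][j] = min(
                d[i - 1][j] + 1.0,          # deletion
                d[i][j - 1] + 1.0,          # insertion
                d[i - 1][j - 1] + replace,  # replacement weighted by similarity
            )
    return d[m][n]


def sentence_similarity(s: List[Chunk], t: List[Chunk], word_sim: WordSim) -> float:
    """Assumed normalization: 1 - distance / length of the longer chunk sequence."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - chunk_edit_distance(s, t, word_sim) / longest
```

For example, with two placeholder sentences already chunked and mapped, `sentence_similarity([["w1", "w2"], ["w3"]], [["w1", "w2"], ["w4"]], exact_match)` treats the first chunk pair as a free replacement and the second as a full-cost replacement. Swapping `exact_match` for a graded word similarity makes near-synonymous chunks cheap to replace, which is the effect the abstract attributes to the similarity-based replacement cost.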

Keywords: Naxi language; sentence similarity; sentence chunking; edit distance; semantic similarity; Chinese; chunk exchange.

DOI: 10.1504/IJWMC.2014.058872

International Journal of Wireless and Mobile Computing, 2014 Vol.7 No.1, pp.48 - 53

Available online: 14 Jan 2014
