Title: Double array structures based on byte segmentation for n-gram
Authors: Masao Fuketa; Kazuhiro Morita; Jun-ichi Aoe
Addresses: Department of Information Science and Intelligent Systems, University of Tokushima, Minami Josanjima Tokushima-shi, Tokushima, 770-8506, Japan ' Department of Information Science and Intelligent Systems, University of Tokushima, Minami Josanjima Tokushima-shi, Tokushima, 770-8506, Japan ' Department of Information Science and Intelligent Systems, University of Tokushima, Minami Josanjima Tokushima-shi, Tokushima, 770-8506, Japan
Abstract: Retrieving keywords is used in many applications, and the most demands for retrieving keywords are speed and compactness. The double array is one of the retrieval methods, and combines fast retrieval speed with compactness. In previous researches for the double array, one-by-one-byte character in the key traversed in the trie. Therefore, this paper proposes retrieval algorithms and data structures for the double array method traversed every n byte. Moreover, the proposed method is applied to the compact double array which is the compression method of the double array. From experimental observations, the size of the proposed method becomes 41% compared with that of the original double array and 63% compared with that of the compact double array. The retrieval speed of the proposed method becomes 1.2 times faster than the original double array.
Keywords: trie; n-gram; compression; keyword retrieval; double array structures; byte segmentation; retrieval algorithms; data structures.
DOI: 10.1504/IJCAT.2015.071971
International Journal of Computer Applications in Technology, 2015 Vol.52 No.2/3, pp.110 - 116
Published online: 26 Sep 2015 *
Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article