Title: A new compression method for double-array structures by a hierarchical representation

Authors: Shunsuke Kanda; Masao Fuketa; Kazuhiro Morita; Akio Tomotoshi; Jun-ichi Aoe

Addresses: Department of Information Science and Intelligent Systems, Tokushima University, Minamijosanjima 2-1, Tokushima 770-8506, Japan ' Department of Information Science and Intelligent Systems, Tokushima University, Minamijosanjima 2-1, Tokushima 770-8506, Japan ' Department of Information Science and Intelligent Systems, Tokushima University, Minamijosanjima 2-1, Tokushima 770-8506, Japan ' Department of Information Science and Intelligent Systems, Tokushima University, Minamijosanjima 2-1, Tokushima 770-8506, Japan ' Department of Information Science and Intelligent Systems, Tokushima University, Minamijosanjima 2-1, Tokushima 770-8506, Japan

Abstract: To store and retrieve keyword sets, a trie that is a tree structure is utilised in many applications for processing strings. The double-array and level-order unary degree sequence (LOUDS) efficiently represent the trie. The double-array provides fast retrieval for the trie, but its space usage is not so compact. On the other hand, LOUDS represents the trie compactly, but its retrieval speed is not so fast. This paper presents a new compression method for the double-array. Our new method represents the double-array by a hierarchical structure and changes allocations of the double-array. Theoretical observations show that the new method reduces the space usage of the double-array to ∼60%. Moreover, experimental results for English keywords show that the new method reduces the space usage of the double-array to ∼60-62% without impairing the high-speed performance. The retrieval speed of the new method is ∼17-24 times faster than that of LOUDS.

Keywords: trie; double-array structures; LOUDS; level-order unary degree sequence; data structure; keyword retrieval; compression method; hierarchical representation; string processing; keyword storage; keyword retrieval; tree structures; English keywords.

DOI: 10.1504/IJISTA.2015.074331

International Journal of Intelligent Systems Technologies and Applications, 2015 Vol.14 No.3/4, pp.221 - 236

Received: 11 May 2015
Accepted: 22 Oct 2015

Published online: 22 Jan 2016 *

Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article