Title: Space-efficient multiple string matching automata

Authors: Meng Zhang; Tianyu Yang; Rui Wu

Addresses: College of Computer Science and Technology, Jilin University, Changchun, China; College of Computer Science and Technology, Jilin University, Changchun, China; College of Computer Science and Technology, Jilin University, Changchun, China

Abstract: The Aho-Corasick (AC) automaton is a data structure for multiple string matching. We present two compression methods that enable the AC automaton to run on systems with limited resources, such as mobile devices. With the first method, the AC automaton for a pattern set P over an alphabet of size σ needs (σ + 1)I + (1 + log|P| + log M)M + o(M) bits, where M and I are the number of states and the number of non-leaf states of the AC automaton, respectively, and a state transition takes O(1) time. With the second method, the space is I + (1 + log|P| + log M + log σ)M + o(M log σ) bits, and a state transition takes O(log log σ) time. We then combine the two methods to achieve trade-offs between space and time complexity.
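
To make the quantities in the abstract concrete, the following minimal sketch builds a plain, uncompressed AC automaton, counts M and I, and evaluates the leading terms of the two space bounds. It is not the paper's compressed representation. The function names (build_ac, search, space_bits) are illustrative, and several points are assumptions not stated in the abstract: logarithms are taken base 2, |P| is read as the number of patterns, a leaf is a state with no outgoing trie edge, and the o(·) terms are ignored.

```python
import math
from collections import deque


def build_ac(patterns):
    """Build a plain (uncompressed) Aho-Corasick automaton."""
    goto = [{}]   # goto[s][c] -> child of state s on character c (trie edges)
    out = [[]]    # out[s] -> indices of patterns recognised at state s
    for idx, pat in enumerate(patterns):
        s = 0
        for c in pat:
            if c not in goto[s]:
                goto[s][c] = len(goto)
                goto.append({})
                out.append([])
            s = goto[s][c]
        out[s].append(idx)

    # Failure links, computed breadth-first; depth-1 states fail to the root.
    fail = [0] * len(goto)
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for c, t in goto[s].items():
            f = fail[s]
            while f and c not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(c, 0)
            out[t] = out[t] + out[fail[t]]  # inherit matches ending here
            queue.append(t)
    return goto, fail, out


def search(text, patterns, automaton):
    """Return (start position, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    s, hits = 0, []
    for i, c in enumerate(text):
        while s and c not in goto[s]:
            s = fail[s]
        s = goto[s].get(c, 0)
        for idx in out[s]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits


def space_bits(goto, num_patterns, sigma):
    """Leading terms of the two bounds (o(.) terms ignored, log base 2 assumed)."""
    M = len(goto)                              # number of states
    I = sum(1 for edges in goto if edges)      # number of non-leaf states
    per_state = 1 + math.log2(num_patterns) + math.log2(M)
    first = (sigma + 1) * I + per_state * M
    second = I + (per_state + math.log2(sigma)) * M
    return M, I, first, second


patterns = ["he", "she", "his", "hers"]
automaton = build_ac(patterns)
print(search("ushers", patterns, automaton))
print(space_bits(automaton[0], len(patterns), sigma=256))
```

Under these assumptions, the first bound is dominated by the (σ + 1)I term for a byte alphabet, while the second replaces that term with roughly log σ bits per state at the cost of a slower transition.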

Keywords: multiple string matching; Aho-Corasick automaton; succinct data structure; mobile devices; string matching automata.

DOI: 10.1504/IJWMC.2012.047983

International Journal of Wireless and Mobile Computing, 2012 Vol.5 No.3, pp.308 - 313

Received: 09 Apr 2012
Accepted: 15 Apr 2012

Published online: 11 Jan 2015
