Title: Efficient dictionary matching by Aho-Corasick automata of truncated patterns

Authors: Meng Zhang; Jiashu Fan; Dequan Chen

Addresses: College of Computer Science and Technology, Jilin University, Changchun, China ' College of Computer Science and Technology, Jilin University, Changchun, China ' College of Computer Science and Technology, Jilin University, Changchun, China

Abstract: We present a space-efficient data structure for dictionary matching. We truncate patterns to truncated patterns where symbols are ℓ-length substrings of the pattern. By employing the AC automaton of truncated patterns and that of ℓ-length substrings, we simulate the AC automaton of the original pattern set. The new structure is space economical as we apply the prefix merging to substrings of patterns. Using this structure, the dictionary matching runs in O(n log k + tocc log k + occ) time where n is the length of the text, k the number of patterns, occ the number of occurrences of patterns in the text, and tocc the number of occurrences of strings that are longest prefix of each pattern with length of a multiple of ℓ.

Keywords: dictionary matching; Aho-Corasick automaton; truncated patterns.

DOI: 10.1504/IJCSM.2016.078738

International Journal of Computing Science and Mathematics, 2016 Vol.7 No.4, pp.323 - 329

Received: 27 Apr 2016
Accepted: 06 May 2016

Published online: 01 Sep 2016 *

Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article