Title: An efficient algorithm for mining sequential generator pattern using prefix trees and hash tables

Authors: Thi-Thiet Pham; Jiawei Luo; Tzung-Pei Hong

Addresses: School of Information Science and Engineering, Hunan University, Changsha City, Hunan Province, 410082, China; Faculty of Information Technology, Industrial University of Ho Chi Minh City, 12 Nguyen Van Bao, Go Vap, Ho Chi Minh City 70559, Vietnam ' School of Information Science and Engineering, Hunan University, Changsha City, Hunan Province, 410082, China ' Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan; Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan

Abstract: Mining long frequent sequences that contain a combinatorial number of frequent subsequences, or using very low support thresholds to mine sequential patterns, is both time- and memory-consuming. The mining of closed sequential patterns, sequential generator patterns, and maximal sequences has been proposed to overcome this problem. This paper proposes an algorithm for generating all sequential generator patterns. The algorithm uses a vertical approach to enumerating sequences and counting their support, based on prime block encoding, to represent candidate sequences and determine the frequency of each candidate. The search space of the proposed algorithm is much smaller than those of other algorithms because super-sequence frequency-based pruning and non-generator-based pruning are applied. In addition, hash tables are used to quickly check previously found sequential generator patterns. Experiments conducted on synthetic and real databases show that the proposed algorithm is effective.

Keywords: sequential patterns; sequential generator patterns; prefix trees; hash tables; pattern mining; data mining.

DOI: 10.1504/IJISTA.2014.065151

International Journal of Intelligent Systems Technologies and Applications, 2014 Vol.13 No.3, pp.157 - 169

Received: 31 Dec 2012
Accepted: 04 Apr 2013

Published online: 15 Oct 2014
