Title: Mining frequent itemsets over uncertain data streams

Authors: Huiting Liu; Kaishen Zhou; Peng Zhao; Sheng Yao

Addresses: School of Computer Science and Technology, Anhui University, Hefei 230601, China ' School of Computer Science and Technology, Anhui University, Hefei 230601, China ' School of Computer Science and Technology, Anhui University, Hefei 230601, China ' School of Computer Science and Technology, Anhui University, Hefei 230601, China

Abstract: In recent years, mining frequent itemsets over uncertain data streams has attracted much attention because of its wide application in sensor network monitoring, RFID, moving object search and location-based services (LBS). However, existing hyper-structure-based algorithms cannot achieve high mining accuracy. In this paper, we present two sliding-window-based, false-positive-oriented algorithms, called uncertain data stream frequent itemsets mining (UFIM) and UFIMTopK, to efficiently find threshold-based and rank-based frequent itemsets from uncertain data streams. UFIM uses a global GT-tree to maintain the frequent itemsets in the sliding window and outputs them when needed. In addition, an efficient deletion strategy is designed to reduce time overhead. UFIMTopK, a modification of UFIM, is designed to find the top-k frequent itemsets. Experimental results show that the proposed UFIM algorithm obtains higher mining accuracy than previous algorithms on synthetic and real-life datasets.

Keywords: frequent itemsets; uncertain data streams; sliding window; threshold-based.

DOI: 10.1504/IJHPCN.2018.093234

International Journal of High Performance Computing and Networking, 2018 Vol.11 No.4, pp.312 - 321

Received: 05 Nov 2015
Accepted: 13 Apr 2016

Published online: 24 Jul 2018
