Title: A model of mining approximate frequent itemsets using rough set theory

Authors: Xiaomei Yu; Jun Zhao; Hong Wang; Xiangwei Zheng; Xiaoyan Yan

Addresses: Xiaomei Yu, Jun Zhao, Hong Wang and Xiangwei Zheng: Institute of Information and Engineer, Shandong Normal University, China; Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, China; Institute of Life Sciences, Shandong Normal University, Jinan 250014, Shandong, China. Xiaoyan Yan: Hospital of Shandong University of Traditional Chinese Medicine, Jinan 250011, Shandong, China

Abstract: Datasets can be described by decision tables. In real-life applications, data are often incomplete and uncertain, which poses significant challenges for mining frequent itemsets in imprecise databases. This paper presents a novel model for mining approximate frequent itemsets using rough set theory. With a transactional information system constructed on the dataset under consideration, a transactional decision table is proposed; lower and upper approximations of support then become available and can be computed easily from the indiscernibility relations. Finally, the approximate frequent itemsets are discovered using a divide-and-conquer strategy that takes into account the support-based accuracy and coverage defined in the model. The model is evaluated on both synthetic datasets and real-life applications, and the experimental results demonstrate its usability and validity.
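
For readers unfamiliar with the rough-set notions the abstract mentions, the following minimal sketch shows how classical lower and upper approximations of a set follow from an indiscernibility relation. It is not the paper's model of approximate support: the toy table, the attribute names (`bread`, `milk`), the target set and the function names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(table, attributes):
    """Group object ids whose values agree on every attribute in `attributes`."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the classical rough-set (lower, upper) approximations of `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical toy transactions (1 = item present, 0 = item absent).
table = {
    "t1": {"bread": 1, "milk": 1},
    "t2": {"bread": 1, "milk": 1},
    "t3": {"bread": 1, "milk": 0},
    "t4": {"bread": 0, "milk": 0},
}
target = {"t1", "t3"}  # hypothetical set of transactions of interest

lower, upper = approximations(table, ["bread", "milk"], target)
accuracy = len(lower) / len(upper)  # classical rough-set accuracy measure
```

Here `t1` and `t2` are indiscernible, so `t1` cannot be placed in the target set with certainty: the lower approximation contains only `t3`, while the upper approximation contains `t1`, `t2` and `t3`.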

Keywords: rough set theory; RST; data mining; decision table; approximate frequent itemsets; AFIs; indiscernibility relation.

DOI: 10.1504/IJCSE.2019.099640

International Journal of Computational Science and Engineering, 2019 Vol.19 No.1, pp.71 - 82

Received: 09 May 2016
Accepted: 25 Sep 2016

Published online: 02 May 2019
