Authors: Ricardo Sousa; Fatima Rodrigues
Addresses: Instituto Superior de Engenharia do Porto, Porto, Portugal; GECAD – Knowledge Engineering and Decision Support Group, Institute of Engineering – Polytechnic of Porto, Porto, Portugal
Abstract: Many existing association rule algorithms rely on support-based pruning to reduce the combinatorial search space. This strategy is not effective for discovering interesting patterns: low support thresholds generate too many rules, many of which involve poorly correlated items with very different support levels, while high support thresholds generate very few rules, most of them trivial. In this paper we describe MIRF, an algorithm for mining association rules with both rare and frequent items. The algorithm does not require the minimum support to be specified in advance. Rather, in each iteration it generates all possible item sets and extracts only those that are positively correlated, yielding a rule set that is smaller, easier to interpret, and contains both frequent and rare items. In addition, only the most relevant rule is extracted from each item set, which significantly reduces both the time required for mining and the number of rules generated. An experimental evaluation of the algorithm on several databases is presented.
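The general idea in the abstract (enumerate item sets, keep the positively correlated ones, emit one rule per item set) can be illustrated with a minimal sketch. This is not the MIRF algorithm itself: the abstract does not name the correlation measure or the relevance criterion, so the sketch assumes lift greater than 1 for "positively correlated" and highest confidence for "most relevant". All function names and the `max_size` cap are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def lift(itemset, transactions):
    """Joint support over the product of single-item supports.

    Assumption: lift stands in for the paper's correlation measure,
    which the abstract does not specify.
    """
    expected = 1.0
    for item in itemset:
        expected *= support(frozenset([item]), transactions)
    return support(itemset, transactions) / expected if expected > 0 else 0.0


def best_rule(itemset, transactions):
    """Return the highest-confidence rule as (antecedent, consequent, confidence).

    Assumption: confidence stands in for the paper's relevance criterion.
    """
    joint = support(itemset, transactions)
    best = None
    for k in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), k):
            antecedent = frozenset(antecedent)
            confidence = joint / support(antecedent, transactions)
            if best is None or confidence > best[2]:
                best = (antecedent, itemset - antecedent, confidence)
    return best


def mine_rules(transactions, max_size=3):
    """Enumerate item sets size by size with no minimum-support threshold."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            # Keep only positively correlated item sets (lift > 1 implies
            # the item set occurs at least once), one rule per item set.
            if lift(itemset, transactions) > 1.0:
                rules.append(best_rule(itemset, transactions))
    return rules
```

On a toy basket list where `bread` and `milk` are frequent and `caviar` and `champagne` co-occur only once, `mine_rules(baskets, max_size=2)` keeps a rule for each pair, since no support threshold discards the rare one. Exhaustive enumeration is exponential in the number of items, so the sketch is only suitable for small examples.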
Keywords: data mining; association rules mining; frequent items; rare items.
International Journal of Knowledge Engineering and Data Mining, 2013 Vol.2 No.4, pp.237 - 247
Available online: 17 Feb 2014