Title: Detecting data inconsistencies by multiple target rules

Authors: Kalaivany Natarajan; Jiuyong Li; Andy Koronios

Addresses: School of Computer and Information Science, University of South Australia, Mawson Lakes, South Australia 5095, Australia; School of Computer and Information Science, University of South Australia, Mawson Lakes, South Australia 5095, Australia; School of Computer and Information Science, University of South Australia, Mawson Lakes, South Australia 5095, Australia

Abstract: Data quality problems are common in large databases, and data inconsistency is one of the main ones. Data mining techniques can be used to predict inconsistent values, and association rule mining is one of the main techniques: association rules identify relationships between attribute values and can therefore be used to detect values that violate those relationships. In this paper, we use multiple target rules to identify inconsistent values. Multiple target rules extend association rules by using a set of disjunctive attribute values as the consequent. Traditional association rules predict inconsistent values with a right-hand side (RHS) that is either a single value or a conjunction of values, and their coverage is limited by the high confidence requirement. We propose extending the RHS to a disjunction of multiple values. The resulting multiple disjunctive rules have greater coverage and higher predictive power than traditional association rules.
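
The difference between a single-target rule and a rule with a disjunctive RHS can be illustrated with a small sketch. It is not the authors' algorithm: the toy records, the attribute names `job` and `dept`, the 0.8 confidence threshold, and the helper functions are all hypothetical, chosen only to show how a disjunctive consequent can meet a high confidence requirement that no single value meets.

```python
from typing import Dict, List, Set

Record = Dict[str, str]

def matches(record: Record, lhs: Record) -> bool:
    """True if the record agrees with every attribute value in the rule's LHS."""
    return all(record.get(attr) == value for attr, value in lhs.items())

def confidence(records: List[Record], lhs: Record,
               rhs_attr: str, rhs_values: Set[str]) -> float:
    """Fraction of LHS-matching records whose rhs_attr is ANY value in rhs_values.

    A one-element rhs_values gives an ordinary single-target rule;
    several elements give a disjunctive (multiple target) rule.
    """
    covered = [r for r in records if matches(r, lhs)]
    if not covered:
        return 0.0
    hits = sum(1 for r in covered if r.get(rhs_attr) in rhs_values)
    return hits / len(covered)

def flag_inconsistent(records: List[Record], lhs: Record,
                      rhs_attr: str, rhs_values: Set[str]) -> List[int]:
    """Indices of records that match the LHS but violate the RHS."""
    return [i for i, r in enumerate(records)
            if matches(r, lhs) and r.get(rhs_attr) not in rhs_values]

# Hypothetical toy data: nurses work in surgery or pediatrics; one record is suspect.
records = (
    [{"job": "nurse", "dept": "surgery"} for _ in range(5)]
    + [{"job": "nurse", "dept": "pediatrics"} for _ in range(4)]
    + [{"job": "nurse", "dept": "finance"}]
)

MIN_CONFIDENCE = 0.8  # assumed threshold, for illustration only
lhs = {"job": "nurse"}

single = confidence(records, lhs, "dept", {"surgery"})               # 0.5
multi = confidence(records, lhs, "dept", {"surgery", "pediatrics"})  # 0.9

# Only the disjunctive rule passes the threshold, so only it can be used for detection.
suspects = (flag_inconsistent(records, lhs, "dept", {"surgery", "pediatrics"})
            if multi >= MIN_CONFIDENCE else [])
print(single, multi, suspects)
```

In this toy setting neither `nurse -> surgery` nor `nurse -> pediatrics` reaches the threshold on its own, so the records they describe would go uncovered; the disjunctive rule covers them and flags the one record that falls outside both values.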

Keywords: data cleaning; data mining; association rules; multiple target rules; data inconsistency; data quality; rule mining.

DOI: 10.1504/IJBSR.2012.047928

International Journal of Business and Systems Research, 2012 Vol.6 No.3, pp.296 - 312

Published online: 14 Nov 2014
