Authors: Joseph Fong, Shi-Ming Huang, Hsiang-Yuan Hsueh
Addresses: Computer Science Department, City University of Hong Kong, Hong Kong; Accounting and Information Technology Department, National Chung Cheng University, Chia-Yi, Taiwan; Information Management Department, National Chung Cheng University, Chia-Yi, Taiwan
Abstract: In data mining, target data selection is important. To derive effective business rules in knowledge discovery in databases, the "garbage in, garbage out" symptom must be avoided. The Chi-Square test is useful for eliminating, before data mining, data that are irrelevant because of wrong degrees of freedom, untested hypotheses, inconsistent estimation, inefficient methods, data redundancy, overdue data, and data heterogeneity. This paper offers an online analytical processing (OLAP) method to derive association rules from data filtered by the Chi-Square test. The process applies frame metadata to trigger Chi-Square testing whenever the source data are updated, so that rules are derived continuously.
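The filtering idea in the abstract can be illustrated with a minimal sketch of a Pearson Chi-Square test of independence on a contingency table of counts. This is not the authors' implementation: the function names, the example counts, and the fixed 0.05 significance level are assumptions made for illustration, and the paper's frame metadata and OLAP steps are not modelled here.

```python
# Illustrative sketch only; names and data are hypothetical, not from the paper.

def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table of counts.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two attributes were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Critical value of the chi-square distribution for 1 degree of freedom
# at the 0.05 significance level (applies to 2 x 2 tables only).
CRITICAL_0_05_DF1 = 3.841


def is_relevant(table, critical=CRITICAL_0_05_DF1):
    """Keep an attribute only if independence from the target is rejected."""
    return chi_square_statistic(table) > critical


# Hypothetical counts: rows = attribute value, columns = target outcome.
associated = [[30, 10], [10, 30]]
unrelated = [[20, 20], [20, 20]]

print(is_relevant(associated))  # attribute would be kept for rule mining
print(is_relevant(unrelated))   # attribute would be filtered out
```

In this sketch an attribute is passed on to association rule mining only when the test rejects independence from the target; a larger table would need the critical value for its own degrees of freedom, (r - 1)(c - 1).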
Keywords: chi-squared test; online analytical processing; OLAP; association rules; frame metadata; data mining; target data selection; business rules; knowledge discovery; irrelevant data.
International Journal of Business Intelligence and Data Mining, 2007 Vol.2 No.3, pp.311 - 327
Available online: 19 Oct 2007