Title: Problems and systematic solutions in data quality
Authors: Xingsen Li, Lingling Zhang, Peng Zhang, Yong Shi
Addresses: Management School, Ningbo Institute of Technology, Zhejiang University, Ningbo, 315100, P.R. China. School of Management, Graduate University of the Chinese Academy of Sciences, Beijing, 100190, P.R. China; Research Centre on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, 100190, P.R. China. School of Information Science and Engineering, Graduate University of the Chinese Academy of Sciences, Beijing, 100190, P.R. China; Research Centre on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, 100190, P.R. China. Research Centre on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, 100190, P.R. China; College of Information Science and Technology, University of Nebraska at Omaha, Omaha NE 68182, USA
Abstract: Data are important for making decisions, and the quality of the data affects the quality of those decisions. Data mining, one of the most important sources of knowledge, needs high-quality data to mine, but data of sufficient quality are often lacking. By systematically analysing the causes of low data quality in data mining, we found that general methods for improving data quality through data cleaning are not enough. A new method for improving data quality, called data mining consulting, has been established. It defines data quality more broadly, from the viewpoint of the data mining customer, finds potential data quality problems at an earlier stage, and solves them with a series of methods including software techniques, data mining principles and management rules such as ISO 9000. Its application in a web company shows that it is practical and can improve data quality from the very beginning.
Keywords: data quality; data mining consulting; information systems; ISO 9000; quality management; quality standards.
DOI: 10.1504/IJSSCI.2009.021961
International Journal of Services Sciences, 2009 Vol.2 No.1, pp.53 - 69
Published online: 11 Dec 2008