Int. J. of Information Technology and Management   »   2015 Vol.14, No.2/3

 

 

Title: A novel approach for imputation of missing continuous attribute values in databases using genetic algorithm

 

Authors: R. Devi Priya; S. Kuppuswami

 

Addresses:
Kongu Engineering College, Erode-638 052, Tamil Nadu, India
Kongu Engineering College, Erode-638 052, Tamil Nadu, India

 

Abstract: Missing values are common in databases and, if left untreated, distort estimates. Researchers have developed numerous methods to replace missing values in continuous attributes, but the simple methods are less efficient and the efficient methods are very complex to implement. To balance simplicity and efficiency, a new method called the Bayesian genetic algorithm (BGA) is proposed, based on a genetic algorithm and Bayes' theorem, for both the missing at random (MAR) and missing completely at random (MCAR) assumptions. The accuracy of BGA in estimating missing values is compared with that of mean, kNN and multiple imputation. BGA produces more accurate results than the other methods on the four datasets studied, at missingness rates ranging from 5% to 60%. BGA also performs better on large datasets, resulting in less biased estimates.
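
The comparison described in the abstract relies on hiding known values, imputing them, and scoring the estimates. The sketch below illustrates that evaluation loop for the simplest baseline only (mean imputation under MCAR). It is not the authors' BGA; the function names, the synthetic Gaussian attribute, and the use of RMSE as the accuracy measure are all assumptions made for illustration.

```python
import math
import random

def mask_mcar(values, rate, rng):
    """Hide a fraction `rate` of entries completely at random (MCAR)."""
    n_missing = round(rate * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]

def impute_mean(masked):
    """Replace each missing entry with the mean of the observed entries."""
    observed = [v for v in masked if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in masked]

def rmse_on_missing(truth, masked, imputed):
    """Root-mean-square error over the entries that were hidden."""
    errs = [(t - p) ** 2 for t, m, p in zip(truth, masked, imputed) if m is None]
    return math.sqrt(sum(errs) / len(errs))

rng = random.Random(0)
# Synthetic continuous attribute (assumption; not one of the paper's datasets).
truth = [rng.gauss(50.0, 10.0) for _ in range(200)]
for rate in (0.05, 0.30, 0.60):  # endpoints match the 5%-60% range in the abstract
    masked = mask_mcar(truth, rate, rng)
    imputed = impute_mean(masked)
    print(f"rate={rate:.2f}  rmse={rmse_on_missing(truth, masked, imputed):.2f}")
```

Any other imputation method could be scored the same way by swapping out `impute_mean`, which is how a baseline-versus-proposed comparison of this kind is typically set up.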

 

Keywords: continuous attributes; missing values; Bayesian genetic algorithms; BGA; missing at random; MAR; missing completely at random; MCAR; databases.

 

DOI: 10.1504/IJITM.2015.068461

 

Int. J. of Information Technology and Management, 2015 Vol.14, No.2/3, pp.185-200

 

Available online: 18 Mar 2015

 

 
