Int. J. of Information Technology and Management   »   2015 Vol.14, No.2/3



Title: A novel approach for imputation of missing continuous attribute values in databases using genetic algorithm


Authors: R. Devi Priya; S. Kuppuswami


Kongu Engineering College, Erode-638 052, Tamil Nadu, India


Abstract: Missing values are common in databases and, if left untreated, distort estimates. Researchers have developed numerous methods to replace missing values in continuous attributes, but the simple methods are less efficient and the efficient methods are very complex to implement. To balance simplicity and efficiency, a new method called the Bayesian genetic algorithm (BGA) is proposed, based on a genetic algorithm and Bayes' theorem, for both the missing at random (MAR) and missing completely at random (MCAR) assumptions. The accuracy of BGA in estimating missing values is compared with that of mean imputation, kNN imputation and multiple imputation. BGA produces more accurate results than the other methods on the four datasets studied, at rates of missingness ranging from 5% to 60%. BGA also works better even on large datasets, resulting in less biased estimates.
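To make the comparison setting concrete, the following is a minimal sketch of how one baseline from the abstract, mean imputation, might be evaluated under an MCAR mechanism at several rates of missingness. It is not the BGA method and not the authors' experimental code. The synthetic Gaussian attribute, the helper names (`mask_mcar`, `impute_mean`, `rmse`) and the choice of RMSE as the accuracy measure are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def mask_mcar(values, rate, rng):
    """Hide a fraction `rate` of entries uniformly at random (MCAR)."""
    n_missing = round(rate * len(values))
    hidden = set(rng.sample(range(len(values)), n_missing))
    masked = [None if i in hidden else v for i, v in enumerate(values)]
    return masked, hidden


def impute_mean(masked):
    """Replace each missing entry with the mean of the observed entries."""
    observed = [v for v in masked if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in masked]


def rmse(truth, imputed, hidden):
    """Root-mean-square error over the entries that were hidden."""
    total = sum((truth[i] - imputed[i]) ** 2 for i in hidden)
    return math.sqrt(total / len(hidden))


rng = random.Random(0)
# Hypothetical continuous attribute; the paper's datasets are not reproduced here.
truth = [rng.gauss(50.0, 10.0) for _ in range(200)]

for rate in (0.05, 0.30, 0.60):
    masked, hidden = mask_mcar(truth, rate, rng)
    filled = impute_mean(masked)
    print(f"rate={rate:.0%}  RMSE={rmse(truth, filled, hidden):.2f}")
```

Because the mask is drawn independently of the data, this only models MCAR. A MAR experiment would instead make the probability of hiding a value depend on other observed attributes.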


Keywords: continuous attributes; missing values; Bayesian genetic algorithms; BGA; missing at random; MAR; missing completely at random; MCAR; databases.


DOI: 10.1504/IJITM.2015.068461


Int. J. of Information Technology and Management, 2015 Vol.14, No.2/3, pp.185 - 200


Available online: 18 Mar 2015


