Authors: R. Devi Priya; S. Kuppuswami
Addresses: Kongu Engineering College, Erode 638 052, Tamil Nadu, India; Kongu Engineering College, Erode 638 052, Tamil Nadu, India
Abstract: Missing data degrade the statistical power of any analysis made in clinical studies. To infer valid results from such studies, a suitable method is required to replace the missing values. No existing method is universally applicable for handling missing values, and the main objective of this paper is to introduce a common method applicable to all cases of missing data. In this paper, a Bayesian Genetic Algorithm (BGA) is proposed to impute both continuous and discrete missing values effectively, combining a heuristic search algorithm (the genetic algorithm) with Bayes' rule. BGA is applied to impute missing values in a real cancer dataset under Missing At Random (MAR) and Missing Completely At Random (MCAR) conditions. For both discrete and continuous attributes, the results show better classification accuracy and RMSE% than many existing methods.
Keywords: missing values; BGA; Bayesian genetic algorithms; MAR; missing at random; MCAR; missing completely at random; continuous attributes; discrete attributes; clinical studies; bioinformatics.
International Journal of Bioinformatics Research and Applications, 2014 Vol.10 No.6, pp.613 - 627
Available online: 20 Oct 2014