Authors: R. Sivaraj; R. Devi Priya
Addresses: Department of Computer Science and Engineering, Velalar College of Engineering and Technology, Erode, India; Department of Information Technology, Kongu Engineering College, Erode, India
Abstract: Most real-world databases are large and need preprocessing before any analysis can be performed on them. Data analytics is valid only when the databases are complete. However, attribute values may be missing for several reasons, and most existing methods were designed to handle missing values in homogeneous attributes, which are either all discrete or all continuous. Only a few studies have dealt with missing values in heterogeneous attributes. Ant colony optimisation (ACO) is a swarm intelligence metaheuristic, based on the behaviour of ants, that is commonly used to solve combinatorial problems. To estimate missing heterogeneous (discrete and continuous) attribute values, the paper introduces a new algorithm called Bayesian based Parallel Max-Min Ant System (BPMMAS), which hybridises Bayesian principles with the Max-Min Ant System (MMAS), an enhanced version of ACO. The algorithm is executed in parallel on subsets of the original dataset, and the imputation results are combined. MMAS is chosen because it works better for solving combinatorial problems. Bayesian principles reflect the influence of covariates well when estimating missing values. BPMMAS is implemented on real datasets, and its estimation accuracy is observed to be better than that of other missing data handling methods, with less computational time.
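The abstract names the Max-Min Ant System without describing its mechanics. As a rough illustration only, the Python sketch below shows the generic MMAS idea: pheromone on every candidate evaporates, only the best candidate of an iteration is reinforced, and all trails are clamped to the interval [tau_min, tau_max]. It is not the BPMMAS algorithm from the paper. The function names, the parameter values, the candidate labels and the `quality` scores (a stand-in for whatever Bayesian score the paper uses) are all assumptions made for this example.

```python
import random


def mmas_select(tau, rng):
    """Pick a candidate index with probability proportional to its pheromone."""
    threshold = rng.random() * sum(tau)
    running = 0.0
    for index, trail in enumerate(tau):
        running += trail
        if threshold <= running:
            return index
    return len(tau) - 1  # guard against floating-point round-off


def mmas_update(tau, best_index, best_quality, rho, tau_min, tau_max):
    """Evaporate every trail, reinforce only the best one, clamp to the bounds."""
    updated = []
    for index, trail in enumerate(tau):
        trail = (1.0 - rho) * trail          # evaporation
        if index == best_index:
            trail += best_quality            # deposit by the best ant only
        updated.append(min(tau_max, max(tau_min, trail)))
    return updated


def run_mmas(quality, n_ants=5, n_iterations=20, rho=0.1,
             tau_min=0.05, tau_max=2.0, seed=0):
    """Toy loop: choose one fill-in value for a single missing discrete cell."""
    rng = random.Random(seed)
    tau = [tau_max] * len(quality)           # MMAS starts trails at the upper bound
    for _ in range(n_iterations):
        choices = [mmas_select(tau, rng) for _ in range(n_ants)]
        best = max(choices, key=lambda i: quality[i])  # iteration-best ant
        tau = mmas_update(tau, best, quality[best], rho, tau_min, tau_max)
    return tau


# Hypothetical candidates and scores, invented for illustration.
candidates = ["low", "medium", "high"]
quality = [0.2, 0.9, 0.4]
final_tau = run_mmas(quality)
imputed = candidates[max(range(len(final_tau)), key=lambda i: final_tau[i])]
```

The clamping step is what separates MMAS from basic ACO in this sketch: no candidate's selection probability can fall to zero, which limits premature convergence. Under the parallel scheme the abstract describes, a loop like this would presumably run independently on each data subset before the results are combined, but that combination step is not specified in the abstract and is not modelled here.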
Keywords: missing values; max-min ant system; MMAS; Bayesian principles; large databases; ant colony optimisation; ACO; metaheuristics; swarm intelligence; missing data; combinatorial optimisation.
International Journal of Bio-Inspired Computation, 2017 Vol.9 No.2, pp.114 - 120
Available online: 21 Mar 2017