Title: Classification system using parallel genetic algorithm

Authors: Bikash Kanti Sarkar; Swapan Kumar Chakraborty

Addresses: Department of Information Technology, Birla Institute of Technology (Deemed University), Mesra, Ranchi-835215, Jharkhand, India; Department of Applied Mathematics, Birla Institute of Technology (Deemed University), Mesra, Ranchi-835215, Jharkhand, India

Abstract: Classification aims to predict the value of the class attribute of new input data on the basis of a set of pre-classified samples. Traditional machine learning algorithms for classification are usually domain specific or produce unsatisfactory results when applied to large or imbalanced data sets. To extract genuinely useful knowledge for decision making, we introduce a new intelligent knowledge discovery model that combines C4.5 (a decision tree-based rule induction algorithm) with a new parallel genetic algorithm (GA) based on the idea of massive parallelism (MP). The model is named CGAMP (C4.5 and GA based on MP). More specifically, the model uses C4.5 as a base method to produce rules, which are then refined by the proposed parallel GA to yield more accurate rules. The performance of the developed system is compared with pure C4.5 and with a hybrid system (combining C4.5 and a sequential genetic algorithm) on six real-world benchmark data sets collected from the University of California at Irvine machine learning repository. The experimental results validate the effectiveness of the new model.
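
The two-stage idea in the abstract (a base learner produces rules, and an evolutionary search with parallel fitness evaluation then refines them) can be illustrated with a minimal sketch. This is not the authors' CGAMP implementation: the rule format `(feature, threshold)`, the helper names `fitness`, `mutate` and `refine_rule`, the mutation-only elitist loop (no crossover), and the use of a thread pool in place of massive parallelism are all assumptions made for illustration. The seed rule is hand-written here and merely stands in for a rule that C4.5 would induce.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(rule, samples, labels):
    """Accuracy of a hypothetical rule: predict 1 if x[feature] > threshold, else 0."""
    feature, threshold = rule
    hits = sum((x[feature] > threshold) == bool(y) for x, y in zip(samples, labels))
    return hits / len(samples)


def mutate(rule, rng, step=0.5):
    """Perturb the threshold of a rule; the tested feature is left unchanged."""
    feature, threshold = rule
    return (feature, threshold + rng.uniform(-step, step))


def refine_rule(seed_rule, samples, labels, generations=20, pop_size=8, seed=0):
    """Refine one seed rule with an elitist, mutation-only evolutionary loop.

    Fitness values for the whole population are computed through a thread
    pool, an illustrative stand-in for parallel evaluation.
    """
    rng = random.Random(seed)
    score = lambda rule: fitness(rule, samples, labels)
    population = [seed_rule] + [mutate(seed_rule, rng) for _ in range(pop_size - 1)]
    with ThreadPoolExecutor() as pool:
        for _ in range(generations):
            scores = list(pool.map(score, population))
            ranked = sorted(zip(scores, population), key=lambda p: p[0], reverse=True)
            # Elitism: the best half survives unchanged, so the best score never drops.
            survivors = [rule for _, rule in ranked[: pop_size // 2]]
            children = [mutate(rng.choice(survivors), rng)
                        for _ in range(pop_size - len(survivors))]
            population = survivors + children
        final_scores = list(pool.map(score, population))
    best_score, best_rule = max(zip(final_scores, population), key=lambda p: p[0])
    return best_rule, best_score


# Toy data: one numeric feature, class 1 for the three largest values.
samples = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,)]
labels = [0, 0, 0, 1, 1, 1]
seed_rule = (0, 1.5)  # deliberately poor threshold, standing in for an induced rule
best_rule, best_score = refine_rule(seed_rule, samples, labels)
```

Because the seed rule is part of the initial population and the best half always survives, the refined rule is never less accurate on the training samples than the seed rule. A real system would also need a held-out evaluation, since accuracy on the training samples alone rewards overfitting.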

Keywords: classification accuracy; C4.5; parallel genetic algorithms; PGAs; intelligent knowledge discovery; decision trees; rule induction.

DOI: 10.1504/IJICA.2011.044569

International Journal of Innovative Computing and Applications, 2011 Vol.3 No.4, pp.223 - 241

Received: 26 Feb 2011
Accepted: 19 Oct 2011

Published online: 21 Mar 2015
