Authors: Anuradha Yarlagadda; J.V.R. Murthy; M.H.M. Krishna Prasad
Addresses: Department of CSE, JNTUH, Hyderabad, India; Department of CSE, UCEK, JNTUK, Kakinada, India; Department of CSE, UCEK, JNTUK, Kakinada, India
Abstract: This paper studies a linear regression model for data classification. Least squares estimation and differential evolution are used separately to compute the coefficients of the regression model. Experiments are carried out on fourteen datasets comprising purely numerical, purely categorical and mixed feature values, in which a simple integer coding converts categorical feature values to numerical ones. Irrelevant and redundant features of three high-dimensional datasets are removed using a fast clustering-based feature subset selection technique, and efficient mathematical models are then developed. All results are averaged over thirty simulation runs, and the average percentage of correct classifications is reported. The result analysis shows that the differential evolution based approach performs better in terms of correct classifications. The feature selection algorithm removes many unnecessary features from the datasets and reduces the computational cost of building the mathematical models.
Keywords: least squares estimation; mathematical modelling; linear regression; differential evolution; linear mathematical classifiers; optimisation; data classification; simulation.
International Journal of Convergence Computing, 2015 Vol.1 No.3/4, pp.308 - 322
Received: 18 May 2014
Accepted: 24 Jun 2014
Published online: 22 Apr 2016
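
The approach summarised in the abstract can be illustrated with a minimal sketch. It fits the coefficients of a linear model to integer-coded class labels, once by least squares and once by a basic differential evolution loop, and classifies each sample by the nearest class code. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the DE/rand/1/bin variant, the mean squared error objective, the control parameters (`pop_size`, `generations`, `F`, `CR`), the nearest-code decision rule, and the synthetic two-class data.

```python
import numpy as np


def add_bias(X):
    """Prepend a column of ones so the model has an intercept term."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


def fit_least_squares(X, y):
    """Coefficients minimising the squared error ||A w - y||^2."""
    coef, *_ = np.linalg.lstsq(add_bias(X), y, rcond=None)
    return coef


def fit_differential_evolution(X, y, pop_size=30, generations=200,
                               F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin search over coefficients (assumed objective: MSE)."""
    rng = np.random.default_rng(seed)
    A = add_bias(X)
    dim = A.shape[1]

    def cost(w):
        return np.mean((A @ w - y) ** 2)

    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    costs = np.array([cost(w) for w in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = a + F * (b - c)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            trial_cost = cost(trial)
            # Greedy selection: keep the trial if it is no worse.
            if trial_cost <= costs[i]:
                pop[i], costs[i] = trial, trial_cost
    return pop[np.argmin(costs)]


def predict(coef, X, classes):
    """Assign each sample the class code closest to its linear score."""
    scores = add_bias(X) @ coef
    classes = np.asarray(classes)
    nearest = np.argmin(np.abs(scores[:, None] - classes[None, :]), axis=1)
    return classes[nearest]


# Synthetic two-class data (hypothetical, not one of the paper's datasets).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(4.0, 0.5, (50, 2))])
y = np.repeat([0.0, 1.0], 50)

for name, fit in [("least squares", fit_least_squares),
                  ("differential evolution", fit_differential_evolution)]:
    coef = fit(X, y)
    accuracy = np.mean(predict(coef, X, [0, 1]) == y)
    print(f"{name}: {100 * accuracy:.1f}% correct")
```

Categorical features would first need to be mapped to integers, as the abstract describes, before being passed to either fitting function. On a smooth squared-error objective such as this one, both methods should reach similar coefficients; the differences reported in the paper concern its own datasets and experimental setup.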