Authors: Kehan Gao; Taghi M. Khoshgoftaar; Huanjing Wang
Addresses: Department of Mathematics and Computer Science, Eastern Connecticut State University, Willimantic, CT 06226, USA; Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA; Department of Mathematics and Computer Science, Western Kentucky University, Bowling Green, KY 42101, USA
Abstract: The quality of software products can be estimated and improved by building software quality classification models. The predictive accuracy of these models is usually affected by two factors: the learning model(s) used in classification and the quality of the data. This study examined both factors but concentrated on the quality of the data, selecting a subset of relevant features for building classification models. We investigated four filter-based feature selection techniques in a case study on a very large telecommunications software system. The empirical results demonstrated that, by applying attribute selection, we can build classification models with prediction accuracy comparable to or even better than that of models built with the complete set of attributes, even though the selected subset contained fewer than 15% of the complete set of attributes.
Keywords: feature selection; classifiers; performance metrics; search algorithms; software quality; quality classification; modelling; classification models; attribute selection.
International Journal of Information and Decision Sciences, 2012 Vol.4 No.2/3, pp.217-250
Available online: 26 May 2012