Title: Comparative analysis of regression and machine learning methods for predicting fault proneness models

Authors: Yogesh Singh, Arvinder Kaur, Ruchika Malhotra

Addresses: University School of Information Technology, Guru Gobind Singh Indraprastha University, Kashmere Gate, Delhi 110403, India (all three authors)

Abstract: Demand for quality software has grown rapidly over the last few years. This is leading to an increase in the development of machine learning techniques, such as Decision Tree (DT), Support Vector Machine (SVM) and Artificial Neural Network (ANN), for exploring datasets that can be used to construct models for predicting quality attributes. This paper examines and compares Logistic Regression (LR), ANN (model predicted in an analogous study using the same dataset), SVM and DT methods. These methods are explored empirically to find the effect of the object-oriented metrics given by Chidamber and Kemerer on the fault proneness of object-oriented system classes. Data collected from Java applications is used in the study. The performance of the methods was compared by Receiver Operating Characteristic (ROC) analysis. The DT model correctly classified 84.7% of faulty classes and performed better than the models predicted using the LR, SVM and ANN methods. The areas under the ROC curve of the LR, ANN, SVM and DT models are 0.826, 0.85, 0.85 and 0.87, respectively. The paper shows that machine learning methods are useful in constructing software quality models.

Keywords: software quality models; metrics; logistic regression; receiver operating characteristics curve; decision trees; support vector machine; SVM; machine learning; artificial neural networks; ANNs; object-oriented systems; fault-prone software.

DOI: 10.1504/IJCAT.2009.026595

International Journal of Computer Applications in Technology, 2009 Vol.35 No.2/3/4, pp.183 - 193

Published online: 20 Jun 2009
