Support vector machines for credit risk assessment with imbalanced datasets Online publication date: Sun, 24-Jun-2018
by Sihem Khemakhem; Younes Boujelbene
International Journal of Data Mining, Modelling and Management (IJDMMM), Vol. 10, No. 2, 2018
Abstract: Support vector machines (SVM) have a limited performance in credit scoring issues due to the imbalanced data sets in which the number of unpaid is lower than paid loans. In this work, we developed an SVM model with more kernels on a set of imbalanced data and suggested two data resampling alternatives: random over sampling (ROS) and synthetic minority oversampling technique (SMOTE). The aim of this work is to explore the relevance of re-sampling data with the SVM technique for an accurate credit risk prediction rate to the class imbalance constraint. The performance criteria chosen to evaluate the suggested technique were accuracy, sensitivity specificity, error type I, error type II, G-mean and the area under the receiver operating characteristic curve (AUC). Significant empirical results obtained from an experimental study of a real imbalanced database of loans granted by a Tunisian bank demonstrated the performance improvement thanks to sampling strategies in SVM, thus leading to a better prediction accuracy of the creditworthiness of borrowers.
Online publication date: Sun, 24-Jun-2018
If you are not a subscriber and you just want to read the full contents of this article, buy online access here.Complimentary Subscribers, Editors or Members of the Editorial Board of the International Journal of Data Mining, Modelling and Management (IJDMMM):
Login with your Inderscience username and password:
Want to subscribe?
A subscription gives you complete access to all articles in the current issue, as well as to all articles in the previous three years (where applicable). See our Orders page to subscribe.
If you still need assistance, please email email@example.com