Prediction of customer churn risk with advanced machine learning methods
Online publication date: Tue, 04-Mar-2025
by Oguzhan Akan; Abhishek Verma; Sonika Sharma
International Journal of Data Science (IJDS), Vol. 10, No. 1, 2025
Abstract: Customer churn risk prediction is an important area of research because it directly affects the revenue stream of businesses. The ability to predict customer churn allows businesses to develop better strategies for retaining existing customers. In this research we perform a comprehensive comparison of feature selection methods, upsampling methods, and machine learning methods on the customer churn risk dataset: i) we compare likelihood-based, tree-based, and layer-based machine learning methods on the churn dataset; ii) models built on the churn dataset without upsampling performed better than those built with oversampling, although synthetic minority oversampling technique (SMOTE) and adaptive synthetic sampling (ADASYN) helped stabilise model performance; iii) the models built on the ADASYN dataset were slightly better than their SMOTE counterparts; iv) XGBoost and deep cascading forest (DCF) combined with XGBoost were consistently better than the other methods across all metrics; and v) information value analysis performed better than principal component analysis (PCA). In particular, the IVR DCFX model has the best AUROC score, at 89.1%.
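The abstract does not say how information value is computed. As an illustration only, the sketch below uses the common weight-of-evidence definition, IV = sum over bins of (share of non-churners - share of churners) * ln(share of non-churners / share of churners). The function name `information_value`, the `smoothing` parameter, and the toy data are assumptions made for this example and are not taken from the article.

```python
import math
from collections import Counter


def information_value(feature_bins, churned, smoothing=0.5):
    """Information value of a pre-binned feature against a binary churn label.

    feature_bins: iterable of bin labels, one per customer (hypothetical input).
    churned:      iterable of 0/1 flags, 1 meaning the customer churned.
    smoothing:    pseudo-count added to every bin to avoid log(0) (an assumption).
    """
    events = Counter()       # churners per bin
    non_events = Counter()   # non-churners per bin
    for bin_label, flag in zip(feature_bins, churned):
        if flag:
            events[bin_label] += 1
        else:
            non_events[bin_label] += 1

    bins = set(events) | set(non_events)
    total_events = sum(events.values()) + smoothing * len(bins)
    total_non_events = sum(non_events.values()) + smoothing * len(bins)

    iv = 0.0
    for bin_label in bins:
        pct_event = (events[bin_label] + smoothing) / total_events
        pct_non_event = (non_events[bin_label] + smoothing) / total_non_events
        weight_of_evidence = math.log(pct_non_event / pct_event)
        iv += (pct_non_event - pct_event) * weight_of_evidence
    return iv


# Toy data: a bin label per customer and whether that customer churned.
bins = ["low", "low", "low", "high", "high", "high"]
labels = [1, 1, 0, 0, 0, 1]
print(information_value(bins, labels))
```

Under this definition a feature whose bins have identical churn and non-churn shares scores zero, and larger values indicate stronger separation, which is why such scores are often used to rank and filter features before modelling.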