Open Access Article

Title: Influence of sampling methods on bankruptcy prediction: normal vs. abnormal economic conditions

Authors: Asif M. Huq; Wonder Mahembe

Addresses: Faculty of Education and Business Studies, Department of Business and Economic Studies, University of Gävle, 801-76, Gävle, Sweden; School of Information and Engineering, Department of Computing, Dalarna University, Falun, Sweden | School of Information and Engineering, Department of Computing, Dalarna University, Falun, Sweden

Abstract: Bankruptcy prediction research has largely emphasised model performance through feature selection and algorithm optimisation, while the equally important challenge of class imbalance remains underexplored. Most studies also focus on publicly listed firms, reflecting the accessibility of standardised data. Our study addresses both gaps, drawing on a large-scale dataset of private firms, an economically significant yet understudied segment. Using 2,039,222 firm-year observations from 430,800 private firms between 2012 and 2021, we evaluate four machine learning models and five sampling techniques across two distinct economic periods. Results show that sampling choice strongly influences accuracy and feature relevance, and that this influence depends on macroeconomic conditions. Importantly, simple interpretable models built on theoretically grounded features (e.g., Altman, 1968) achieve robust predictions, challenging the prevailing reliance on complex methods, although Extreme Gradient Boosting (XGBoost) consistently outperforms the alternatives. By focusing on private firms, the study underscores the methodological choices that are crucial for reliable bankruptcy prediction.

Keywords: bankruptcy prediction; data imbalance; machine learning; sampling methods.

DOI: 10.1504/IJBAAF.2025.149819

International Journal of Banking, Accounting and Finance, 2025 Vol.15 No.5, pp.1 - 32

Received: 20 Jul 2024
Accepted: 25 Sep 2025

Published online: 13 Nov 2025