Title: Data analytics for gross domestic product using random forest and extreme gradient boosting approaches: an empirical study

Authors: Elsayed A.H. Elamir

Addresses: Management and Marketing Department, College of Business Administration, University of Bahrain, Kingdom of Bahrain

Abstract: This study uses the random forest and extreme gradient boosting approaches to forecast and analyse gross domestic product per capita, using country-level data from the World Bank development indicators over the period 2010 to 2017. Comprehensive comparisons are carried out using the years before 2017 as training data and the year 2017 as testing data. The root mean squared error and the coefficient of determination are used to judge among the different models. The random forest and extreme gradient boosting achieve coefficients of determination of 97.8% and 98.1%, respectively. The results suggest that investment in education, labour, health, and industry, together with reductions in inflation, interest, and unemployment, is necessary to enhance gross domestic product per capita. Further results are given by a two-way interaction measure, which is useful in explaining co-dependencies in the model behaviour. The strongest interactions are trade-technology and technology-education, followed by consumption-health.
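The evaluation protocol described in the abstract (train on the years before 2017, test on 2017, score with the root mean squared error and the coefficient of determination) can be illustrated with a minimal Python sketch. The function names, the `year` field, and the row layout are assumptions made for illustration and are not taken from the article; the coefficient of determination is computed here as one minus the ratio of residual to total sum of squares, which may differ from the exact definition used in the study.

```python
import math


def split_by_year(rows, test_year=2017):
    """Split records so that years before test_year train and test_year tests.

    `rows` is assumed to be a list of dicts with a "year" key (hypothetical layout).
    """
    train = [r for r in rows if r["year"] < test_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

In such a setup, any regression model (for example a random forest or a gradient boosting model) would be fitted on the training rows only, and both metrics would then be computed on its predictions for the held-out 2017 rows.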

Keywords: bagging; boosting; business analytics; forecast; gross domestic product; GDP; machine learning.

DOI: 10.1504/IJDMMM.2022.125258

International Journal of Data Mining, Modelling and Management, 2022 Vol.14 No.3, pp.269 - 286

Received: 22 Apr 2020
Accepted: 18 Jan 2021

Published online: 05 Sep 2022
