Title: Estimation of success of entrepreneurship projects with data mining

Authors: Selim Corekcioglu; Bekir Polat

Addresses: Doctoral School of Management and Business Administration, Szent István University, Páter Károly u. 1, 2100, Godollo, Hungary; Graduate School of Natural and Applied Sciences, Gaziantep University, Üniversite Bulvarı, 27310, Şehitkamil Gaziantep, Turkey

Abstract: This study aimed to prevent wasted resources by estimating the success or failure of proposed entrepreneurship projects with data mining algorithms, so that estimates become more accurate and decisions about projects rest on a scientific approach. The data were examined to determine whether entrepreneurial projects were successful or not. The dataset was classified using 10-fold cross-validation with the C4.5, Naive Bayes, logistic regression, random forest and support vector algorithms. A comparison of the classification results showed that C4.5 was the most successful algorithm, with 70.75% prediction accuracy. In the C4.5 tree, the features that affected the tree were, in order, capital, partner, location and age. The features that did not affect the tree were gender, education, market, sector and personnel.
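The evaluation procedure named in the abstract, 10-fold cross-validation of a classifier, can be illustrated with a minimal sketch. This is not the authors' code or data. It assumes a small synthetic dataset with three categorical features whose names are borrowed from the abstract (capital, partner, location) and whose values are hypothetical, and it covers only one of the five algorithms: a simple categorical Naive Bayes written from scratch. All function names are illustrative.

```python
import math
import random
from collections import Counter, defaultdict


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_nb(rows, labels):
    """Fit a categorical Naive Bayes model from counts."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    domains = defaultdict(set)           # feature index -> values seen in training
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            domains[j].add(value)
    return class_counts, value_counts, domains


def predict_nb(model, row):
    """Return the class with the highest log-posterior score."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for cls, n_cls in class_counts.items():
        score = math.log(n_cls / total)
        for j, value in enumerate(row):
            # Add-one smoothing, with one extra slot reserved for unseen values.
            count = value_counts[(j, cls)][value]
            score += math.log((count + 1) / (n_cls + len(domains[j]) + 1))
        if score > best_score:
            best, best_score = cls, score
    return best


def cross_val_accuracy(rows, labels, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    fold_accuracies = []
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train_rows = [r for i, r in enumerate(rows) if i not in held_out]
        train_labels = [l for i, l in enumerate(labels) if i not in held_out]
        model = train_nb(train_rows, train_labels)
        correct = sum(predict_nb(model, rows[i]) == labels[i] for i in test_idx)
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / len(fold_accuracies)


# Synthetic placeholder data (an assumption, not the study's dataset):
# success is made to depend mostly on capital, with some noise.
rng = random.Random(42)
rows, labels = [], []
for _ in range(200):
    capital = rng.choice(["low", "high"])
    partner = rng.choice(["yes", "no"])
    location = rng.choice(["urban", "rural"])
    p_success = 0.8 if capital == "high" else 0.2
    labels.append("success" if rng.random() < p_success else "failure")
    rows.append((capital, partner, location))

accuracy = cross_val_accuracy(rows, labels, k=10)
print(f"10-fold CV accuracy: {accuracy:.3f}")
```

Comparing several algorithms, as the study did, would amount to running the same fold loop once per classifier and comparing the mean accuracies.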

Keywords: entrepreneurship; SME; small and medium-sized enterprise; data mining; classification; Naive Bayes; logistic regression; random forest; support vector algorithms.

DOI: 10.1504/IJDS.2021.118941

International Journal of Data Science, 2021 Vol.6 No.2, pp.85 - 108

Received: 24 Apr 2020
Accepted: 13 Dec 2020

Published online: 12 Nov 2021
