Title: A data classification method for innovation and entrepreneurship in applied universities based on nearest neighbour criterion

Authors: Xiuhong Qin; Na Li

Addresses: Institute of Innovation and Entrepreneurship, Hebei Women's Vocational College, Shijiazhuang, 050091, China; Department of Modern Service, Hebei Women's Vocational College, Shijiazhuang, 050091, China

Abstract: To improve the accuracy, recall, and F1 value of data classification, this paper proposes a method for classifying innovation and entrepreneurship data from applied universities based on the nearest neighbour criterion. First, a decision tree algorithm is used to mine innovation and entrepreneurship data from applied universities. Then, a dynamic weight is introduced to improve the edit-distance-based similarity calculation, and the improved calculation is used to de-duplicate the data and so avoid overfitting. Finally, the nearest neighbour criterion is used to classify the data: cosine similarity is computed between each sample to be classified and every sample in the training data, and the class is assigned accordingly. The experimental results show that the proposed method achieves a maximum accuracy of 96.5% and an average F1 score of 0.91. These results indicate that the proposed method classifies the data with high accuracy, recall, and F1 value.
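
The final classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature encoding, the number of neighbours, or the voting rule, so the three-dimensional feature vectors, the labels "technology" and "service", the parameter k, and the majority vote below are all assumptions made for illustration. The decision tree mining and the dynamically weighted edit-distance de-duplication are not reproduced here.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def classify(sample, training_data, k=3):
    """Label a sample by majority vote among its k most similar training samples.

    training_data is a list of (feature_vector, label) pairs.
    k and the majority vote are assumptions; the abstract does not specify them.
    """
    ranked = sorted(
        training_data,
        key=lambda item: cosine_similarity(sample, item[0]),
        reverse=True,  # most similar first
    )
    top_labels = [label for _, label in ranked[:k]]
    # On a tied vote, Counter keeps first-seen order, so the label of the
    # more similar neighbour wins.
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical, already de-duplicated training records (illustrative values only).
training = [
    ([1.0, 0.0, 0.2], "technology"),
    ([0.9, 0.1, 0.1], "technology"),
    ([0.0, 1.0, 0.3], "service"),
    ([0.1, 0.8, 0.4], "service"),
]

print(classify([0.8, 0.2, 0.1], training, k=3))
```

Cosine similarity compares the direction of two vectors rather than their magnitude, so records that differ only in overall scale are treated as near neighbours.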

Keywords: nearest neighbour criterion; innovation and entrepreneurship; data classification; decision tree algorithm; dynamic weight; cosine similarity.

DOI: 10.1504/IJBIDM.2024.140894

International Journal of Business Intelligence and Data Mining, 2024 Vol.25 No.3/4, pp.382 - 393

Received: 01 Aug 2023
Accepted: 18 Jan 2024

Published online: 03 Sep 2024
