Title: Classification method for online teaching resources by integrating conceptual similarity and random forest
Authors: Jie Zhang; Kexuan Zong
Addresses: Department of Public Courses, Cangzhou Medical College, Cangzhou, Hebei Province, China; Department of Public Courses, Cangzhou Medical College, Cangzhou, Hebei Province, China
Abstract: To classify online educational resources efficiently and accurately, a classification method for online teaching resources that integrates conceptual similarity and random forest is proposed. First, the K-means algorithm is used to preprocess the teaching resources and eliminate duplicates. Second, the keywords in the deduplicated resources are converted into vectors in a high-dimensional space, and the similarity between concepts is evaluated from these vectors. Finally, based on the similarity results, the BERT word embedding model is used to convert the teaching resource texts into numerical feature vectors, the parameters of the random forest are set and then tuned through cross-validation and other methods, and the optimised model is used to obtain the classification results. The experimental results show that the log loss of the proposed method is below 0.05 and that its highest MCC value is 0.80, indicating that the method classifies resources well.
Keywords: conceptual similarity; random forest; online teaching resources; resource classification; K-means algorithm; BERT model.
DOI: 10.1504/IJCAT.2024.146128
International Journal of Computer Applications in Technology, 2024 Vol. 75, No. 2/3/4, pp. 89-95
Received: 30 Aug 2024
Accepted: 02 Jan 2025
Published online: 07 May 2025
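Illustrative sketch: The abstract compares concepts through their keyword vectors and reports log loss and MCC (Matthews correlation coefficient) as evaluation metrics. The following minimal Python sketch shows one common way to compute these three quantities. It is not the authors' implementation: cosine similarity is assumed as the vector similarity measure (the abstract does not name one), the metrics are shown for the binary case only, and all vectors, labels and probabilities are made-up toy values.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors.

    Assumption: cosine similarity stands in for the concept similarity
    measure, which the abstract does not specify.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels (0 or 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0.0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical toy data, not taken from the paper.
concept_a = [0.2, 0.7, 0.1]
concept_b = [0.25, 0.6, 0.05]
labels = [1, 0, 1, 1, 0, 0]
probabilities = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]

similarity = cosine_similarity(concept_a, concept_b)
loss = binary_log_loss(labels, probabilities)
mcc = matthews_corrcoef(labels, predictions)
print(similarity, loss, mcc)
```

Lower log loss means better-calibrated probabilities, while MCC ranges from -1 to 1, with 1 meaning perfect agreement between predicted and true labels. A multi-class setting, which resource classification typically is, would need the multi-class generalisations of both metrics.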