Title: Implementation of optimum binning, ensemble learning and re-sampling techniques to predict student's performance

Authors: Raisul Islam Rashu; Syed Tanveer Jishan; Naheena Haq; Rashedur M. Rahman

Addresses: Department of Electrical and Computer Engineering, North South University, Plot-15, Block-B, Bashundhara, Dhaka 1229, Bangladesh (all authors)

Abstract: Educational data mining is an emerging area of research that can extract useful information for students as well as instructors. In this research, we explore data mining techniques that predict students' final grades. We validate our method by conducting experiments on course grade data from North South University, the first private university and one of the leading universities in higher education in Bangladesh. We also extend our approach by discretising the continuous attributes with equal width binning and incorporating the result into traditional mining algorithms. However, owing to the imbalanced nature of the data, we obtained lower accuracy for the imbalanced classes. To address this, we implement two re-sampling techniques: random over sampling (ROS) and random under sampling (RUS). Experimental results show that re-sampling techniques can significantly reduce the problem of dataset imbalance in classification and improve the performance of the classification models. Moreover, three ensemble techniques, namely bagging, boosting (AdaBoost) and random forests, are applied to predict students' academic performance.
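
Illustration: The abstract names equal width binning, ROS and RUS without showing how they operate. The following is a minimal Python sketch of the general techniques, not the authors' implementation. The function names, the toy marks and grades, and the choice of three bins are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values identical: everything falls in bin 0
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def _group_by_label(rows, labels):
    """Collect rows per class label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def random_over_sample(rows, labels, seed=0):
    """ROS: duplicate randomly chosen rows until every class matches the largest."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(g) for g in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_under_sample(rows, labels, seed=0):
    """RUS: randomly drop rows until every class matches the smallest."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(g) for g in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: one continuous mark per student and a letter grade.
marks = [35.0, 48.5, 52.0, 61.0, 67.5, 72.0, 78.0, 95.0]
grades = ["F", "F", "C", "C", "C", "C", "C", "A"]

binned = equal_width_bins(marks, n_bins=3)
rows = [[b] for b in binned]
ros_rows, ros_labels = random_over_sample(rows, grades)
rus_rows, rus_labels = random_under_sample(rows, grades)
print(binned)
print(Counter(ros_labels), Counter(rus_labels))
```

In this sketch, ROS raises every class to the size of the largest class and RUS cuts every class to the size of the smallest. Re-sampling of this kind is normally applied to the training split only, so that evaluation data keeps its original class distribution.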

Keywords: educational data mining; EDM; classification; naive Bayes; decision tree; neural networks; discretisation; equal width binning; ensemble learning; resampling; performance prediction; student performance; higher education; Bangladesh; bagging; boosting; random forests; academic performance.

DOI: 10.1504/IJKESDP.2015.073454

International Journal of Knowledge Engineering and Soft Data Paradigms, 2015 Vol.5 No.1, pp.1 - 30

Published online: 09 Dec 2015
