Authors: Sotiris Kotsiantis
Addresses: Educational Software Development Laboratory, Department of Mathematics, P.O. Box 1399, University of Patras, Patras 26500, Greece
Abstract: Student dropout occurs quite often in universities providing distance education, and dropout rates there are higher than those in conventional universities. Limiting dropout is essential in university-level distance learning, so the ability to predict student dropout could be useful in many ways. Data sets from this domain generally exhibit skewed class distributions: most cases belong to the normal class (students who continue their studies) and fewer to the dropout class, which is the class of greatest interest. A classifier induced from an imbalanced data set typically has a low error rate for the majority class and an unacceptably high error rate for the minority class. This paper first provides a systematic study of the methodologies that have been proposed to handle this problem. It then presents an experimental study of these methodologies together with a proposed local cost-sensitive technique, and it concludes that such a framework can be a more effective solution to the problem.
Keywords: student dropouts; student modelling; distance education; Hellenic Open University; HOU; Greece; educational data mining; dropout prediction; higher education; university dropouts; distance learning students.
International Journal of Knowledge Engineering and Soft Data Paradigms, 2009 Vol.1 No.2, pp.101 - 111
Available online: 25 Jan 2009