Handling imbalanced data sets with a modification of Decorate algorithm
Online publication date: Wed, 10-Dec-2008
by Sotiris B. Kotsiantis
International Journal of Computer Applications in Technology (IJCAT), Vol. 33, No. 2/3, 2008
Abstract: Many real-world data sets exhibit skewed class distributions, in which almost all instances belong to one class and far fewer belong to a smaller, but usually more interesting, class. A classifier induced from an imbalanced data set characteristically has a low error rate for the majority class and an undesirably high error rate for the minority class. This paper first provides a systematic study of the various methodologies that have tried to handle this problem. It then presents an experimental study of these methodologies with a modification of the Decorate algorithm, and concludes that such a framework can be a more valuable solution to the problem. Our method seems to permit improved identification of difficult small classes in predictive analysis, while keeping the classification ability for the majority class at an acceptable level.
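The gap between majority-class and minority-class error described in the abstract can be illustrated with a minimal sketch. It is not the paper's method or data: the 95/5 class split, the labels "neg" and "pos", and the helper `per_class_error` are all assumptions made for illustration, and the "classifier" is simply a baseline that always predicts the majority class.

```python
from collections import Counter


def per_class_error(y_true, y_pred):
    """Return {label: fraction of that label's instances that were misclassified}."""
    totals = Counter(y_true)
    wrong = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {label: wrong[label] / totals[label] for label in totals}


# Hypothetical skewed data set: 95 majority ("neg") and 5 minority ("pos") instances.
y_true = ["neg"] * 95 + ["pos"] * 5

# Baseline that always predicts the most frequent class (an assumed stand-in
# for a classifier biased towards the majority class).
majority_label = Counter(y_true).most_common(1)[0][0]
y_pred = [majority_label] * len(y_true)

overall_error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
errors = per_class_error(y_true, y_pred)

print(f"overall error: {overall_error:.2f}")
for label, err in sorted(errors.items()):
    print(f"error on {label}: {err:.2f}")
```

In this toy setting the overall error looks low even though every minority instance is misclassified, which is why per-class error rates, rather than overall accuracy alone, are the relevant measure for imbalanced data.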