Title: Significance of non-parametric statistical tests for comparison of classifiers over multiple datasets

Authors: Pawan Kumar Singh; Ram Sarkar; Mita Nasipuri

Addresses: Department of Computer Science and Engineering, Jadavpur University, 188, Raja S.C. Mullick Road, Kolkata-700032, West Bengal, India (all three authors)

Abstract: In machine learning, developing new algorithms or, more often, making minor amendments to existing ones is a common task. In such cases, a rigorous and correct statistical analysis of the results of the different algorithms is necessary in order to select the appropriate technique(s) for the problem to be solved. The main difficulty in meeting this need is the absence of a proper compilation of statistical techniques. In this paper, we propose the use of two important non-parametric statistical tests: the Wilcoxon signed-rank test for comparing two classifiers, and the Friedman test with its corresponding post-hoc tests for comparing multiple classifiers over multiple datasets. We also introduce a new variant of non-parametric test, known as Scheffe's test, for locating unequal pairs of means of the performances of multiple classifiers when the given datasets are of unequal sizes. The parametric tests previously used for comparing multiple classifiers are also described in brief. The proposed non-parametric tests are applied, as case studies, to the classification results on ten real-problem datasets taken from the UCI Machine Learning Database Repository (http://www.ics.uci.edu/mlearn) (Valdovinos and Sanchez, 2009).
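
As a rough illustration of the two tests named in the abstract, the following Python sketch computes the Wilcoxon signed-rank statistic for two classifiers and the Friedman chi-square statistic for several classifiers over several datasets. It is not the authors' implementation: the function names and the accuracy values are hypothetical, zero differences are simply dropped in the Wilcoxon statistic, and the Friedman formula is the textbook version without a tie correction.

```python
# Illustrative sketch only: the accuracy values below are made up,
# not results from the paper, and all function names are hypothetical.

def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def wilcoxon_statistic(a, b):
    """Signed-rank statistic T = min(R+, R-); zero differences are dropped."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(r_plus, r_minus)

def friedman_statistic(scores):
    """Friedman chi-square for scores[dataset][classifier].

    Within each dataset the best (highest) score gets rank 1.
    Returns the statistic and the mean rank of each classifier.
    """
    n, k = len(scores), len(scores[0])
    mean_ranks = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks([-s for s in row])):
            mean_ranks[j] += r / n
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return chi2, mean_ranks

# Hypothetical accuracies (%) of three classifiers on five datasets.
scores = [
    [85, 80, 78],
    [90, 88, 91],
    [70, 75, 72],
    [88, 84, 80],
    [95, 93, 90],
]
first = [row[0] for row in scores]
second = [row[1] for row in scores]
print("Wilcoxon T (classifier 1 vs 2):", wilcoxon_statistic(first, second))
print("Friedman chi-square, mean ranks:", friedman_statistic(scores))
```

Each statistic would then be compared against the critical value of its reference distribution (for the Friedman statistic, chi-square with k - 1 degrees of freedom) before any post-hoc test is run.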

Keywords: statistical comparison; non-parametric testing; Scheffe test; Wilcoxon signed-rank test; Friedman test; post-hoc test; statistical tests; classifier comparison; multiple datasets; machine learning; multiple classifiers.

DOI: 10.1504/IJCSM.2016.080073

International Journal of Computing Science and Mathematics, 2016 Vol.7 No.5, pp.410 - 442

Received: 20 Nov 2013
Accepted: 20 Aug 2014
Published online: 01 Nov 2016
