Title: Assessing classification complexity of datasets using fractals

Authors: André Luiz Marasca; Dalcimar Casanova; Marcelo Teixeira

Addresses: Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology – Paraná, Pato Branco, Paraná, Brazil; Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology – Paraná, Pato Branco, Paraná, Brazil; Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology – Paraná, Pato Branco, Paraná, Brazil

Abstract: Supervised classification is a machine learning mechanism that associates classes with objects from datasets. Depending on the dimensionality and internal structure of the data, classification may become complex. In this paper, we claim that the complexity level of a given dataset can be estimated using fractal analysis. A novel fractal measure, called transition border, is proposed to estimate the chaos behind the distribution of labelled points. Its correlation with the success rate is tested by comparing it against results obtained from other supervised classification methods. Results suggest that this approach can be used to measure the complexity of a classification task in real-valued datasets with three dimensions. The proposed method can also be useful in other scientific domains where fractal analysis is applicable.

Keywords: supervised classification; fractal analysis; chaotic datasets; transition border; fractal dimension; complexity.

DOI: 10.1504/IJCSE.2019.103261

International Journal of Computational Science and Engineering, 2019 Vol.20 No.1, pp.102 - 119

Received: 27 Feb 2018
Accepted: 21 May 2018

Published online: 23 Oct 2019
