Authors: Doosung Hwang; Youngju Son
Addresses: Department of Software Science, Dankook University, South Korea; Department of Computer Science, Illinois Central College, USA
Abstract: A prototype-based classification method is proposed that selects a small subset of class data for rule learning and prediction. A class point is considered a prototype if it forms a hypersphere that represents part of the class area, as measured by any distance metric and the class label. The prototype selection algorithm, formulated as a set covering optimisation, selects as few within-class points as possible while preserving the class covering regions for the unknown data distribution. The upper bound of the error is analysed to compare the effectiveness of the prototype-based classification with the Bayes classifier. Under a bootstrapping strategy and the 0/1 loss, the bias and variance components are derived from the generalisation error without assuming the unknown distribution of a given problem. This analysis provides a way to evaluate prototype-based models and select the optimal model estimate for any standard classifier. The experiments show that the proposed approach is very competitive with the nearest neighbour and Bayes classifiers, and is efficient in choosing prototypes in terms of class covering regions, data size and computation time.
Keywords: class prototype; set covering optimisation; greedy method; nearest neighbour; error analysis.
International Journal of Data Mining, Modelling and Management, 2018 Vol.10 No.4, pp.293 - 313
Received: 21 Feb 2017
Accepted: 23 Dec 2017
Published online: 04 Sep 2018
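
The prototype selection summarised in the abstract (hyperspheres around class points, chosen by a greedy set covering step) can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so the details here are assumptions: each sphere's radius is taken as the distance from its centre to the nearest point of another class, the metric is Euclidean, and prediction uses the nearest selected prototype. The names `select_prototypes` and `predict` are hypothetical and not taken from the paper.

```python
import math


def select_prototypes(points, labels):
    """Greedy set-cover selection of class prototypes (illustrative sketch only)."""
    n = len(points)
    # Assumed radius rule: distance from point i to the nearest point of another class.
    radius = [
        min(
            (math.dist(points[i], points[j]) for j in range(n) if labels[j] != labels[i]),
            default=math.inf,  # only one class present: the sphere is unbounded
        )
        for i in range(n)
    ]
    # covers[i]: point i itself plus same-class points strictly inside its sphere.
    covers = [
        {
            j
            for j in range(n)
            if j == i
            or (labels[j] == labels[i] and math.dist(points[i], points[j]) < radius[i])
        }
        for i in range(n)
    ]
    uncovered = set(range(n))
    prototypes = []
    while uncovered:
        # Greedy step: pick the sphere covering the most still-uncovered points.
        best = max(range(n), key=lambda i: len(covers[i] & uncovered))
        prototypes.append(best)
        uncovered -= covers[best]
    return prototypes, radius


def predict(x, points, labels, prototypes):
    """Label x with the class of its nearest selected prototype (assumed rule)."""
    nearest = min(prototypes, key=lambda i: math.dist(x, points[i]))
    return labels[nearest]
```

Calling `select_prototypes(points, labels)` with a list of coordinate tuples and a parallel list of class labels returns the indices of the selected prototypes and the sphere radii; `predict` then classifies a new point using only those prototypes. The greedy step is a standard approximation for set covering and does not guarantee the minimum number of prototypes.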