Title: A machine learning algorithm for classification under extremely scarce information

Authors: Lev V. Utkin; Yulia A. Zhuk

Addresses: Department of Industrial Control and Automation, St. Petersburg State Forest Technical Academy, Institutsky per. 5, 194021 St. Petersburg, Russia; Department of Computer Science, St. Petersburg State Forest Technical Academy, Institutsky per. 5, 194021 St. Petersburg, Russia

Abstract: When learning data are difficult to obtain at training time, objects have to be classified using extremely little information about their features. The paper assumes that only an average (mean) value of each feature and the lower and upper bounds of its set of values are known. The main idea for constructing new classification models that take this information into account is to form a set of probability distributions bounded by lower and upper probability distribution functions (a p-box). A discriminant function is derived by maximising the risk measure over the set of distributions and minimising it over a set of classification parameters. The classification algorithm reduces to a parametric linear programming problem.
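
As an illustration of the p-box idea in the abstract, the following minimal Python sketch computes the tightest lower and upper distribution functions for a single feature when only its bounds and mean are known. It assumes the standard Markov-type bounds for a bounded random variable with a given mean; the abstract does not say which construction the paper uses, so this is only one plausible reading. The function name pbox_from_mean_and_bounds and the parameters a, b, m are hypothetical and do not come from the paper.

```python
def pbox_from_mean_and_bounds(a, b, m):
    """Return (lower_cdf, upper_cdf) bounding every distribution on [a, b] with mean m.

    Assumption: Markov-type bounds, not necessarily the paper's own construction.
    """
    if not a < m < b:
        raise ValueError("expected a < m < b")

    def upper_cdf(x):
        # Largest possible P(X <= x): mass at x and the remainder at b.
        if x < a:
            return 0.0
        if x >= m:
            return 1.0  # all mass may sit exactly at the mean
        return (b - m) / (b - x)

    def lower_cdf(x):
        # Smallest possible P(X <= x): mass at a and the remainder just above x.
        if x >= b:
            return 1.0
        if x <= m:
            return 0.0  # all mass may sit at or just above the mean
        return (x - m) / (x - a)

    return lower_cdf, upper_cdf


# Example: a feature known to lie in [0, 10] with mean 4.
lower, upper = pbox_from_mean_and_bounds(0.0, 10.0, 4.0)
for x in (0.0, 2.0, 4.0, 7.0, 10.0):
    print(x, lower(x), upper(x))
```

Any distribution on [a, b] with mean m has a distribution function lying between these two curves. A minimax classifier of the kind the abstract describes would then look for the worst-case distribution inside such a band.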

Keywords: imprecise probabilities; lower probability distributions; upper probability distributions; learning; risk; Bayesian inference; regression; machine learning algorithm; object classification; scarce information; discriminant functions.

DOI: 10.1504/IJDATS.2012.046788

International Journal of Data Analysis Techniques and Strategies, 2012 Vol.4 No.2, pp.115 - 133

Published online: 06 Sep 2014
