Title: Depth-based support vector classifiers to detect data nests of rare events
Authors: Rainer Dyckerhoff; Hartmut Jakob Stenz
Addresses: Institute of Econometrics and Statistics, University of Cologne, Germany; Institute of Econometrics and Statistics, University of Cologne, Germany
Abstract: The aim of this project is to combine data depth with support vector machines (SVM) for binary classification. To this end, we introduce data depth functions and SVM and discuss why a combination of the two is expected to work better in some cases than SVM alone. Given two classes X and Y, the task is to decide to which of the two an individual data point should be assigned. In this context, our focus lies on the detection of rare events that are structured in data nests: class X contains many more data points than class Y, and Y has less dispersion than X. This form of classification problem is akin to finding the proverbial needle in a haystack. Data structures like these are important in churn prediction analyses, which will serve as a motivation for possible applications. Beyond the analytical investigations, comprehensive simulation studies will also be carried out.
Keywords: data depth; DD-plot; Mahalanobis depth function; support vector machines; SVM; binary classification; hybrid methods; rare events; data nest; churn prediction; big data.
International Journal of Computational Economics and Econometrics, 2021 Vol.11 No.2, pp.107 - 142
Received: 11 Feb 2019
Accepted: 24 Jul 2019
Published online: 27 Apr 2021
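
Illustration: the combination of data depth and SVM described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It uses the standard Mahalanobis depth, D(z | S) = 1 / (1 + (z - mean)' Cov^(-1) (z - mean)), maps every point to its pair of depths with respect to X and Y (the DD-plot coordinates), and fits an SVM on those pairs. The function names, the simulated data, the RBF kernel and the balanced class weights are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np
from sklearn.svm import SVC


def mahalanobis_depth(points, sample):
    """Mahalanobis depth of each row of `points` with respect to `sample`."""
    mu = sample.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(sample, rowvar=False))
    diff = points - mu
    # Squared Mahalanobis distance of every row to the sample mean.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return 1.0 / (1.0 + d2)


def dd_coordinates(points, x_train, y_train):
    """DD-plot coordinates: (depth w.r.t. class X, depth w.r.t. class Y)."""
    return np.column_stack([
        mahalanobis_depth(points, x_train),
        mahalanobis_depth(points, y_train),
    ])


# Simulated data nest (assumed setup): X is large and widely dispersed,
# Y is small and concentrated inside the region covered by X.
rng = np.random.default_rng(0)
x_train = rng.normal(loc=0.0, scale=3.0, size=(2000, 2))
y_train = rng.normal(loc=1.0, scale=0.3, size=(60, 2))

train = np.vstack([x_train, y_train])
labels = np.concatenate([
    np.zeros(len(x_train), dtype=int),  # 0 = majority class X
    np.ones(len(y_train), dtype=int),   # 1 = rare class Y
])

# Fit the SVM in the two-dimensional depth space instead of the raw space.
dd_train = dd_coordinates(train, x_train, y_train)
clf = SVC(kernel="rbf", class_weight="balanced").fit(dd_train, labels)

# Classify new points by mapping them to depth space first.
new_points = np.array([[1.0, 1.0], [6.0, -5.0]])
pred = clf.predict(dd_coordinates(new_points, x_train, y_train))
print(pred)
```

Whatever the dimension of the original data, the classifier here works on only two depth coordinates. The balanced class weights are one possible way to account for the rarity of class Y.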