Title: Data pair selection for accurate classification based on information-theoretic metric learning
Authors: Takashi Maga; Kenta Mikawa; Masayuki Goto
Addresses: Graduate School of Creative Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan; Department of Information Science, Shonan Institute of Technology, 1-1-25 Tsujidounishikaigan, Fujisawa City, Kanagawa 251-8511, Japan; School of Creative Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
Abstract: Data classification is one of the main techniques in data analysis and has become increasingly important in various fields of business. Automatic classification is the problem of learning, from training data, a rule that assigns category labels. One effective approach is the k-nearest neighbour (kNN) method, which relies on distances between data pairs, combined with distance metric learning. In this study, we focus on the information-theoretic metric learning (ITML) method. In ITML, the optimisation problem is formulated as learning a metric matrix such that the distance between each pair of data belonging to the same class becomes smaller than a constant, while the distance between each pair of data belonging to different classes becomes larger than another constant. We propose an improved procedure that selects the data pairs that are most effective in clarifying the class boundaries. We verify the effectiveness of the proposed method through a simulation experiment using benchmark data.
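The pairwise constraints described in the abstract can be illustrated with a minimal sketch. It assumes a squared Mahalanobis distance d_A(x, y) = (x - y)^T A (x - y) with a positive semidefinite metric matrix A, an upper threshold u for same-class pairs and a lower threshold l for different-class pairs. The function names, the thresholds and the toy data are hypothetical; the sketch only checks which pairs violate the constraints and does not implement the ITML optimisation or the pair-selection procedure proposed in the paper.

```python
import numpy as np


def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y) under metric matrix A."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff @ A @ diff)


def violated_pairs(X, labels, A, u, l):
    """Return index pairs (i, j) that break the pairwise distance constraints.

    Same-class pairs are expected to satisfy d_A <= u,
    different-class pairs are expected to satisfy d_A >= l.
    (u and l are illustrative thresholds, not values from the paper.)
    """
    violations = []
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            d = mahalanobis_sq(X[i], X[j], A)
            same_class = labels[i] == labels[j]
            if same_class and d > u:
                violations.append((i, j))
            elif not same_class and d < l:
                violations.append((i, j))
    return violations


# Toy data: three 2-D points, the first two in class 0, the third in class 1.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
labels = [0, 0, 1]
A = np.eye(2)  # identity matrix, i.e. plain squared Euclidean distance

result = violated_pairs(X, labels, A, u=0.5, l=5.0)
print(result)
```

With the identity matrix as A, the distance reduces to the squared Euclidean distance; a learned metric matrix would be substituted for A to see which constraints remain violated after learning.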
Keywords: automatic classification; distance metric learning; Mahalanobis distance; information-theoretic metric learning; ITML.
Asian Journal of Management Science and Applications, 2017 Vol.3 No.1, pp.61 - 74
Received: 02 May 2016
Accepted: 28 Nov 2016
Published online: 08 Apr 2017