Title: A hybrid algorithm for mining local outliers in categorical data

Authors: Meiling Liu; Mingxuan Huang; Weidong Tang

Addresses: College of Software and Information Security, Guangxi University for Nationalities, Nanning 530006, China; Science Computing and Intelligent Information Processing of Guangxi Higher Education Key Laboratory, Nanning 530023, China | College of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003, China | College of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006, China

Abstract: Outlier detection is an important task in data mining, and many approaches have been developed for it. However, most research focuses on global outlier detection, although in many situations local outlier detection is more valuable. This paper first reviews existing outlier detection methods, then gives a definition of a local outlier along with the associated formulas. It then proposes a hybrid algorithm for mining local outliers that combines a clustering algorithm with the statistical standard deviation. The algorithm calculates the standard deviation of each cluster and the local outlier factor of each object in the cluster: clusters with a higher standard deviation may contain outliers, and objects with a higher local outlier factor can be recognised as outliers. Experimental results on real datasets show that the proposed algorithm is correct and effective for mining local outliers.
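
The two-stage idea described in the abstract (screen clusters by standard deviation, then rank objects within them by an outlier score) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not give the formulas, so the clustering step is assumed to have already been done, and the per-object score used here (the average within-cluster rarity of an object's attribute values) is a hypothetical stand-in for the paper's local outlier factor. The function names `rarity_scores` and `find_local_outliers` and both thresholds are likewise assumptions made for illustration.

```python
from collections import Counter
from statistics import pstdev


def rarity_scores(cluster):
    """Score each object by how rare its categorical values are in its cluster.

    Hypothetical stand-in for a local outlier factor: for each attribute,
    take 1 minus the relative frequency of the object's value inside the
    cluster, then average over attributes.
    """
    n = len(cluster)
    n_attrs = len(cluster[0])
    # Value counts per attribute, computed inside this cluster only.
    counts = [Counter(row[j] for row in cluster) for j in range(n_attrs)]
    return [
        sum(1 - counts[j][row[j]] / n for j in range(n_attrs)) / n_attrs
        for row in cluster
    ]


def find_local_outliers(clusters, std_threshold, score_threshold):
    """Return (cluster_id, object_index) pairs flagged as candidates.

    Stage 1: skip clusters whose score standard deviation is small.
    Stage 2: inside the remaining clusters, flag high-scoring objects.
    Both thresholds are illustrative assumptions.
    """
    flagged = []
    for cid, cluster in clusters.items():
        scores = rarity_scores(cluster)
        if pstdev(scores) <= std_threshold:
            continue  # homogeneous cluster, unlikely to hide outliers
        flagged.extend(
            (cid, i) for i, s in enumerate(scores) if s > score_threshold
        )
    return flagged


# Toy data: clusters are assumed to come from a prior clustering step.
clusters = {
    "a": [("red", "small")] * 5 + [("blue", "large")],
    "b": [("green", "medium")] * 4,
}
result = find_local_outliers(clusters, std_threshold=0.1, score_threshold=0.5)
print(result)
```

In this toy input, cluster "b" is perfectly homogeneous and is skipped in the first stage, while the single ("blue", "large") object in cluster "a" is the only one flagged. Screening clusters first means the per-object comparison only runs where there is enough internal variation to suggest an outlier.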

Keywords: local outlier; standard deviation; local outlier factor; clustering; data mining.

DOI: 10.1504/IJWMC.2017.087342

International Journal of Wireless and Mobile Computing, 2017 Vol.13 No.1, pp.78 - 85

Received: 05 Aug 2016
Accepted: 16 Feb 2017

Published online: 03 Oct 2017
