Title: Auditing data streams for correlated glitches
Authors: Ji Meng Loh; Tamraparni Dasu
Addresses: New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA; AT&T Labs Research, 180 Park Avenue, Florham Park, NJ 07976, USA
Abstract: Cellular networks carry massive volumes of voice, text and data traffic every second. The networks are monitored constantly to measure network performance, detect traffic congestion, identify anomalies, and serve other customer service and network support functions. Data collected from mobility networks is used to make many critical decisions, and the quality of the information plays an important role in the effectiveness of these decisions. Therefore, it is important to ensure that the data collected from cellular networks meets quality standards. In particular, identifying glitches that are correlated can help in isolating root causes and facilitate more efficient problem solving in the network, as well as quicker data repairs. In this paper, we present a methodology for automated auditing of massive, complex data streams with a focus on correlated glitches, and a case study that illustrates the application of this methodology. The methodology has two main components: a set of logical constraints that embody domain-specific information, and statistical methods for identifying correlated glitches to enable automated quantitative cleaning of data. Together, the two components provide a comprehensive yet customisable set of criteria for evaluating information quality as a function of time and network topology.
Keywords: data quality; correlated glitches; automated detection; data stream mining; spatio-temporal analysis; hierarchical data; information quality; network monitoring; mobile networks; cellular networks; data collection; automated auditing; data stream auditing.
International Journal of Information Quality, 2013 Vol.3 No.2, pp.85 - 106
Received: 27 Mar 2012
Accepted: 08 Sep 2012
Published online: 26 Jul 2014