Authors: Duong Tuan Anh; Nguyen Van Nhat
Addresses: Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, 268 Ly Thuong Kiet, Dist. 10, Ho Chi Minh City, Vietnam ' Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology, 268 Ly Thuong Kiet, Dist. 10, Ho Chi Minh City, Vietnam
Abstract: The extended motif discovery (EMD) algorithm, developed by Tanaka et al. (2005), is a well-known time series motif discovery algorithm which is based on minimum description length principle. One unique feature of the EMD algorithm is that it can determine the suitable length of the motif and hence does not require the length of the motif as a parameter supplied by user. Another interesting feature of the EMD is the lengths of each instances of a motif can be a bit different from each other and hence Tanaka et al. suggested that dynamic time warping (DTW) distance should be used to calculate the distances between the motif instances in this case. Due to the second feature, the EMD algorithm leads to a high computational complexity and is not easy to implement in practice. In this paper, we present an efficient implementation of the EMD algorithm in which we apply homothetic transformation to convert all pattern instances of different lengths into the same length so that we can easily calculate Euclidean distances between them. This modification accelerates the execution of the EMD remarkably and makes it easier to implement. Experimental results on seven real world time series datasets demonstrate the effectiveness of our EMD implementation method in time series motif discovery.
Keywords: time series; extended motif discovery; EMD algorithm; homothetic transformation; Euclidean distances.
International Journal of Data Mining, Modelling and Management, 2016 Vol.8 No.2, pp.180 - 194
Received: 14 Feb 2014
Accepted: 13 Aug 2014
Published online: 21 Jun 2016 *