Authors: Mohammad Ali Kadampur; D.V.L.N. Somayajulu
Addresses: National Institute of Technology Warangal, Warangal 506004, India; National Institute of Technology Warangal, Warangal 506004, India
Abstract: Calculating the dissimilarity between two objects is one of the important knowledge-gathering methods in cognitive science. Many data mining algorithms use dissimilarity computation to cluster data and thereby reveal intra-relations, inter-relations, and outliers. The majority of these algorithms use Euclidean distance as the dissimilarity criterion. In this paper, signal transformation functions, with their orthogonality and energy compaction properties, are explored as a means of transforming the data. The data transformation scheme treats the entire data set as a single entity. The proposed scheme is designed so that it can also be used in non-Euclidean spaces by applying a distance mapping algorithm. Existing randomisation approaches to data transformation maintain only the distributions and do not preserve the Euclidean distance between records. The proposed methods are superior to the existing methods in terms of run-time complexity, O(n), and preservation of the distance between individual data points.
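The distance-preservation property that the abstract attributes to orthogonal signal transforms can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes an orthonormal Haar wavelet transform applied to each record separately (the paper treats the whole data set as one entity), and the function names and sample records are hypothetical.

```python
import math


def haar_step(v):
    """One level of the orthonormal Haar transform on an even-length list."""
    s = 1.0 / math.sqrt(2.0)
    averages = [(v[i] + v[i + 1]) * s for i in range(0, len(v), 2)]
    details = [(v[i] - v[i + 1]) * s for i in range(0, len(v), 2)]
    return averages + details


def haar_transform(v):
    """Full orthonormal Haar transform; len(v) must be a power of two."""
    n = len(v)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(v)
    while n > 1:
        # Re-transform only the running averages from the previous level.
        out[:n] = haar_step(out[:n])
        n //= 2
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_distance_error(original, perturbed):
    """Largest change in any pairwise distance after the transform."""
    worst = 0.0
    for i in range(len(original)):
        for j in range(i + 1, len(original)):
            before = euclidean(original[i], original[j])
            after = euclidean(perturbed[i], perturbed[j])
            worst = max(worst, abs(before - after))
    return worst


# Hypothetical sample records, for illustration only.
records = [
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 3.0, 5.0, 7.0],
    [9.0, 0.0, 2.0, 5.0],
]
transformed = [haar_transform(r) for r in records]
```

Because every Haar step is an orthonormal change of basis, `max_distance_error(records, transformed)` should differ from zero only by floating-point rounding, even though the stored values no longer match the originals. Any distance-based mining algorithm run on `transformed` would therefore see the same pairwise geometry as on `records`.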
Keywords: privacy preservation; privacy protection; data perturbation; wavelet transforms; data mining; data transformation; distance-based mining; signal transformation functions; Euclidean distance.
International Journal of Data Mining, Modelling and Management, 2014 Vol.6 No.3, pp.285 - 311
Received: 08 May 2021
Accepted: 12 May 2021
Published online: 15 Oct 2014