A partition based method for finding highly correlated pairs
Online publication date: Thu, 30-Sep-2010
by Shuxin Li, Sheau-Dong Lang
International Journal of Data Mining, Modelling and Management (IJDMMM), Vol. 2, No. 4, 2010
Abstract: The problem of finding highly correlated pairs is to output all item pairs whose (Pearson) correlation coefficients are greater than a user-specified correlation threshold. Effective discovery of such item pairs is of primary importance in many real data mining applications. Recently, the Taper algorithm was developed to discover the set of highly correlated item pairs. In this paper, we present a generalised Taper algorithm that finds strongly correlated item pairs by partitioning the collection of transactions into different segments, so as to achieve a better pruning effect and a shorter running time. Consequently, it can be proved that both the naive algorithm and the Taper algorithm are special cases of our new algorithm with respect to the number of segments. Experimental results on real datasets demonstrate the feasibility and superiority of our algorithm.
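To make the problem statement concrete, the following is a minimal Python sketch of the naive all-pairs search for binary (transaction) data, with an optional support-based pruning step of the kind associated with Taper. It is not the partition-based algorithm of the paper, whose details are not given in the abstract. The function names (`phi`, `upper_bound`, `correlated_pairs`) and the toy data are hypothetical, and the pruning bound is an assumption taken from the general Taper literature rather than from this page.

```python
from itertools import combinations
from math import sqrt


def phi(n, na, nb, nab):
    """Pearson (phi) correlation of two binary items, computed from counts.

    n: number of transactions; na, nb: transactions containing each item;
    nab: transactions containing both items.
    """
    denom = sqrt(na * nb * (n - na) * (n - nb))
    if denom == 0:
        return 0.0  # item occurs in every transaction: treat as uncorrelated
    return (n * nab - na * nb) / denom


def upper_bound(n, na, nb):
    """Largest phi the pair could reach given only the two supports.

    Assumption: follows from nab <= min(na, nb).
    """
    hi, lo = max(na, nb), min(na, nb)
    if lo == 0 or hi == n:
        return 0.0
    return sqrt(lo / hi) * sqrt((n - hi) / (n - lo))


def correlated_pairs(transactions, threshold, prune=True):
    """Return (item_a, item_b, phi) for every pair with phi > threshold."""
    n = len(transactions)
    # Map each item to the set of transaction ids that contain it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            tidsets.setdefault(item, set()).add(tid)

    result = []
    for a, b in combinations(sorted(tidsets), 2):
        na, nb = len(tidsets[a]), len(tidsets[b])
        # Skip the exact computation when the bound already rules the pair out.
        if prune and upper_bound(n, na, nb) < threshold:
            continue
        nab = len(tidsets[a] & tidsets[b])
        r = phi(n, na, nb, nab)
        if r > threshold:
            result.append((a, b, r))
    return result


# Hypothetical toy dataset: each inner list is one transaction.
toy = [["a", "b"], ["a", "b"], ["a", "b", "c"], ["c"], ["c", "d"], ["d"]]
pairs = correlated_pairs(toy, threshold=0.5)
```

With `prune=False` the function reduces to the naive algorithm, which evaluates every pair exactly. With `prune=True` it returns the same pairs but skips those whose supports alone show they cannot exceed the threshold.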