Authors: Abhirup Chakraborty, Ajit Singh
Addresses: Department of Electrical and Computer Engineering, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, N2L 3G1, Canada (both authors)
Abstract: We consider the problem of computing exact results for sliding window joins over data streams with limited memory. Existing approaches handle memory limitations by shedding load, and therefore cannot provide exact, or even highly accurate, results for sliding window joins over data streams with time-varying arrival rates. We propose an exact window join (EWJ) algorithm that uses disk storage as an archive. The algorithm periodically spills window data to disk, refines the output by retrieving the disk-resident data, and maximises the output rate by employing techniques to manage the memory blocks. The problem of managing the window blocks in memory, which is similar in nature to caching, captures both the temporal and frequency-related properties of the stream arrivals. We provide experimental results demonstrating the performance and effectiveness of the proposed algorithm.
Keywords: data streams; join processing; sliding windows; performance; disk storage; memory limitations; exact window join; archive; memory blocks.
International Journal of Intelligent Information and Database Systems, 2010 Vol.4 No.5, pp.462 - 486
Received: 25 May 2009
Accepted: 02 Feb 2010
Published online: 03 Oct 2010