Title: A clustering and TreeMap-based approach for query reuse and visualisation in large data repositories

Authors: Yousra Harb; Surendra Sarnikar; Omar El-Gayar

Addresses: Yarmouk University, Irbid 21163, Jordan; California State University East Bay, Hayward, CA 94542, USA; Dakota State University, Madison, SD 57042, USA

Abstract: The main objective of this paper is to develop a system to support data exploration tasks over large data repositories. To leverage the large amounts of data being generated, decision makers first need to explore the available data and understand its potential for helping with decision problems. In this paper, we present a query clustering and TreeMap approach that supports data exploration tasks through knowledge reuse, multiple data navigation paths and an easy-to-use point-and-click interface. We demonstrate the viability of the approach by building a prototype data exploration interface for health data from the Behavioural Risk Factor Surveillance System (BRFSS). We evaluate the effectiveness of the artefact using cognitive walkthroughs and a user study. The results indicate that the proposed system is easier to use and reduces user effort for data exploration tasks when compared to a baseline faceted information system.

Keywords: query clustering; query reuse; query visualisation; query exploration; information retrieval; TreeMap.

DOI: 10.1504/IJBIDM.2021.118187

International Journal of Business Intelligence and Data Mining, 2021, Vol. 19, No. 3, pp. 267-290

Received: 06 Mar 2019
Accepted: 29 Jul 2019

Published online: 29 Sep 2021

