Title: Extracted information quality, a comparative study in high and low dimensions

Authors: Leandro Ariza-Jiménez; Luisa F. Villa; Nicolás Pinel; O. Lucia Quintero

Addresses: Mathematical Modelling Research Group, Universidad EAFIT, Carrera 49 No. 7Sur – 50, Medellín, Colombia; System Engineering Research Group, ARKADIUS, Universidad de Medellín, Carrera 87 No. 30 – 65, Medellín, Colombia; Biodiversity, Evolution and Conservation Research Group, Universidad EAFIT, Carrera 49 No. 7Sur – 50, Medellín, Colombia; Mathematical Modelling Research Group, Universidad EAFIT, Carrera 49 No. 7Sur – 50, Medellín, Colombia

Abstract: Uncovering interesting groups in either multidimensional or network spaces has become an essential mechanism for data exploration and understanding. Decision making requires relevant information as well as high quality in the conclusions retrieved. We presented a comparative study of two compact representations drawn from the same set of data objects by clustering high-dimensional spaces and low-dimensional Barnes-Hut t-stochastic neighbour embeddings (BH-SNE). There is no consensus on how the problem should be addressed or on how these representations/models should be analysed, because they rest on different notions. We introduced a measure to compare their results and their capability to provide insights into the information retrieved. We considered low-dimensional embeddings as a potentially revealing strategy for uncovering dynamics that may remain hidden in big-data spaces. We demonstrated that a non-guided approach can be as revealing as a user-guided approach for data exploration, and presented coherent results showing good uncertainty modelling capability in terms of fuzziness and densities.

Keywords: high-dimensional clustering; BH-SNE embeddings; cluster fuzziness; reliable information; decision making; consistency.

DOI: 10.1504/IJBIDM.2021.117102

International Journal of Business Intelligence and Data Mining, 2021 Vol.19 No.2, pp.214-241

Received: 25 Feb 2019
Accepted: 08 Jul 2019

Published online: 21 Jul 2021
