Title: A new process for healthcare big data warehouse integration

Authors: Nouha Arfaoui

Addresses: National Engineering School, Gabes, Tunisia

Abstract: The healthcare domain generates huge amounts of data from different, heterogeneous clinical data sources and devices, and this data is needed to manage hospital performance well. Because of the quantity and structural complexity of the data, we use a big healthcare data warehouse (BHDW), first for storage and later for decision making. To this end, we propose a new process for handling this type of data. It starts by unifying the different data, then extracts it, loads it into the big healthcare data warehouse and finally performs the necessary transformations. For the first step, we use an ontology, which is the best solution to the problem of data source heterogeneity. We also use Hadoop and its ecosystem, including Hive, MapReduce and HDFS, to accelerate processing through parallelism. The process exploits the performance of ELT to ensure 'schema-on-read', where the data is stored before the transformation tasks are performed.
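The order of steps described in the abstract (unify, extract, load, then transform on read) can be illustrated with a minimal sketch. This is not the author's implementation and does not use Hadoop, Hive or HDFS: it assumes a plain in-memory list as a stand-in for the warehouse storage and a hypothetical dictionary (`UNIFIED_TERMS`) as a stand-in for the ontology. All field names and sample records are invented for illustration.

```python
import json

# Hypothetical mapping from source-specific field names to shared terms.
# A stand-in for the ontology; the real one is not described in the abstract.
UNIFIED_TERMS = {
    "pat_id": "patient_id",
    "patientId": "patient_id",
    "temp_c": "temperature_c",
    "bodyTemp": "temperature_c",
}


def unify(record):
    """Rename source-specific keys to the shared vocabulary."""
    return {UNIFIED_TERMS.get(key, key): value for key, value in record.items()}


def load_raw(store, records):
    """Load step: append unified records as JSON lines, values left untyped."""
    for record in records:
        store.append(json.dumps(unify(record)))


def read_with_schema(store, schema):
    """Transform step: apply types only when the data is read (schema-on-read)."""
    for line in store:
        raw = json.loads(line)
        yield {field: cast(raw[field]) for field, cast in schema.items() if field in raw}


# Two invented sources that name and type the same facts differently.
source_a = [{"pat_id": "17", "temp_c": "38.2"}]
source_b = [{"patientId": 42, "bodyTemp": 36.9}]

store = []  # in-memory stand-in for files in a distributed file system
load_raw(store, source_a)
load_raw(store, source_b)

# The schema is supplied at query time, not at load time.
schema = {"patient_id": int, "temperature_c": float}
rows = list(read_with_schema(store, schema))
print(rows)
```

The point of the sketch is the ordering: values are stored as they arrive (strings from one source, numbers from the other), and typing happens only in `read_with_schema`. In an ETL pipeline the casts would instead run before `load_raw`.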

Keywords: big healthcare data warehouse; BHDW; Hive; Hadoop; MapReduce; ontology; big data; ELT; ETL.

DOI: 10.1504/IJDMMM.2023.132974

International Journal of Data Mining, Modelling and Management, 2023 Vol.15 No.3, pp.240 - 254

Received: 02 Apr 2022
Accepted: 22 Oct 2022

Published online: 22 Aug 2023
