Title: A new process for healthcare big data warehouse integration
Authors: Nouha Arfaoui
Addresses: National Engineering School, Gabes, Tunisia
Abstract: The healthcare domain generates huge amounts of data from different, heterogeneous clinical data sources using different devices, to ensure good management of hospital performance. Because of the quantity and structural complexity of the data, we use a big healthcare data warehouse, first for storage and later for decision making. To achieve this goal, we propose a new process that deals with this type of data. It starts by unifying the different data, then extracts it, loads it into the big healthcare data warehouse, and finally performs the necessary transformations. For the first step, an ontology is used, as it is the best solution to the problem of data source heterogeneity. We also use Hadoop and its ecosystem, including Hive, MapReduce and HDFS, to accelerate processing through parallelism, and we exploit ELT to ensure 'schema-on-read', where the data is stored before the transformation tasks are performed.
Keywords: big healthcare data warehouse; BHDW; Hive; Hadoop; MapReduce; ontology; big data; ELT; ETL.
DOI: 10.1504/IJDMMM.2023.132974
International Journal of Data Mining, Modelling and Management, 2023 Vol.15 No.3, pp.240 - 254
Received: 02 Apr 2022
Accepted: 22 Oct 2022
Published online: 22 Aug 2023