Authors: Dillon Chrimes; Mu-Hsing Kuo; Belaid Moa; Wei Hu
Addresses: Vancouver Island Health Authority, University of Victoria, Victoria, BC, Canada ' School of Health Information Science, University of Victoria, BC, Canada; School of Medicine, Stanford University, CA, USA ' Compute Canada/WestGrid, University of Victoria, BC, Canada ' State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu, China
Abstract: We established a framework to form a big data analytics (BDA) platform using real volumes of health big data. Existing high-performance computing (HPC) architecture was utilised with HBase (a NoSQL database) and Hadoop (HDFS). The generated NoSQL database was emulated from the metadata and inpatient profiles of Vancouver Island Health Authority's hospital system. Special adjustments to Hadoop's ecosystem and HBase, including the addition of 'salt buckets' for ingestion, were required. Results revealed that HBase took a week's time to generate ∼10 TB of data for one billion records via ingestion. Hadoop ingestion into HBase took only three seconds. Both simple and complex queries completed in less than two seconds, and all queries produced accurate patient data results. We discuss the data migration performance requirements of our BDA platform, which can capture large volumes of data while reducing data retrieval times, and their linkages to innovative processes and configurations that met patient data security/privacy standards.
Keywords: big data analytics; BDA; data mining; healthcare technology; high performance computing; HPC; patient data; simulation; HBase; Hadoop; data retrieval; data security; data privacy; privacy protection; privacy preservation.
International Journal of Big Data Intelligence, 2017 Vol.4 No.2, pp.61 - 80
Received: 16 Feb 2016
Accepted: 22 Apr 2016
Published online: 21 Mar 2017