Title: Towards a real-time big data analytics platform for health applications
Authors: Dillon Chrimes; Mu-Hsing Kuo; Belaid Moa; Wei Hu
Vancouver Island Health Authority, University of Victoria, Victoria, BC, Canada
School of Health Information Science, University of Victoria, BC, Canada; School of Medicine, Stanford University, CA, USA
Compute Canada/WestGrid, University of Victoria, BC, Canada
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu, China
Abstract: We established a framework for a big data analytics (BDA) platform using real volumes of health big data. An existing high-performance computing (HPC) architecture was utilised with HBase (a NoSQL database) and Hadoop (HDFS). The generated NoSQL database was emulated from the metadata and inpatient profiles of the Vancouver Island Health Authority's hospital system. Special adjustments to the Hadoop ecosystem and HBase, including the addition of 'salt buckets' for ingestion, were required. Results revealed that HBase took a week to generate ∼10 TB of data for one billion records via ingestion. Hadoop ingestion into HBase took only three seconds. Both simple and complex queries took less than two seconds, and all queries produced accurate patient data results. We discuss the data migration performance requirements of our BDA platform, which can capture large volumes of data while reducing data retrieval times, and their linkages to innovative processes and configurations that met patient data security/privacy standards.
Keywords: big data analytics; BDA; data mining; healthcare technology; high performance computing; HPC; patient data; simulation; HBase; Hadoop; data retrieval; data security; data privacy; privacy protection; privacy preservation.
Int. J. of Big Data Intelligence, 2017 Vol.4, No.2, pp.61 - 80
Submission date: 16 Feb 2016
Date of acceptance: 22 Apr 2016
Available online: 21 Mar 2017