Title: Large-scale spectral clustering for managing big data in healthcare operations
Authors: Maoqing Liu; Nasser Fard; Keivan Sadeghzadeh
Samsung Neurologica Corporation, Danvers, MA 01923, USA
Department of Mechanical and Industrial Engineering, Northeastern University, Boston, MA 02115, USA
Sloan School of Management, MIT, Cambridge, MA 02142, USA
Abstract: Healthcare industries have access to a large volume and variety of data about patients' behaviours, diseases, and treatments. There is a significant need for a data-driven system that discovers patterns in these data to better understand the impact of human risk behaviours on numerous diseases. To discover and extract useful knowledge and patterns from large amounts of data, a data mining process for discovering knowledge from raw, unprocessed healthcare data is studied. Methods for the analysis of big data, and the role and types of clustering methods, are presented. An in-depth analysis of the spectral clustering method as a superior clustering algorithm for big healthcare data is presented. The spectral clustering algorithm is applied to a large healthcare data set from the behavioural risk factor surveillance system (BRFSS), partitioning the untrained data into at least four clusters. The MATLAB® R2011b programming environment is utilised as a calculation tool in the experimental design and analysis. The experimental results, their analysis, the implementation process, and the data processing are discussed. A sensitivity analysis of both parameters of the spectral clustering algorithm is performed to determine their influence on the clustering results.
Keywords: big data; healthcare; spectral clustering; visualisation.
Int. J. of Big Data Intelligence, 2017 Vol.4, No.3, pp.195-207
Date of acceptance: 17 Sep 2016
Available online: 28 Jul 2017