Title: Semi-supervised learning methods for large scale healthcare data analysis

Authors: Gang Zhang; Shan-Xing Ou; Yong-Hui Huang; Chun-Ru Wang

Addresses: School of Automation, Guangdong University of Technology, Guangzhou, China; Department of Radiology, Guangzhou General Hospital of Guangzhou Military Command, Guangzhou, China; School of Automation, Guangdong University of Technology, Guangzhou, China; School of Automation, Guangdong University of Technology, Guangzhou, China

Abstract: With the development of information technology in the healthcare industry, more and more data are generated and stored electronically. To fully exploit the information and knowledge they contain, data mining and machine learning methods have been developed and studied. We note that a large body of healthcare data lacks supervised information, because labelling or scoring the data so that they can be analysed in a data mining model requires expensive human effort. In this article, we address the problem of using unlabelled or unscored data, together with only a small amount of supervised data, to improve the performance of analysis models for healthcare decision making. This paradigm is called semi-supervised learning in the machine learning literature. We focus on semi-supervised kernel learning and propose applying the learned kernel in two algorithms: support vector machine (SVM) and kernel regularised least squares (KRLS). Evaluation results on two publicly available healthcare datasets illustrate the effectiveness of the proposed framework.
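The abstract does not specify how the kernel is learned, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes a graph-Laplacian "deformed kernel" construction from the manifold-regularisation literature as a stand-in for semi-supervised kernel learning, and then plugs the resulting Gram matrix into a transductive KRLS predictor. All function names, parameter values (gamma, k, strength, lam), and the synthetic two-cluster data are hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Base RBF Gram matrix over all points (labelled and unlabelled)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def knn_laplacian(X, k=5):
    """Unnormalised Laplacian of a symmetrised k-nearest-neighbour graph."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                # make the graph undirected
    return np.diag(W.sum(axis=1)) - W


def deformed_kernel(K, L, strength=1.0):
    """Assumed stand-in for kernel learning: K_ss = (I + s*K*L)^{-1} K.

    Unlabelled points enter only through the graph Laplacian L.
    """
    n = K.shape[0]
    K_ss = np.linalg.solve(np.eye(n) + strength * K @ L, K)
    return 0.5 * (K_ss + K_ss.T)          # remove numerical asymmetry


def krls_fit_predict(K, labeled_idx, y_labeled, lam=1e-2):
    """KRLS on the labelled block, then scores for every point."""
    K_ll = K[np.ix_(labeled_idx, labeled_idx)]
    alpha = np.linalg.solve(K_ll + lam * np.eye(len(labeled_idx)), y_labeled)
    return K[:, labeled_idx] @ alpha


def demo(seed=0):
    """Two synthetic clusters with only two labelled points per class."""
    rng = np.random.default_rng(seed)
    n_per = 40
    X = np.vstack([rng.normal([-2.0, 0.0], 0.5, (n_per, 2)),
                   rng.normal([2.0, 0.0], 0.5, (n_per, 2))])
    y = np.concatenate([-np.ones(n_per), np.ones(n_per)])
    labeled_idx = np.concatenate([
        rng.choice(n_per, 2, replace=False),
        n_per + rng.choice(n_per, 2, replace=False),
    ])
    K = rbf_kernel(X, gamma=0.5)
    K_ss = deformed_kernel(K, knn_laplacian(X, k=5), strength=1.0)
    scores = krls_fit_predict(K_ss, labeled_idx, y[labeled_idx])
    accuracy = float(np.mean(np.sign(scores) == y))
    return accuracy, K, K_ss


if __name__ == "__main__":
    accuracy, _, _ = demo()
    print(f"transductive accuracy: {accuracy:.3f}")
```

The same Gram matrix could in principle be handed to an SVM implementation that accepts precomputed kernels (for example scikit-learn's SVC with kernel="precomputed"), which is one way to read the abstract's pairing of a single learned kernel with both SVM and KRLS.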

Keywords: machine learning; semi-supervised learning; healthcare data analysis; support vector machines; SVM; kernel regularised least squares; KRLS; data mining.

DOI: 10.1504/IJCIH.2015.069788

International Journal of Computers in Healthcare, 2015 Vol.2 No.2, pp.98 - 110

Received: 20 May 2014
Accepted: 17 Feb 2015

Published online: 11 Jun 2015
