Title: Machine learning prediction of chronic diabetes based on a person's demography and lifestyle information

Authors: Asish Satpathy; Satyajit Behari

Addresses: Department of Information Systems, W. P. Carey School of Business, Arizona State University, Tempe, AZ 85287, USA; Univar Solutions Inc., Downers Grove, IL 60515, USA

Abstract: Chronic diseases such as diabetes are prevalent globally and are responsible for many deaths each year. Treatments for such diseases also account for high healthcare costs. However, research has shown that diabetes can be proactively managed and prevented while lowering healthcare costs. We mined a sample of 360° insight data on ten million customers, representing the state of Texas, USA, with attributes current as of late 2018. The sample, obtained from a market research data vendor, has over 1000 attributes per customer, covering behavioural, demographic, and lifestyle information and, in some cases, self-reported chronic conditions such as diabetes or hypertension. In this study, we developed a classification model that predicts chronic diabetes with an accuracy of 80%. In addition, we demonstrate a use case in which a large volume of customers' 360° data helps to predict, and hence proactively prevent and manage, a person's chronic diabetes. The terms 'customer' and 'person' are used interchangeably throughout the paper.

Keywords: data mining in health care; classification analysis with lifestyle and demographic data; customers' 360° insights; data mining for predicting diabetes.

DOI: 10.1504/IJDS.2022.127705

International Journal of Data Science, 2022 Vol.7 No.3, pp.210 - 228

Received: 09 Jun 2022
Accepted: 24 Jun 2022

Published online: 14 Dec 2022
