Title: XGBoost-based prediction modelling and analysis for health literacy assessment

Authors: Yan Hong; Xiaoda Zhang; Jinxiang Chen

Addresses: School of Nursing, Inner Mongolia University for Nationalities, Tongliao, 028000, China; Micron Intelligent Manufacturing Systems Science and Technology (Beijing) Co., Ltd., Beijing, 100086, China; Micron Intelligent Manufacturing Systems Science and Technology Co., Ltd., Beijing, 100082, China

Abstract: This paper investigates big data analysis and XGBoost modelling for health literacy prediction, offering a new approach to health literacy assessment. 750 residents of Tongliao, Inner Mongolia, China were asked to answer 68 questions across three health literacy questionnaires. A big dataset of 742 samples, each with 68 features, was first constructed. Based on this dataset, BPNN and XGBoost prediction models were established. The R² score obtained by the XGBoost model is 0.97553, which is higher than that obtained by BPNN. The relative error rates and absolute errors of the XGBoost predictions are below 5% and 10 points, respectively. The XGBoost model is therefore more effective than BPNN for predicting people's health literacy. XGBoost is also used to calculate the influence of each feature on a resident's health literacy score, which helps analyse how each question influences the health literacy assessment.
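
The three evaluation measures named in the abstract (R² score, relative error rate, absolute error) can be illustrated with a minimal sketch. The abstract does not give the formulas, so the standard definitions are assumed here; the function names and the score values are hypothetical and are not taken from the paper's dataset.

```python
from typing import List, Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition, assumed)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def absolute_errors(y_true: Sequence[float], y_pred: Sequence[float]) -> List[float]:
    """Per-sample absolute error, in score points."""
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


def relative_error_rates(y_true: Sequence[float], y_pred: Sequence[float]) -> List[float]:
    """Per-sample relative error |true - predicted| / |true| (assumed definition)."""
    return [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]


# Hypothetical health literacy scores and model predictions (illustration only).
y_true = [60.0, 75.0, 90.0, 45.0]
y_pred = [62.0, 73.0, 91.0, 44.0]

r2 = r2_score(y_true, y_pred)
abs_err = absolute_errors(y_true, y_pred)
rel_err = relative_error_rates(y_true, y_pred)

# Check the two thresholds quoted in the abstract against the toy data.
within_5_percent = all(r < 0.05 for r in rel_err)
within_10_points = all(a < 10.0 for a in abs_err)

print(f"R2 = {r2:.5f}")
print(f"max relative error = {max(rel_err):.4f}, max absolute error = {max(abs_err):.1f}")
```

Under these assumed definitions, a model satisfies the abstract's criteria when every relative error rate is below 0.05 and every absolute error is below 10 points.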

Keywords: health literacy; prediction model; XGBoost; characteristics analysis; big dataset.

DOI: 10.1504/IJMIC.2021.123495

International Journal of Modelling, Identification and Control, 2021 Vol.39 No.3, pp.229 - 235

Received: 26 May 2021
Accepted: 27 Jun 2021

Published online: 23 Jun 2022