Title: Comorbidities and risk factors impact of COVID-19 in Mexico: a feature utility metrics approach

Authors: Eduardo Emmanuel Rodríguez López; Daniel Hernández González; Francisco Javier Álvarez Rodríguez; Julio Cesar Ponce Gallegos

Addresses: Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico; Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico; Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico; Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico

Abstract: By applying machine learning to official data from the Mexican Secretary of Health, it is possible to determine the impact of the main comorbidities and risk factors associated with COVID-19. The analysis used three feature utility metrics, Mutual Information (MI), Permutation Importance (PI) and Partial Dependence Plots (PDP), with two learning models (RandomForest and XGBoost), and found similarities between the metrics. According to these models, the main comorbidities and risk factors associated with COVID-19 are Age, Gender, Obesity, Diabetes and Hypertension. For MI and PI (RandomForest), the main risk factor is Age, while for PI (XGBoost) it is Obesity. Finally, the PDP for Age shows that the associated probability of COVID-19 infection increases considerably after 60 years of age. These results confirm that the main comorbidities and risk factors associated with COVID-19 in Mexico are consistent with the diseases and conditions most prevalent in the population.
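
As a rough illustration of the first metric named in the abstract, the sketch below computes Mutual Information between a binary feature and a binary outcome and ranks the features by score. It is a minimal sketch under stated assumptions, not the authors' pipeline: the records are synthetic toy values invented for this example, the names `age_over_60`, `unrelated_flag` and `outcome` are hypothetical, and the article's RandomForest and XGBoost models are not reproduced here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Synthetic toy records (NOT data from the study): 1 = present, 0 = absent.
age_over_60 = [1, 1, 1, 1, 0, 0, 0, 0]     # hypothetical feature
unrelated_flag = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical feature, independent of the outcome
outcome = [1, 1, 1, 0, 0, 0, 0, 1]         # hypothetical target

scores = {
    "age_over_60": mutual_information(age_over_60, outcome),
    "unrelated_flag": mutual_information(unrelated_flag, outcome),
}

# Features sorted from most to least informative about the outcome.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A feature that is statistically independent of the outcome scores zero, which is why MI can serve as a model-free first ranking before model-based metrics such as PI and PDP are applied.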

Keywords: comorbidities; COVID-19 risk factors; mutual information; permutation importance; feature utility metrics.

DOI: 10.1504/IJDMB.2021.124109

International Journal of Data Mining and Bioinformatics, 2021 Vol.26 No.1/2, pp.59 - 80

Received: 30 Nov 2021
Accepted: 28 Apr 2022

Published online: 13 Jul 2022
