Title: Comorbidities and risk factors impact of COVID-19 in Mexico: a feature utility metrics approach
Authors: Eduardo Emmanuel Rodríguez López; Daniel Hernández González; Francisco Javier Álvarez Rodríguez; Julio Cesar Ponce Gallegos
Addresses: Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico; Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico; Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico; Universidad Autónoma de Aguascalientes, Aguascalientes, Mexico
Abstract: By applying machine learning to official data from the Mexican Secretary of Health, it is possible to determine the impact of the main comorbidities and risk factors associated with COVID-19. The analysis used three feature utility metrics, Mutual Information (MI), Permutation Importance (PI) and Partial Dependence Plots (PDP), with two learning models (RandomForest and XGBoost), and found similarities between the metrics. According to these models, the main comorbidities and risk factors associated with COVID-19 are Age, Gender, Obesity, Diabetes and Hypertension. For MI and PI (RandomForest), the main risk factor is Age, while for PI (XGBoost) it is Obesity. Finally, the PDP for Age shows that the associated probability of risk of COVID-19 infection increases considerably after 60 years of age. These results confirm that the main comorbidities and risk factors associated with COVID-19 in Mexico are consistent with the diseases and conditions most prevalent in the population.
Keywords: comorbidities; COVID-19 risk factors; mutual information; permutation importance; feature utility metrics.
DOI: 10.1504/IJDMB.2021.124109
International Journal of Data Mining and Bioinformatics, 2021, Vol. 26, No. 1/2, pp. 59-80
Received: 30 Nov 2021
Accepted: 28 Apr 2022
Published online: 13 Jul 2022
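
As an illustration of the Mutual Information (MI) metric named in the abstract, the following is a minimal sketch of a plug-in MI estimate for two discrete variables. The abstract does not say which estimator or library the authors used, so the function `mutual_information`, the variable names and the toy records below are all assumptions made for illustration; none of the values come from the paper's data.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y), in nats, for two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


# Toy, made-up binary records (hypothetical, NOT the paper's data).
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
over_60 = [1, 1, 1, 1, 0, 0, 0, 0]    # perfectly aligned with the outcome
unrelated = [1, 0, 1, 0, 1, 0, 1, 0]  # independent of the outcome

mi_over_60 = mutual_information(over_60, outcome)
mi_unrelated = mutual_information(unrelated, outcome)
print(f"MI(over_60; outcome)   = {mi_over_60:.3f} nats")
print(f"MI(unrelated; outcome) = {mi_unrelated:.3f} nats")
```

In this toy setting the feature that fully determines the outcome reaches the maximum MI for a balanced binary target (ln 2 nats), while the independent feature scores zero. Ranking features by such scores is the general idea behind using MI as a feature utility metric.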