Title: Predicting glioma grades: integrating clinical and molecular data with machine learning and explainable AI
Authors: Goksu Tuysuzoglu; Özge Kart Tokmak
Addresses: Department of Computer Engineering, Dokuz Eylul University, Izmir, 35390, Turkey; Department of Computer Engineering, Dokuz Eylul University, Izmir, 35390, Turkey
Abstract: Glioma grading is a critical task in neuro-oncology that guides treatment planning and prognosis. This study investigates how well machine learning models predict glioma grade from clinical and molecular data, and how data balancing and feature selection methods can improve that performance. A probabilistic multi-view learning model (P-MWM) is also introduced to predict glioma grade from clinical and molecular features. To improve model interpretability, the Shapley additive explanations (SHAP) method is used to analyse and interpret the contribution of each feature to the predicted grade. The study's contributions are the development of the P-MWM model, feature selection with the ANOVA F-test, handling of imbalanced data with SMOTE and SMOTE-Tomek Links, and improved model interpretability through SHAP. The proposed P-MWM model enhanced overall model performance, most notably for the decision tree (DT) model, culminating in an accuracy of 86.8918%. The individual logistic regression (LR) model combined with feature selection and data balancing techniques outperformed the other experimental settings, achieving 87.8442% accuracy.
Keywords: glioma grading; model interpretability; machine learning; feature selection; imbalanced data handling.
DOI: 10.1504/IJIEI.2024.142421
International Journal of Intelligent Engineering Informatics, 2024 Vol.12 No.4, pp.513 - 541
Received: 28 Dec 2023
Accepted: 04 Jun 2024
Published online: 30 Oct 2024