Title: Predicting glioma grades: integrating clinical and molecular data with machine learning and explainable AI

Authors: Goksu Tuysuzoglu; Özge Kart Tokmak

Addresses: Department of Computer Engineering, Dokuz Eylul University, Izmir, 35390, Turkey; Department of Computer Engineering, Dokuz Eylul University, Izmir, 35390, Turkey

Abstract: Glioma grading is a critical task in neuro-oncology that informs treatment planning and prognosis. This study investigates how well machine learning models predict glioma grade from clinical and molecular data, and how data balancing and feature selection improve that performance. It also introduces a probabilistic multi-view learning model (P-MWM) that predicts glioma grade from clinical and molecular features. To improve model interpretability, the Shapley additive explanations (SHAP) method is used to analyse and interpret the contribution of each feature to the predicted grade. The study's contributions are the development of the P-MWM model, feature selection with the ANOVA F-test, handling of imbalanced data with SMOTE and SMOTE-Tomek Links, and improved model interpretability through SHAP. The proposed P-MWM model improved overall performance, most notably for the decision tree (DT) model, which reached an accuracy of 86.8918%. The individual logistic regression (LR) model combined with feature selection and data balancing outperformed the other experimental settings, achieving 87.8442% accuracy.
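
The ANOVA F-test feature selection named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `anova_f` and `rank_features` and the toy data are hypothetical, and the code only shows how a one-way ANOVA F statistic might be used to rank features by how well they separate the grade classes.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for a single feature.

    `groups` holds the feature's values split by class label.
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand = sum(sum(g) for g in groups) / n
    # Variation of the class means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the samples around their own class mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        # Degenerate case: no within-class spread (assumed handling)
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(X, y):
    """Return (feature indices by descending F score, list of F scores)."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, label in zip(X, y) if label == c]
                  for c in classes]
        scores.append(anova_f(groups))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores

# Toy data (made up): feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 4.0], [3.0, 6.0],
     [7.0, 5.0], [8.0, 4.0], [9.0, 6.0]]
y = [0, 0, 0, 1, 1, 1]
order, scores = rank_features(X, y)
print(order, scores)
```

A selector built this way would keep the top-ranked features before balancing and model fitting; how many features the study retained is not stated in the abstract.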

Keywords: glioma grading; model interpretability; machine learning; feature selection; imbalanced data handling.

DOI: 10.1504/IJIEI.2024.142421

International Journal of Intelligent Engineering Informatics, 2024 Vol.12 No.4, pp.513 - 541

Received: 28 Dec 2023
Accepted: 04 Jun 2024

Published online: 30 Oct 2024
