Title: Patient reviews analysis using machine learning

Authors: Bijayalaxmi Panda; Chhabi Rani Panigrahi; Bibudhendu Pati

Addresses: Department of Computer Science, Rama Devi Women's University, Bhubaneswar, India; Department of Computer Science, Rama Devi Women's University, Bhubaneswar, India; Department of Computer Science, Rama Devi Women's University, Bhubaneswar, India

Abstract: Health-related tweets are now available on several online communication forums. Doctors and patients share their views in discussion forums, which helps other people seeking similar information. In this work, the authors investigate unstructured patient reviews on different diseases, collected from different forums. The dataset, obtained from Figshare, supports identifying several features in the patient-provided text and converting them into numerical form. Sentiment analysis (SA) was applied to the dataset to identify positive and negative tweets. The bag-of-words model was then used for disease detection. Specific features were selected from the dataset, and machine learning classification algorithms such as support vector machine (SVM), Gaussian naive Bayes (NB), and random forest (RF) were applied to classify them. Finally, classifier performance was measured in terms of precision, recall, F1-score, support, and accuracy. The experimental results show that RF achieved a higher accuracy (98%) than SVM and Gaussian NB.
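To make the bag-of-words step and the reported measures concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the toy reviews, the labels, and the function names `build_vocabulary`, `bag_of_words`, and `per_class_metrics` are all assumptions made for illustration, and the SVM, Gaussian NB, and RF classifiers from the paper are not reproduced here. The sketch only shows how text can become count vectors and how precision, recall, F1-score, support, and accuracy are derived from true and predicted labels.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted order)."""
    tokens = {tok for doc in documents for tok in doc.lower().split()}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def bag_of_words(document, vocabulary):
    """Turn one document into a vector of token counts over the vocabulary."""
    counts = Counter(document.lower().split())
    # The vocabulary dict was built in sorted order, so iteration is stable.
    return [counts.get(tok, 0) for tok in vocabulary]


def per_class_metrics(y_true, y_pred):
    """Return per-class precision, recall, F1 and support, plus accuracy."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            # Support is the number of true instances of the class.
            "support": tp + fn,
        }
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return report, accuracy


# Hypothetical toy reviews (invented, not taken from the Figshare dataset).
reviews = [
    "severe headache and nausea",
    "headache relieved after medication",
    "persistent cough and fever",
]
vocabulary = build_vocabulary(reviews)
vectors = [bag_of_words(review, vocabulary) for review in reviews]

# Hypothetical true and predicted sentiment labels for five reviews.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neg", "pos"]
report, accuracy = per_class_metrics(y_true, y_pred)
```

Note that support is a count of true instances per class rather than a quality score, which is why it is usually read alongside precision, recall, and F1-score instead of being compared across classifiers.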

Keywords: classification; support vector machine; SVM; Gaussian naive Bayes; random forest; feature selection.

DOI: 10.1504/IJCSE.2023.129742

International Journal of Computational Science and Engineering, 2023 Vol.26 No.2, pp.111 - 117

Received: 01 Sep 2021
Accepted: 14 Jan 2022

Published online: 22 Mar 2023
