Title: Detecting the risk of COVID-19 spread in near real-time using social media

Authors: Mohammed Ahsan Raza Noori; Bharti Sharma; Ritika Mehra

Addresses: School of Computing, DIT University, Dehradun, 248009, Uttarakhand, India; School of Computing, DIT University, Dehradun, 248009, Uttarakhand, India; School of Computer Science and Engineering, Dev Bhoomi Uttarakhand University, Dehradun, 248007, Uttarakhand, India

Abstract: COVID-19 is a contagious disease caused by SARS-CoV-2. To limit its spread, the WHO recommended preventive measures such as social distancing, testing, lockdowns, and face masks. Failure to implement and monitor these measures increases the risk of spread and the mortality rate. This paper proposes a near real-time system that uses Twitter to detect the risk of COVID-19 spread. The system uses the Apache Spark framework for text mining, machine learning, and near real-time processing of Twitter data. Five base machine learning classifiers, namely support vector machine (SVM), logistic regression (LR), multilayer perceptron (MLP), decision tree (DT), and Naive Bayes (NB), are combined to form an ensemble majority voting classifier (EMVC). Results show that the EMVC achieved an accuracy of 94.76%. The proposed system was then tested in real time by detecting tweets related to the risk of COVID-19 spread in London, Mumbai, and New York in June 2020.
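
The majority voting idea behind the EMVC can be illustrated with a minimal Python sketch. This is not the authors' Spark implementation: the function name `majority_vote`, the labels `"risk"` and `"no_risk"`, and the tie-breaking rule are assumptions made purely for illustration. Each base classifier casts one vote per tweet, and the label with the most votes is returned.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label chosen by the most base classifiers.

    Assumption: ties are broken in favour of the label that appears
    first in `predictions` (the abstract does not state a tie rule).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    # Walk the votes in their original order so ties resolve deterministically.
    for label in predictions:
        if counts[label] == top:
            return label
    raise AssertionError("unreachable: some label always has the top count")


# Hypothetical votes from the five base classifiers for a single tweet.
votes = {
    "SVM": "risk",
    "LR": "risk",
    "MLP": "no_risk",
    "DT": "risk",
    "NB": "no_risk",
}
ensemble_label = majority_vote(list(votes.values()))
```

With five voters and two labels a tie cannot occur, so the tie rule only matters if the number of classifiers or labels changes.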

Keywords: COVID-19; coronavirus; risk detection; social media; Twitter; machine learning; ensemble learning; near real-time system; Apache Spark.

DOI: 10.1504/IJEM.2023.131940

International Journal of Emergency Management, 2023 Vol.18 No.2, pp.202 - 223

Received: 15 Feb 2021
Accepted: 10 Mar 2022

Published online: 05 Jul 2023
