Title: A novel dynamic feature-based co-training approach for identification of complaint tweets

Authors: Pranali Yenkar; Sudhir D. Sawarkar

Addresses: Datta Meghe College of Engineering, Sector 3, Airoli, Navi Mumbai, India; Datta Meghe College of Engineering, Sector 3, Airoli, Navi Mumbai, India

Abstract: Sensing civic issues through social media analytics has gathered a lot of attention owing to social media's ready accessibility and instant reach. Experiences shared on Twitter help governments take timely, proactive decisions. However, separating complaint tweets from the large volume of other tweets is very challenging. Existing studies used supervised learning algorithms, but with manually created training data, which is very laborious to produce. We therefore propose a novel co-training semi-supervised algorithm that uses dynamic features to create a large amount of accurate training data from a limited number of labelled samples. In contrast to existing fixed, static features, the proposed dynamic features are recalculated from the initial and newly labelled data after each iteration of the algorithm, until all unlabelled data are labelled, and hence become more relevant for distinguishing complaints from non-complaints. Verified on tweets shared by citizens of Mumbai, the approach achieves 94% accuracy and a 93% F1 score, which shows that it is very promising.
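
The loop described in the abstract (label a little, recompute the features from the enlarged labelled pool, repeat until nothing is unlabelled) can be illustrated with a minimal sketch. The abstract does not say which views, classifiers or feature-scoring rule the authors use, so everything below is an assumption made for illustration: the two views (word unigrams and character trigrams), the smoothed log-odds weighting, and the names `select_features`, `score` and `co_train` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def words(text):
    """View 1 (assumed): lower-cased word unigrams."""
    return text.lower().split()


def char_trigrams(text):
    """View 2 (assumed): lower-cased character trigrams."""
    s = text.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]


def select_features(labelled, view, k):
    """Recompute the k most class-discriminative terms from the current pool.

    This is the 'dynamic' part: it is called again in every round, so the
    weights reflect both the seed labels and the newly added labels.
    """
    counts = {0: Counter(), 1: Counter()}
    totals = Counter()
    for text, label in labelled:
        counts[label].update(set(view(text)))
        totals[label] += 1
    weights = {}
    for term in set(counts[0]) | set(counts[1]):
        # Smoothed log-odds of the term occurring in a complaint (1)
        # versus a non-complaint (0).
        p1 = (counts[1][term] + 1) / (totals[1] + 2)
        p0 = (counts[0][term] + 1) / (totals[0] + 2)
        weights[term] = math.log(p1 / p0)
    # Ties are broken alphabetically so the selection is deterministic.
    top = sorted(weights, key=lambda t: (-abs(weights[t]), t))[:k]
    return {t: weights[t] for t in top}


def score(text, view, weights):
    """Positive leans complaint, negative leans non-complaint."""
    return sum(weights.get(t, 0.0) for t in set(view(text)))


def co_train(seed, unlabelled, views=(words, char_trigrams), k=50, per_round=1):
    """Label every unlabelled text; the views take turns extending one pool."""
    labelled = list(seed)
    pool = list(unlabelled)
    while pool:
        for view in views:
            if not pool:
                break
            weights = select_features(labelled, view, k)
            # Most confident predictions (largest absolute score) go first.
            ranked = sorted(pool, key=lambda t: abs(score(t, view, weights)),
                            reverse=True)
            for text in ranked[:per_round]:
                label = 1 if score(text, view, weights) > 0 else 0
                labelled.append((text, label))
                pool.remove(text)
    return labelled
```

Calling `co_train(seed, unlabelled)` with a list of `(text, label)` pairs and a list of raw texts returns the seed pairs followed by the newly labelled ones. Because both views append to one shared pool, a label assigned by one view influences the features the other view selects in its next turn, which is the co-training idea in its simplest form. A real system would likely replace the log-odds scorer with trained classifiers and add a confidence threshold.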

Keywords: Twitter; urban issues; co-training semi supervised learning; natural language processing.

DOI: 10.1504/IJBIS.2024.140433

International Journal of Business Information Systems, 2024 Vol.46 No.3, pp.289 - 309

Received: 09 Mar 2020
Accepted: 08 Mar 2021

Published online: 09 Aug 2024
