Title: Detection of abusive text on online social networks using ensemble clustering algorithm with heuristic strategy

Authors: C. Ravisheker; Manmohan Sharma; Santosh Kumar Henge

Addresses: School of Computer Applications, Lovely Professional University, Punjab, 144411, India; School of Computer Applications, Lovely Professional University, Punjab, India; Koneru Lakshmaiah Education Foundation, KL University, Green Fields, Vaddeswaram, Andhra Pradesh, 522302, India

Abstract: This research explores the detection of hate speech and abusive text on social media platforms using a clustering algorithm. First, the input text is pre-processed through stemming, blank space removal, stop word removal, and punctuation removal. The pre-processed text is then passed to the feature extraction phase, which uses term frequency-inverse document frequency (TF-IDF) and Bag-of N-grams. Next, the extracted features are passed to the optimal feature selection phase, where the optimal features are obtained using a new fitness-assisted random number-based rock hyraxes swarm optimisation (FR-RHSO). Finally, the optimal features are given to the ensemble clustering (EC) phase, which combines K-means clustering (KMC), K-medoids clustering, and optimal clustering to produce the final results. The investigation indicates that the recommended method reduces redundancy in abusive text, which makes the framework reliable. In the experimental analysis, the accuracy and precision of the designed abusive text detection model reach 98% with respect to the optimal feature size. Overall, abusive text detection is performed effectively with reduced redundancy, and the approach also provides accurate results for the fake text detection model.
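
The pre-processing and feature extraction phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the `crude_stem` suffix stripper (a stand-in for a real stemmer), the function names, the unigram-plus-bigram range, and the unsmoothed `tf * log(N / df)` weighting are all assumptions made for illustration. The FR-RHSO feature selection and the ensemble clustering phases are not reproduced here.

```python
import math
import string
from collections import Counter

# Assumption: a tiny illustrative stop-word list; the abstract does not specify one.
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "to", "of", "and"}


def crude_stem(token):
    """Very rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, strip punctuation and extra blank space, drop stop words, stem."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()  # split() also collapses repeated whitespace
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n_max=2):
    """Bag of n-grams: every contiguous run of 1..n_max tokens."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=2):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    bags = [Counter(ngrams(preprocess(d), n_max)) for d in docs]
    # Iterating a Counter yields each distinct term once, so this counts documents.
    df = Counter(term for bag in bags for term in bag)
    n_docs = len(docs)
    vectors = []
    for bag in bags:
        total = sum(bag.values()) or 1  # guard against empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in bag.items()}
        )
    return vectors
```

With this unsmoothed weighting, a term that occurs in every document receives a weight of zero, so only terms that distinguish documents contribute to the feature vectors passed on to later phases.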

Keywords: abusive text detection; hate speech detection; term frequency-inverse document frequency; Bag-of N-grams; fitness-assisted random number-based rock hyraxes swarm optimisation; optimised ensemble clustering; K-means clustering; KMC; K-medoids clustering; optimal clustering.

DOI: 10.1504/IJCSE.2025.144822

International Journal of Computational Science and Engineering, 2025 Vol.28 No.2, pp.185 - 203

Received: 22 Sep 2022
Accepted: 25 Feb 2023

Published online: 03 Mar 2025
