Title: Detection of abusive text on online social networks using ensemble clustering algorithm with heuristic strategy
Authors: C. Ravisheker; Manmohan Sharma; Santosh Kumar Henge
Addresses: School of Computer Applications, Lovely Professional University, Punjab, 144411, India; School of Computer Applications, Lovely Professional University, Punjab, India; Koneru Lakshmaiah Education Foundation, KL University, Green Fields, Vaddeswaram, Andhra Pradesh, 522302, India
Abstract: This research explores the detection of hate speech and abusive text on social media platforms using a clustering algorithm. The input text data first undergo pre-processing, which includes stemming, blank space removal, stop word removal, and punctuation removal. Features are then extracted from the pre-processed text using term frequency-inverse document frequency (TFIDF) and Bag-of N-grams. The extracted features are passed to an optimal feature selection phase, in which a new fitness-assisted random number-based rock hyraxes swarm optimisation (FR-RHSO) selects the optimal features. Finally, the optimal features are given to an ensemble clustering (EC) phase that combines K-means clustering (KMC), K-medoids clustering, and optimal clustering to produce the final outcome. In the experimental analysis, the accuracy and precision of the designed abusive text detection model reach 98% with respect to the optimal feature size. The recommended method detects abusive text effectively and with reduced redundancy, which makes the framework reliable, and it also provides accurate results for the fake text detection model.
Keywords: abusive text detection; hate speech detection; term frequency-inverse document frequency; Bag-of N-grams; fitness-assisted random number-based rock hyraxes swarm optimisation; optimised ensemble clustering; K-means clustering; KMC; K-medoids clustering; optimal clustering.
DOI: 10.1504/IJCSE.2025.144822
International Journal of Computational Science and Engineering, 2025 Vol.28 No.2, pp.185 - 203
Received: 22 Sep 2022
Accepted: 25 Feb 2023
Published online: 03 Mar 2025
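
The first stages named in the abstract (pre-processing, TFIDF over a bag of n-grams, and K-means clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the crude suffix stemmer, the smoothed-free idf formula ln(N/df) + 1, the deterministic centroid initialisation, and all function names are assumptions made for illustration. The FR-RHSO feature selection and the K-medoids and optimal clustering members of the ensemble are not specified in the abstract and are deliberately left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "and", "of", "to", "so"}


def preprocess(text):
    """Lowercase, strip punctuation, drop stop words, apply a crude suffix stemmer."""
    tokens = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    stems = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        # Hypothetical stand-in for a real stemmer: strip one common suffix.
        for suffix in ("ing", "ed", "s"):
            if tok.endswith(suffix) and len(tok) > len(suffix) + 2:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems


def ngrams(tokens, n_max=2):
    """Bag of word n-grams for n = 1..n_max, unigrams first."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf(docs):
    """Return (vocabulary, dense TF-IDF vectors) with idf = ln(N / df) + 1."""
    bags = [Counter(ngrams(preprocess(d))) for d in docs]
    vocab = sorted({term for bag in bags for term in bag})
    df = Counter(term for bag in bags for term in bag)  # document frequency
    n_docs = len(docs)
    vectors = []
    for bag in bags:
        total = sum(bag.values()) or 1  # avoid division by zero on empty docs
        vectors.append([(bag[t] / total) * (math.log(n_docs / df[t]) + 1.0)
                        for t in vocab])
    return vocab, vectors


def kmeans(vectors, k, iters=20):
    """Plain K-means; the first k vectors serve as initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda c: sum((x - y) ** 2
                                        for x, y in zip(v, centroids[c])))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    docs = ["You are so annoying!!!", "Have a nice day",
            "Stop being annoying", "What a nice sunny day"]
    vocab, X = tfidf(docs)
    print(kmeans(X, k=2))
```

In a fuller pipeline, a feature selection step would reduce the columns of the TF-IDF matrix before clustering, and the labels from several clusterers would be combined into one ensemble decision.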