Authors: Sai Li; Sirui Huang; Ya Zhou
Addresses: Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, 541004, China ' University of Liverpool, Liverpool L69 3BX, UK ' Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, 541004, China
Abstract: In toxic comment text classification, the minority classes are often surrounded by the majority classes. With traditional classification methods, accuracy on the majority class is high while accuracy on the minority classes is low. Furthermore, traditional classification methods cannot detect toxic comments that belong to the minority classes. This paper therefore proposes a modified toxic behaviour detection model, named AS-BL, which combines an improved SMOTE algorithm with bi-LSTM. First, the dataset is pre-processed and features are extracted. Then, the improved SMOTE (AD-SMOTE) algorithm increases the number of minority-class comment samples at the data level, using KNN to calculate the average sampling density. Finally, the text vectors are fed into the trained bi-LSTM model for detection. The experimental results show that the proposed model outperforms other existing models in classification accuracy and improves overall detection, indicating that the model is suitable for real network environments.
Keywords: toxic comment; behaviour detection; SMOTE algorithm; bi-LSTM neural network.
International Journal of Intelligent Internet of Things Computing, 2020 Vol.1 No.2, pp.114 - 128
Received: 20 Nov 2019
Accepted: 13 Dec 2019
Published online: 25 Sep 2020