Title: Global-local term fusion for optimised community Q&A topic modelling
Authors: Sneh Prabha; Neetu Sardana
Addresses: Department of Computer Science and Engineering and Information Technology, Jaypee Institute of Information Technology, Noida, India; Department of Computer Science and Engineering and Information Technology, Jaypee Institute of Information Technology, Noida, India
Abstract: Community question and answer (Q&A) websites have become invaluable sources of information and shared knowledge. Effective topic modelling on these platforms is crucial for organising and navigating the vast amount of user-generated content. To this end, we propose global-local term fusion with optimised community (GLOCOM) Q&A topic modelling, a novel approach that leverages both local and global term importance to enhance topic modelling on community Q&A websites. GLOCOM combines term frequency-inverse document frequency (TF-IDF) for local importance with entropy for global importance. We then employ fuzzy clustering to better represent multifaceted topics, and optimise the clustering results with a genetic algorithm (GA) that refines cluster assignments and centroids. We compared the proposed model with the baseline models latent Dirichlet allocation (LDA) and FLSA. GLOCOM performed consistently well across all topic numbers, improving the silhouette score by 8.86% over LDA, and excelled on datasets larger than 3 MB.
Keywords: topic modelling; online community websites; Stack Overflow; latent Dirichlet allocation; LDA; fuzzy C-means; FCM; genetic algorithm; GA.
DOI: 10.1504/IJICA.2024.143406
International Journal of Innovative Computing and Applications, 2024 Vol.15 No.1, pp.50 - 69
Received: 29 Apr 2024
Accepted: 02 Oct 2024
Published online: 17 Dec 2024
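
The term weighting and soft clustering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the local and global scores are fused, so the element-wise product below is an assumption, as is the use of the standard log-entropy global weight. The toy `docs`, the function names, and the choice of initial centroids are hypothetical, and the genetic algorithm refinement is omitted.

```python
import numpy as np

def term_counts(docs):
    """Build a documents x terms count matrix from tokenised documents."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: j for j, t in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for t in d:
            counts[i, index[t]] += 1
    return counts, vocab

def tfidf(counts):
    """Local importance: raw term frequency times log(N / document frequency)."""
    n_docs = counts.shape[0]
    df = (counts > 0).sum(axis=0)
    return counts * np.log(n_docs / df)

def entropy_weight(counts):
    """Global importance (assumed form): 1 + sum_i p_ij log p_ij / log N.

    A term concentrated in one document scores 1; a term spread evenly
    over all documents scores 0.
    """
    n_docs = counts.shape[0]
    p = counts / counts.sum(axis=0)
    safe_p = np.where(p > 0, p, 1.0)  # log(1) = 0, so zero cells contribute nothing
    return 1.0 + (p * np.log(safe_p)).sum(axis=0) / np.log(n_docs)

def fuzzy_memberships(X, centroids, m=2.0):
    """One fuzzy C-means membership update with fuzzifier m > 1."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    dist = np.maximum(dist, 1e-12)  # avoid division by zero at a centroid
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Hypothetical toy corpus standing in for Q&A posts.
docs = [
    ["python", "list", "sort"],
    ["python", "dict", "key"],
    ["java", "list", "array"],
]
counts, vocab = term_counts(docs)
global_w = entropy_weight(counts)

# Assumed fusion rule: scale each TF-IDF column by its global entropy weight.
fused = tfidf(counts) * global_w

# Soft topic memberships, using the first two documents as initial centroids.
U = fuzzy_memberships(fused, fused[:2])
```

Each row of `U` gives one document's degree of membership in every cluster, which is how fuzzy clustering lets a post belong to more than one topic. In a fuller pipeline, the centroids and memberships would be updated iteratively and then refined by a search procedure such as a GA.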