Title: Enhancing Malayalam question classification in question answering systems: a comparative study of SVM, KNN, and multinomial NB
Authors: P.A. Bibin; R. Ravisekhar; P. Babu Anto
Addresses: Department of Computer Science, St Pius X College, Rajapuram, Kasaragod, Kerala, 671532, India; Department of Management, Jain University, Bangalore, Karnataka, 560069, India; Department of Information Technology, Kannur University, Kannur, Kerala, 670567, India
Abstract: Question classification, the task of analysing questions and assigning them to predefined categories, has gained momentum with the growth of online activity, prompting interest in automating it. This study develops a machine learning-based model for classifying question types in a Malayalam question answering system (QAS). The dataset is first systematically preprocessed and features are extracted, after which it is partitioned into training and testing sets. Three machine learning algorithms, support vector machine (SVM), multinomial Naïve Bayes (MNB), and K-nearest neighbour (KNN), are implemented and optimised over various hyper-parameters. Performance is assessed using accuracy, precision, recall, F1-score, and confusion matrices. The results indicate that the SVM classifier achieves the highest accuracy among the models tested. The research underscores the effectiveness of machine learning techniques for automating question classification in diverse linguistic contexts such as Malayalam, facilitating more efficient question answering systems.
Keywords: Malayalam question answering; MNB; multinomial Naïve Bayes; SVM; support vector machine; KNN; K-nearest neighbour; question classification; machine learning; TF-IDF; term frequency-inverse document frequency; n-gram.
DOI: 10.1504/IJAACS.2025.148533
International Journal of Autonomous and Adaptive Communications Systems, 2025 Vol.18 No.4, pp.357 - 379
Received: 05 Jan 2024
Accepted: 25 Jun 2024
Published online: 11 Sep 2025
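
The abstract and keywords name TF-IDF features over n-grams and a K-nearest neighbour classifier, among other models. As a rough illustration only, the sketch below wires those two pieces together in plain Python. Everything specific in it is an assumption rather than the authors' setup: the character trigram size, the smoothed IDF formula, cosine similarity, `k=3`, the function names, and the tiny English placeholder questions standing in for a Malayalam dataset. The SVM and multinomial Naïve Bayes models from the study are not shown.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a lower-cased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def fit_idf(docs, n=3):
    """Smoothed inverse document frequency for every n-gram seen in docs."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(char_ngrams(doc, n)))
    total = len(docs)
    return {g: math.log((1 + total) / (1 + c)) + 1.0 for g, c in doc_freq.items()}

def tfidf_vector(text, idf, n=3):
    """Sparse, L2-normalised TF-IDF vector (n-gram -> weight); unseen n-grams are dropped."""
    tf = Counter(g for g in char_ngrams(text, n) if g in idf)
    vec = {g: count * idf[g] for g, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}

def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(w * b.get(g, 0.0) for g, w in a.items())

def knn_predict(text, train_vecs, train_labels, idf, k=3, n=3):
    """Majority label among the k training questions most similar to text."""
    query = tfidf_vector(text, idf, n)
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]),
                    reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Made-up placeholder questions and labels (NOT data from the paper).
TRAIN = [
    ("who wrote this novel", "PERSON"),
    ("who is the chief minister", "PERSON"),
    ("who discovered the river", "PERSON"),
    ("where is the capital city", "LOCATION"),
    ("where is the temple located", "LOCATION"),
    ("where does the river start", "LOCATION"),
    ("when was the state formed", "TIME"),
    ("when did the festival begin", "TIME"),
    ("when is the next election", "TIME"),
]
train_texts = [q for q, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_texts)
train_vecs = [tfidf_vector(q, idf) for q in train_texts]

if __name__ == "__main__":
    for question in ["where is the fort", "when is the festival"]:
        print(question, "->", knn_predict(question, train_vecs, train_labels, idf))
```

Character n-grams are used here because they need no tokeniser and work on any Unicode script, so the same functions accept Malayalam strings unchanged. Whether the study used character or word n-grams is not stated in the abstract.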