Title: Automatic selection of lexical features for detecting Alzheimer's disease using bag-of-words model and genetic algorithm

Authors: Gang Lyu; Aimei Dong

Addresses: Changshu Institute of Technology, Suzhou, Jiangsu, China; Qilu University of Technology, Jinan, Shandong, China

Abstract: Early detection of Alzheimer's disease is key to treatment. Neuropsychological testing is non-invasive and low-cost, but its reliance on manually selected features and expert diagnosis has limited its wider adoption. This paper proposes an approach for automatically extracting and selecting features from texts. First, it uses the bag-of-words model from natural language processing to extract all the vocabulary features in the texts. Second, instead of selecting features manually by t-test, it uses a genetic algorithm to select lexical features automatically. We tested the new approach on the DementiaBank database. Its classification accuracy for Alzheimer's disease is 79%, close to the best value achieved by methods based on hand-crafted features. The new approach also processes data quickly and automatically, which could help clinicians in their work.
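
Illustration: the pipeline outlined in the abstract (bag-of-words extraction, then genetic-algorithm selection of lexical features scored by a classifier) might be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The toy sentences, the leave-one-out fitness function, the truncation selection scheme, and every parameter value (pop_size, generations, mutation_rate) are invented for illustration. The choice of a multinomial Naive Bayes classifier is suggested only by the keywords, and nothing here uses DementiaBank data.

```python
import math
import random
from collections import Counter

def bag_of_words(texts):
    """Return (vocabulary, count vectors) for whitespace-tokenised texts."""
    tokenised = [t.lower().split() for t in texts]
    vocab = sorted({w for toks in tokenised for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for toks in tokenised:
        row = [0] * len(vocab)
        for word, count in Counter(toks).items():
            row[index[word]] = count
        vectors.append(row)
    return vocab, vectors

def nb_predict(train_X, train_y, x, mask):
    """Multinomial Naive Bayes (Laplace smoothing) over the masked-in features."""
    feats = [i for i, keep in enumerate(mask) if keep]
    best_label, best_score = None, -math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        totals = [sum(r[i] for r in rows) for i in feats]
        denom = sum(totals) + len(feats)
        score = math.log(len(rows) / len(train_y))  # log prior
        for total, i in zip(totals, feats):
            score += x[i] * math.log((total + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def fitness(mask, X, y):
    """Leave-one-out accuracy using only the selected features (assumed fitness)."""
    if not any(mask):
        return 0.0
    hits = 0
    for k in range(len(X)):
        train_X, train_y = X[:k] + X[k + 1:], y[:k] + y[k + 1:]
        hits += nb_predict(train_X, train_y, X[k], mask) == y[k]
    return hits / len(X)

def select_features(X, y, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Simple genetic algorithm over binary feature masks (hypothetical settings)."""
    rng = random.Random(seed)
    n = len(X[0])
    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, X, y), reverse=True)
        parents = ranked[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: fitness(m, X, y))

# Toy, invented sentences (label 0 = control-like, 1 = impaired-like).
texts = [
    "the boy takes the cookie from the jar",
    "the mother washes dishes at the sink",
    "the girl reaches for the cookie",
    "water overflows from the sink onto the floor",
    "uh the thing is um there",
    "um he is uh doing that thing",
    "uh that um thing there",
    "she um is uh there",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

vocab, X = bag_of_words(texts)
best_mask = select_features(X, labels)
selected_words = [w for w, keep in zip(vocab, best_mask) if keep]
print("selected features:", selected_words)
print("leave-one-out accuracy:", fitness(best_mask, X, labels))
```

Each individual in the population is a binary mask over the vocabulary, so the genetic algorithm searches subsets of words directly rather than ranking words one at a time, as a per-feature t-test would. On a corpus of realistic size, the fitness evaluation would normally use held-out or cross-validated data rather than this small leave-one-out loop.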

Keywords: bag-of-words model; genetic algorithm; hyperparameter; machine learning; Naïve Bayes algorithm; Alzheimer's disease.

DOI: 10.1504/IJCAT.2019.103290

International Journal of Computer Applications in Technology, 2019 Vol.61 No.4, pp.306 - 311

Available online: 24 Oct 2019
