Title: Automatic selection of lexical features for detecting Alzheimer's disease using bag-of-words model and genetic algorithm
Authors: Gang Lyu; Aimei Dong
Addresses: Changshu Institute of Technology, Suzhou, Jiangsu, China; Qilu University of Technology, Jinan, Shandong, China
Abstract: Early detection of Alzheimer's disease is key to treatment. Neuropsychological testing is non-invasive and low-cost, but its reliance on manually selected features and expert diagnosis has limited its adoption. This paper proposes an approach that automatically extracts and selects features from texts. First, it uses the bag-of-words model from natural language processing to extract all lexical features from the texts. Second, instead of selecting features manually with a t-test, it uses a genetic algorithm to select lexical features automatically. We tested the approach on the DementiaBank database. It classifies Alzheimer's disease with 79% accuracy, close to the best value achieved by methods based on hand-crafted features. The approach also processes data quickly and automatically, which can help clinicians in their work.
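The two steps named in the abstract (bag-of-words extraction, then genetic-algorithm feature selection) can be illustrated with a minimal sketch. It is not the authors' implementation. The whitespace tokeniser, the multinomial Naive Bayes classifier (suggested only by the keywords), the leave-one-out fitness function, the GA settings (`pop_size`, `generations`, `p_mut`), and the four toy sentences are all assumptions made for illustration; none of them come from the paper or from DementiaBank.

```python
import math
import random
from collections import Counter


def bag_of_words(texts):
    """Return (vocabulary, count vectors) for whitespace-tokenised texts."""
    tokenised = [t.lower().split() for t in texts]
    vocab = sorted({w for doc in tokenised for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for doc in tokenised:
        row = [0] * len(vocab)
        for word, count in Counter(doc).items():
            row[index[word]] = count
        vectors.append(row)
    return vocab, vectors


def nb_predict(train_x, train_y, x, mask):
    """Multinomial Naive Bayes (Laplace smoothing) on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    best_label, best_score = None, -math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        totals = [sum(r[j] for r in rows) for j in cols]
        denom = sum(totals) + len(cols)
        score = math.log(len(rows) / len(train_y))  # log prior
        for total, j in zip(totals, cols):
            score += x[j] * math.log((total + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def fitness(mask, xs, ys):
    """Leave-one-out accuracy using only the features selected by mask."""
    if not any(mask):
        return 0.0
    hits = 0
    for i in range(len(xs)):
        train_x, train_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        hits += nb_predict(train_x, train_y, xs[i], mask) == ys[i]
    return hits / len(xs)


def select_features(xs, ys, pop_size=20, generations=15, p_mut=0.05, seed=0):
    """Evolve a boolean feature mask; all settings here are assumed values."""
    rng = random.Random(seed)
    n = len(xs[0])
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, xs, ys), reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0]]                 # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=lambda m: fitness(m, xs, ys))


# Toy, made-up sentences and labels (not taken from DementiaBank).
texts = [
    "the boy is taking a cookie from the jar",
    "uh the uh thing is uh there",
    "the mother is washing dishes at the sink",
    "uh water uh is uh going",
]
labels = ["control", "ad", "control", "ad"]

vocab, vectors = bag_of_words(texts)
mask = select_features(vectors, labels)
selected = [w for w, keep in zip(vocab, mask) if keep]
score = fitness(mask, vectors, labels)
```

Here `selected` holds the words kept by the evolved mask and `score` is the leave-one-out accuracy on the toy data. A real experiment would replace the toy sentences with transcripts, use a proper tokeniser, and evaluate on held-out data rather than on the data used for selection.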
Keywords: bag-of-words model; genetic algorithm; hyperparameter; machine learning; Naïve Bayes algorithm; Alzheimer's disease.
International Journal of Computer Applications in Technology, 2019 Vol.61 No.4, pp.306 - 311
Received: 22 Jan 2019
Accepted: 02 Mar 2019
Published online: 24 Oct 2019