Authors: Khaled M. Fouad
Addresses: Information Systems Department, Faculty of Computers and Informatics, Benha University, Egypt
Abstract: Gastrointestinal and liver diseases (GILDs) are major causes of death and disability in the Middle East. Investigating upper gastrointestinal (GI) symptoms in areas with limited medical resources is a challenge. Analysis of real-world clinical data on upper GI diseases using data mining techniques often has to contend with missing values. This paper proposes an approach to missing data imputation for categorical data on upper GI diseases, so that feature selection and classification algorithms can be applied with accurate and effective results for diagnosing upper GI diseases. The approach is evaluated with an experimental framework of five phases: partitioning the dataset into eight datasets with various ratios of missing data, performing feature selection, imputing the missing data, classifying the imputed data, and finally evaluating the outcome using k-fold cross-validation on nine evaluation measures.
Keywords: data mining; data classification; feature selection; missing data imputation; categorical data mining; diagnosis of upper GI diseases.
International Journal of Advanced Intelligence Paradigms, 2023 Vol.24 No.3/4, pp.255 - 294
Received: 11 Nov 2017
Accepted: 07 Dec 2017
Published online: 01 Mar 2023
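
The evaluation framework outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not name the imputation, feature selection, or classification algorithms, so this example assumes most-frequent-value (mode) imputation and a 1-nearest-neighbour classifier with Hamming distance as stand-ins, omits feature selection, and reports accuracy only rather than nine measures. All function names (`mask_values`, `mode_impute`, `k_fold_splits`, `predict_1nn`, `cross_validated_accuracy`) are hypothetical.

```python
import random
from collections import Counter


def mask_values(rows, ratio, seed=0):
    """Return a copy of rows with roughly `ratio` of the cells set to None.

    Assumption: missing values are introduced uniformly at random.
    """
    rng = random.Random(seed)
    return [[None if rng.random() < ratio else v for v in row] for row in rows]


def mode_impute(rows):
    """Fill each None with the most frequent observed value in its column."""
    filled = [list(row) for row in rows]
    for col in range(len(rows[0])):
        observed = [row[col] for row in rows if row[col] is not None]
        if not observed:
            continue  # no observed value to copy; leave the column as-is
        mode = Counter(observed).most_common(1)[0][0]
        for row in filled:
            if row[col] is None:
                row[col] = mode
    return filled


def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def predict_1nn(train_x, train_y, x):
    """Return the label of the training row that differs from x in the fewest features."""
    def hamming(a, b):
        return sum(u != v for u, v in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: hamming(train_x[i], x))
    return train_y[best]


def cross_validated_accuracy(rows, labels, k=5):
    """Fraction of rows labelled correctly across k folds."""
    correct = 0
    for train, test in k_fold_splits(len(rows), k):
        train_x = [rows[i] for i in train]
        train_y = [labels[i] for i in train]
        correct += sum(predict_1nn(train_x, train_y, rows[i]) == labels[i] for i in test)
    return correct / len(rows)
```

A run over several missing-data ratios would call `mask_values` once per ratio, pass the result through `mode_impute`, and compare the resulting `cross_validated_accuracy` scores. Imputing before splitting follows the phase order given in the abstract; imputing inside each training fold would be the stricter choice.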