Title: ADBSCSL: adaptive DBSCAN-SMOTE with cost-sensitive learning to enhance diagnostic accuracy for imbalanced medical datasets
Authors: M. Kavitha; M. Kasthuri
Addresses: Department of Computer Science, Bishop Heber College, Tiruchirappalli – 620017, Tamil Nadu, India; Affiliated to: Bharathidasan University, India ' Department of Computer Science, Bishop Heber College, Tiruchirappalli – 620017, Tamil Nadu, India; Affiliated to: Bharathidasan University, India
Abstract: Medical diagnosis is complicated by imbalanced datasets, on which biased models fail to identify minority class cases such as rare diseases. To improve diagnostic accuracy, this research introduces ADBSCSL (adaptive DBSCAN-SMOTE with cost-sensitive learning), which combines adaptive DBSCAN, SMOTE, and cost-sensitive learning to handle skewed data. Adaptive DBSCAN clusters the minority class instances and adjusts its parameters to changes in dataset density, so that varying density conditions do not cause minority class instances to be misidentified. SMOTE is then applied to these clusters to generate synthetic examples and improve class balance. Cost-sensitive learning is used to reduce misclassification costs, which pushes the model to give priority to the minority class and avoids majority class bias. The approach was evaluated on brain stroke, cerebral stroke, and autism spectrum disorder (ASD) datasets. ADBSCSL achieved F1-scores of 91.8% and 90.6% and accuracy above 90% on the brain stroke and cerebral stroke datasets. On the ASD datasets, it achieved 100% accuracy, precision, recall, and F1-score. The results show that ADBSCSL improves classification performance, making it a powerful and efficient tool for medical diagnosis with highly imbalanced datasets.
Keywords: imbalanced datasets; DBSCAN; SMOTE; cost-sensitive learning; machine learning; ML; diagnostic accuracy; imbalanced medical datasets.
DOI: 10.1504/IJCSE.2025.148742
International Journal of Computational Science and Engineering, 2025 Vol.28 No.5, pp.554 - 570
Received: 07 Dec 2023
Accepted: 19 Nov 2024
Published online: 22 Sep 2025
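
The three stages named in the abstract (density-adaptive clustering of the minority class, SMOTE inside each cluster, and cost-sensitive decisions) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the abstract does not say how the DBSCAN parameters are adapted, so the median k-th-neighbour distance used for eps is an assumed heuristic; the oversampling step interpolates between random same-cluster pairs rather than using the k-nearest-neighbour selection of standard SMOTE; and the cost-based threshold is the textbook rule for zero-cost correct decisions, which may differ from the paper's cost-sensitive learner. All function names and the toy data are hypothetical.

```python
import math
import random
from statistics import median


def adaptive_eps(points, k=2):
    """Assumed heuristic: eps = median distance to the k-th nearest neighbour.
    Requires at least two points."""
    kth = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth.append(d[min(k, len(d)) - 1])
    return median(kth)


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 marking noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # The point itself is included, as in the usual min_pts convention.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # only core points expand the cluster
                queue.extend(nb)
    return labels


def cluster_smote(points, labels, n_new, rng):
    """SMOTE-style interpolation restricted to pairs within one cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:  # noise points are never used as seeds
            clusters.setdefault(lab, []).append(p)
    usable = [c for c in clusters.values() if len(c) >= 2]
    synthetic = []
    while usable and len(synthetic) < n_new:
        a, b = rng.sample(rng.choice(usable), 2)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Predict the minority class when P(minority) >= this threshold."""
    return cost_fp / (cost_fp + cost_fn)


# Toy minority class: two dense groups and one isolated point.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
            (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0),
            (30.0, 30.0)]
rng = random.Random(0)
eps = adaptive_eps(minority, k=2)
labels = dbscan(minority, eps, min_pts=3)
synthetic = cluster_smote(minority, labels, n_new=10, rng=rng)
threshold = cost_sensitive_threshold(cost_fp=1.0, cost_fn=9.0)
```

On this toy data the isolated point is labelled as noise, so no synthetic sample is drawn towards it, and every synthetic point stays inside the bounding box of one of the two clusters. Making a missed minority case nine times as costly as a false alarm lowers the decision threshold from 0.5 to 0.1, which is one simple way to shift a classifier towards the minority class.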