Title: Analysis and prediction of breast cancer through feature selection and classification techniques

Authors: E. Sivasankar; A. Sathish Kumar; J. Sanjivi; P. Balasubramanian

Addresses: Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, Tamil Nadu, India (all four authors)

Abstract: Medical research is advancing rapidly, and datasets that link patients' symptoms to their corresponding diseases are now readily available to the public through the internet. This paper contributes to that research by applying data mining and machine learning techniques to breast cancer prediction, using a breast cancer dataset from the UCI repository. The dimensionality of the dataset was significantly reduced through feature selection, using both filter-based and wrapper-based techniques. Several classification algorithms, namely Naive Bayes, support vector machines, logistic regression, decision tree and a boosting algorithm, were then applied and their accuracies were compared. The boosting algorithm achieved better accuracy than the base classifiers.
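
The difference between the filter-based and wrapper-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins rather than the UCI dataset, a simple nearest-centroid classifier replaces the classifiers named above, and every function name (`make_toy_data`, `filter_select`, `wrapper_select`, `centroid_accuracy`) is hypothetical. The filter ranks features by a statistic computed without any classifier, while the wrapper scores candidate feature subsets by the accuracy of a classifier trained on them.

```python
import random
from statistics import mean, pstdev


def make_toy_data(n=600, seed=0):
    """Synthetic two-class data (an assumption, NOT the UCI dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([
            rng.gauss(2.0 * label, 1.0),  # feature 0: informative
            rng.gauss(1.5 * label, 1.0),  # feature 1: informative
            rng.gauss(0.0, 1.0),          # feature 2: noise
            rng.gauss(0.0, 1.0),          # feature 3: noise
        ])
        y.append(label)
    return X, y


def filter_select(X, y, k):
    """Filter: keep the k features with the largest |Pearson r| to the label."""
    def abs_corr(j):
        col = [row[j] for row in X]
        mx, my = mean(col), mean(y)
        cov = mean((a - mx) * (b - my) for a, b in zip(col, y))
        denom = pstdev(col) * pstdev(y)
        return abs(cov / denom) if denom else 0.0

    ranked = sorted(range(len(X[0])), key=abs_corr, reverse=True)
    return sorted(ranked[:k])


def centroid_accuracy(X_tr, y_tr, X_ev, y_ev, feats):
    """Accuracy of a nearest-centroid classifier that only sees `feats`."""
    centroids = {}
    for c in sorted(set(y_tr)):
        rows = [r for r, lab in zip(X_tr, y_tr) if lab == c]
        centroids[c] = [mean(r[j] for r in rows) for j in feats]

    def predict(row):
        return min(centroids, key=lambda c: sum(
            (row[j] - m) ** 2 for j, m in zip(feats, centroids[c])))

    hits = sum(predict(r) == t for r, t in zip(X_ev, y_ev))
    return hits / len(y_ev)


def wrapper_select(X_tr, y_tr, X_val, y_val, k):
    """Wrapper: greedy forward selection scored by validation accuracy."""
    chosen = []
    while len(chosen) < k:
        remaining = [j for j in range(len(X_tr[0])) if j not in chosen]
        best = max(remaining, key=lambda j: centroid_accuracy(
            X_tr, y_tr, X_val, y_val, chosen + [j]))
        chosen.append(best)
    return sorted(chosen)


X, y = make_toy_data()
# Train / validation / test split; the wrapper never sees the test rows.
X_tr, y_tr = X[:300], y[:300]
X_val, y_val = X[300:450], y[300:450]
X_te, y_te = X[450:], y[450:]

filter_feats = filter_select(X_tr, y_tr, k=2)
wrapper_feats = wrapper_select(X_tr, y_tr, X_val, y_val, k=2)
acc_filter = centroid_accuracy(X_tr, y_tr, X_te, y_te, filter_feats)
acc_wrapper = centroid_accuracy(X_tr, y_tr, X_te, y_te, wrapper_feats)
print("filter:", filter_feats, acc_filter)
print("wrapper:", wrapper_feats, acc_wrapper)
```

The wrapper is more expensive because it retrains the classifier for every candidate subset, which is one reason the two approaches are often combined: a cheap filter first, then a wrapper on the reduced feature set.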

Keywords: breast cancer; biomarkers for breast cancer; feature selection; classification techniques; machine learning; data mining; predictive modelling.

DOI: 10.1504/IJMEI.2021.117731

International Journal of Medical Engineering and Informatics, 2021 Vol.13 No.5, pp.359 - 375

Received: 29 Mar 2019
Accepted: 20 Nov 2019

Published online: 23 Sep 2021
