Title: Augmentation of predictive competence of non-small cell lung cancer datasets through feature pre-processing techniques

Authors: M. Sumalatha; Latha Parthiban

Addresses: Department of Computer Science, Periyar University, Salem, Tamil Nadu, India ' Department of Computer Science, Pondicherry University Community College, Puducherry, India

Abstract: Non-small cell lung cancer (NSCLC) comprised of complex hidden and unknown data that is challenging for prediction at the earlier stage. The major objective of the research paper is to develop a novel preprocessing model based on minimisation of features and competency maximisation through feature pre-processing (FPP) to provide augmentation in predictive competence of NSCLC datasets. In Phase-I, the test for relevancy identified behavioural errors like null, empty and NAN values to reduce two features. In Phase-II, regression analysis was performed to find the relationship between features after which four features were removed. In Phase-III, cluster analysis is carried out to find the irrelevant features in the form of clusters and seven features are removed. The competency of NSCLC dataset before FPP showed more accuracy than after FPP with classifiers like simple tree, complex tree, linear SVM, Gaussian SVM, weighted KNN and boosted tree classifiers.

Keywords: non-small cell lung cancer; NSCLC; competency of prediction; relevancy analysis; regression analysis; cluster analysis; feature pre-processing model; feature pre-processing; FPP; competency analytics.

DOI: 10.1504/IJESMS.2023.129985

International Journal of Engineering Systems Modelling and Simulation, 2023 Vol.14 No.2, pp.86 - 100

Received: 02 Aug 2021
Accepted: 17 Nov 2021

Published online: 04 Apr 2023 *

Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article