Title: Data preprocessing based on missing value and discretisation

Authors: Neeta Yadav; Neelendra Badal

Addresses: KNIT, Sultanpur, India; KNIT, Sultanpur, India

Abstract: Real-world data is rarely available in a form suitable for mining or information extraction. It is generally incomplete, inconsistent and dirty, so it must be processed carefully according to the requirements of the dataset. Preprocessing is one of the most crucial steps in data mining and also the most time-consuming, accounting for about 60% of the total time. Mining unprocessed data takes a long time, and the end user wastes much of it waiting for the desired result. Applying preprocessing techniques suited to the specific dataset therefore reduces the overall mining time, so the end user obtains the desired result faster. This paper addresses the preprocessing of missing values and discretisation. Missing values are handled by three techniques: deletion, replacement by the mean (average), and prediction. From these three, the user chooses the technique that gives the highest accuracy and takes the least preprocessing time. After the missing values are handled, discretisation is applied for data reduction, which minimises the preprocessing time.
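
The three missing-value techniques and the discretisation step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record layout, the column names `age` and `income`, the use of a least-squares line as the prediction method, and equal-width binning as the discretisation method are all assumptions made for illustration, since the abstract does not specify them.

```python
from statistics import mean


def drop_missing(rows, key):
    """Deletion: discard every record whose `key` value is missing (None)."""
    return [r for r in rows if r[key] is not None]


def fill_with_mean(rows, key):
    """Replacement by mean: fill gaps with the mean of the observed values."""
    observed_mean = mean(r[key] for r in rows if r[key] is not None)
    return [{**r, key: observed_mean if r[key] is None else r[key]} for r in rows]


def fill_with_prediction(rows, key, predictor):
    """Prediction (assumed here to be simple linear regression):
    fit a least-squares line on the complete records, then use it to
    estimate the missing `key` values from the `predictor` column."""
    complete = [r for r in rows if r[key] is not None]
    xs = [r[predictor] for r in complete]
    ys = [r[key] for r in complete]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [
        {**r, key: intercept + slope * r[predictor] if r[key] is None else r[key]}
        for r in rows
    ]


def equal_width_bins(values, n_bins):
    """Discretisation (assumed equal-width): map each value to a bin index
    in the range 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:  # all values identical: everything falls in bin 0
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical toy dataset with one missing income value.
rows = [
    {"age": 20, "income": 30.0},
    {"age": 30, "income": 40.0},
    {"age": 40, "income": None},
    {"age": 50, "income": 60.0},
]

deleted = drop_missing(rows, "income")
mean_filled = fill_with_mean(rows, "income")
predicted = fill_with_prediction(rows, "income", "age")
bins = equal_width_bins([30.0, 40.0, 50.0, 60.0], 3)
```

Running each strategy on the same dataset and comparing the accuracy of the downstream mining task and the preprocessing time is one way a user could choose among them, as the abstract describes.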

Keywords: deletion; discretisation; missing value; prediction; replacement by mean.

DOI: 10.1504/IJFSE.2020.110584

International Journal of Forensic Software Engineering, 2020 Vol.1 No.2/3, pp.193 - 214

Received: 13 Aug 2018
Accepted: 14 May 2019

Published online: 26 Oct 2020
