Title: Feature selection in accident data: an analysis of its application in classification algorithms

Authors: Amrita Sarkar; G. Sahoo; U.C. Sahoo

Addresses: Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi-835215, India; Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi-835215, India; School of Infrastructure, Indian Institute of Technology, Bhubaneswar, Odisha-751013, India

Abstract: Feature selection aims to select a reduced subset of features with high predictive information and to remove irrelevant features with minimal predictive information. In this paper, we propose an ensemble approach to feature selection that applies multiple feature selection techniques and combines their outputs to yield more robust and stable results. The ensemble of feature ranking techniques is built in two steps: the first step creates a set of different feature selectors, and the second step combines the results of all the feature ranking techniques. The method has been tested on an accident dataset to improve the predictive performance of accident models for Kolkata. Beyond feature selection, this paper also explains the significance of data mining classification algorithms for building classification models on the accident datasets with various selected subsets of features. Finally, the classification models are assessed in terms of the AUC performance metric.
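
The two-step procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state which rankers are used or how their results are combined, so everything below is an assumption made for illustration: the three score tables stand in for the outputs of three hypothetical feature rankers, the feature names and score values are invented, and mean-rank aggregation is one common combining rule, not necessarily the one used in the paper.

```python
from statistics import mean


def scores_to_ranks(scores):
    """Convert {feature: score} into {feature: rank}, where rank 1 is the best.

    Higher scores are treated as better; ties are broken by feature name
    so that the result is deterministic.
    """
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: position + 1 for position, feature in enumerate(ordered)}


def ensemble_rank(score_tables):
    """Combine several rankers by mean rank (an assumed aggregation rule).

    Step 1 (done by the caller): produce one score table per ranker.
    Step 2 (done here): average each feature's rank across all rankers
    and sort features from best (lowest mean rank) to worst.
    """
    rank_tables = [scores_to_ranks(table) for table in score_tables]
    features = list(rank_tables[0])
    mean_ranks = {f: mean(r[f] for r in rank_tables) for f in features}
    return sorted(features, key=lambda f: (mean_ranks[f], f))


# Hypothetical scores from three hypothetical rankers (not data from the paper).
ranker_a = {"weather": 0.9, "hour": 0.5, "road_type": 0.7, "vehicle": 0.1}
ranker_b = {"weather": 0.6, "hour": 0.8, "road_type": 0.7, "vehicle": 0.2}
ranker_c = {"weather": 0.8, "hour": 0.3, "road_type": 0.9, "vehicle": 0.4}

ranking = ensemble_rank([ranker_a, ranker_b, ranker_c])
top_k = ranking[:2]  # keep the two best-ranked features
print(ranking)  # ['road_type', 'weather', 'hour', 'vehicle']
```

In this toy input no single ranker agrees with another on the top feature, yet the averaged ranking is stable: `road_type` wins because it is never ranked worse than second. A classifier would then be trained on the `top_k` subset and compared with other subsets using AUC.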

Keywords: feature selection; feature ranking; classification algorithms; accident data analysis; ensemble ranking; India; Kolkata; road traffic accidents.

DOI: 10.1504/IJDATS.2016.077484

International Journal of Data Analysis Techniques and Strategies, 2016, Vol. 8, No. 2, pp. 108-121

Received: 21 Apr 2014
Accepted: 04 Jul 2014

Published online: 04 Jul 2016
