Title: A statistical data mining approach in bacteriology for bacterial identification

Authors: S.M. Monzurur Rahman, F.A. Siddiky, Uma Shrestha

Addresses: School of Computer Science and Engineering, United International University, Road 8/A Dhanmondi, Dhaka-1209, Bangladesh; School of Computer Science and Engineering, United International University, Road 8/A Dhanmondi, Dhaka-1209, Bangladesh; Department of Science and Technology, University of New England, 54-158 Sussex Street, Sydney, NSW 2000, Australia

Abstract: Statistical data mining is a popular research field concerned with extracting valuable information from large volumes of collected data. Its techniques can benefit fields such as medicine, bioinformatics, business, and bacteriology. Of these, bacteriology is a promising field in which statistical data mining has rarely been used. The main purpose of this paper is to demonstrate the contribution that statistical data mining can make to bacteriology. The research problem addressed here, bacterial identification, is a special kind of classification problem in which the dataset contains only a single representation of each class. After studying this problem, the paper proposes a novel statistical data mining approach to bacterial identification, based on the decision tree technique, with improved performance. The experimental results show that the proposed approach needs significantly fewer biochemical tests to identify a bacterium than the conventional approach currently followed in biochemical laboratories. The proposed approach thus benefits microbiologists and improves on traditional bacterial identification by saving time, total cost, and manual labour.
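
To make the problem setting concrete, the following is a minimal sketch, not the authors' algorithm, of how a decision tree can be grown when every class (species) appears exactly once in the dataset. The species names, test names, and outcomes are hypothetical, and the splitting rule (pick the test that divides the remaining candidates most evenly, which for binary tests and one example per class is the split with the highest information gain) is an assumption made for illustration.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical biochemical profiles (True = positive reaction).
# Illustrative values only; these are not data from the paper.
PROFILES: Dict[str, Dict[str, bool]] = {
    "species_A": {"t1": True,  "t2": True,  "t3": False, "t4": False},
    "species_B": {"t1": True,  "t2": False, "t3": False, "t4": True},
    "species_C": {"t1": False, "t2": True,  "t3": True,  "t4": False},
    "species_D": {"t1": False, "t2": False, "t3": True,  "t4": True},
}

# A tree is either a species name (leaf) or (test, positive_subtree, negative_subtree).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def build_tree(candidates: List[str], tests: List[str]) -> Tree:
    """Greedily grow a tree over species that each occur exactly once."""
    if len(candidates) == 1:
        return candidates[0]
    best = None  # (imbalance, test, positives, negatives)
    for test in tests:
        pos = [s for s in candidates if PROFILES[s][test]]
        neg = [s for s in candidates if not PROFILES[s][test]]
        if not pos or not neg:
            continue  # this test does not separate the remaining candidates
        imbalance = abs(len(pos) - len(neg))
        if best is None or imbalance < best[0]:
            best = (imbalance, test, pos, neg)
    if best is None:
        raise ValueError("remaining species cannot be told apart by the given tests")
    _, test, pos, neg = best
    remaining = [t for t in tests if t != test]
    return (test, build_tree(pos, remaining), build_tree(neg, remaining))


def identify(tree: Tree, outcomes: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Walk the tree with observed outcomes; return the species and the tests used."""
    used: List[str] = []
    node = tree
    while not isinstance(node, str):
        test, pos, neg = node
        used.append(test)
        node = pos if outcomes[test] else neg
    return node, used


tree = build_tree(list(PROFILES), ["t1", "t2", "t3", "t4"])
species, tests_used = identify(tree, PROFILES["species_C"])
print(species, tests_used)
```

In this toy table each species is reached after two tests, although the table records four; whether savings of that kind hold on real biochemical data is what the paper's experiments address.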

Keywords: statistical data mining; SDM; biology; bacteriology; bioinformatics; classification; decision tree; bacterial identification.

DOI: 10.1504/IJDATS.2011.039847

International Journal of Data Analysis Techniques and Strategies, 2011 Vol.3 No.2, pp.117 - 142

Published online: 22 Apr 2011
