Authors: Li Zhao; Jingkun Liang
Addresses: School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China; Department of Information Engineering, Shijiazhuang Vocational Technology Institute, Shijiazhuang, China | Department of Information Engineering, Shijiazhuang Vocational Technology Institute, Shijiazhuang, China
Abstract: Most prototype reduction algorithms process the data in its entirety to yield a consistent subset, which is very useful in nearest neighbour classification. Their main disadvantage is excessive computational cost when the prototype set is very large. In this paper, we present a cellular automata (CA)-based nearest neighbour rule condensation method that removes redundant points from a given training set. The method retains only the points on the boundaries between different classes, and the degree of reduction of the reference set can be adjusted through the granularity of the CA lattice. The proposed method has two main advantages. First, it condenses a given rule set in less time than traditional algorithms. Second, it yields a consistent subset of the given set in a divide-reduce-coalesce manner. Experiments show successful results when the given dataset is large.
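The idea of keeping only points near class boundaries on a lattice can be illustrated with a minimal sketch. This is not the paper's CA update rule, which the abstract does not specify. It rests on these assumptions: each training point falls into one lattice cell of side `cell_size`, and a cell counts as a boundary cell when it and its immediate (Moore) neighbours together contain more than one class. The function name `condense` and its parameters are hypothetical, and this simple rule does not by itself guarantee a consistent subset.

```python
from collections import defaultdict
from itertools import product


def condense(points, labels, cell_size):
    """Return sorted indices of points that lie in boundary cells.

    Assumption (not from the paper): a cell is a boundary cell when the
    cell and its Moore neighbourhood together hold more than one class.
    """
    # Map each lattice cell (tuple of integer coordinates) to the
    # indices of the points that fall inside it.
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(int(x // cell_size) for x in p)].append(i)

    kept = []
    for cell, members in cells.items():
        # Collect the class labels seen in this cell and its neighbours.
        nearby = set()
        for offset in product((-1, 0, 1), repeat=len(cell)):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            nearby.update(labels[j] for j in cells.get(neighbour, ()))
        # Cells surrounded by a single class are interior and are dropped.
        if len(nearby) > 1:
            kept.extend(members)
    return sorted(kept)


if __name__ == "__main__":
    pts = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5),
           (3.5, 0.5), (4.5, 0.5), (5.5, 0.5)]
    lbl = ["a", "a", "a", "b", "b", "b"]
    print(condense(pts, lbl, 1.0))  # fine lattice
    print(condense(pts, lbl, 2.0))  # coarse lattice
```

In this toy example the fine lattice keeps only the two points adjacent to the class boundary, while the coarse lattice keeps all six, which shows how lattice granularity controls the amount of reduction in this sketch.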
Keywords: nearest neighbour classification; cellular automata; rule condensation; consistent subset; large datasets.
International Journal of Computer Applications in Technology, 2012 Vol.44 No.2, pp.109 - 116
Published online: 23 Aug 2012