Title: Classification diversity measurement

Authors: Anthony Scime

Addresses: Computing Sciences Department, The College at Brockport, State University of New York, 350 New Campus Dr., Brockport, NY 14420, USA

Abstract: Interesting classification rules can be determined by a number of measures. When searching a domain for a characterisation of unique, different, but important data, an appropriate measure is diversity. Diversity as a measure of a classification rule is based on the relative distinctness of the rule from the other rules in the rule-set. The diversity measure is the sum of the inverse of the commonness of a rule's items. In this paper, diversity is derived from the simplest classification trees using techniques from statistics and information retrieval, and is demonstrated using sample datasets.
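
The measure described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: a rule is treated as a set of items, and the commonness of an item is taken to be the number of rules in the rule-set that contain it. The paper may define or normalise commonness differently. The function names and the toy rule-set are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, FrozenSet, List

# Assumption: a rule is modelled as a set of items such as "outlook=sunny".
Rule = FrozenSet[str]


def item_commonness(rules: List[Rule]) -> Dict[str, int]:
    """Count how many rules contain each item (assumed definition of commonness)."""
    counts: Counter = Counter()
    for rule in rules:
        counts.update(rule)  # each item counts once per rule, since a rule is a set
    return dict(counts)


def diversity(rule: Rule, commonness: Dict[str, int]) -> float:
    """Sum the inverse commonness over the rule's items."""
    return sum(1.0 / commonness[item] for item in rule)


# Toy rule-set invented for illustration only.
rules = [
    frozenset({"outlook=sunny", "humidity=high", "class=no"}),
    frozenset({"outlook=sunny", "humidity=normal", "class=yes"}),
    frozenset({"outlook=rain", "class=yes"}),
]

commonness = item_commonness(rules)
scores = [diversity(rule, commonness) for rule in rules]
```

Under these assumptions, a rule built from items that few other rules share receives a higher score, which matches the abstract's description of diversity as relative distinctness within the rule-set.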

Keywords: classification data mining; diversity; interestingness measurement; classification tree measurement; classification trees; data mining; classification rules; rule diversity; rule sets; interestingness.

DOI: 10.1504/IJDS.2018.092283

International Journal of Data Science, 2018 Vol. 3 No. 2, pp. 107-125

Accepted: 14 May 2016
Published online: 14 Jun 2018
