Authors: Anthony Scime
Addresses: Computing Sciences Department, The College at Brockport, State University of New York, 350 New Campus Dr., Brockport, NY 14420, USA
Abstract: Interesting classification rules can be identified by a number of measures. When searching a domain for a characterisation of unique, different, but important data, an appropriate measure is diversity. Diversity as a measure of a classification rule is based on the relative distinctness of the rule from the other rules in the rule-set. The diversity measure is the sum of the inverses of the commonness of a rule's items. In this paper, diversity is derived from the simplest classification trees using techniques from statistics and information retrieval, and demonstrated using sample datasets.
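The measure described in the abstract can be illustrated with a minimal sketch. It assumes that the "commonness" of an item is the number of rules in the rule-set that contain it; the abstract does not state the exact definition, so the paper's formulation may differ (for example, in normalisation). The function name `rule_diversity` and the sample rule-set are hypothetical and are not taken from the paper.

```python
from collections import Counter

def rule_diversity(rule_set):
    """Score each rule as the sum of 1 / commonness over the rule's items."""
    # Assumption: commonness(item) = number of rules in the rule-set containing it.
    commonness = Counter(item for rule in rule_set for item in set(rule))
    return [sum(1.0 / commonness[item] for item in set(rule)) for rule in rule_set]

# Hypothetical rule-set: each rule is a set of (attribute, value) items.
rules = [
    {("outlook", "sunny"), ("humidity", "high")},
    {("outlook", "sunny"), ("humidity", "normal")},
    {("outlook", "rain"), ("wind", "strong")},
]
scores = rule_diversity(rules)
```

Under this assumption, a rule built entirely from items that appear in no other rule scores highest, which matches the abstract's description of diversity as relative distinctness within the rule-set.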
Keywords: classification data mining; diversity; interestingness measurement; classification tree measurement; classification trees; data mining; classification rules; rule diversity; rule sets; interestingness.
International Journal of Data Science, 2018 Vol.3 No.2, pp.107 - 125
Available online: 27 May 2018