Title: Rao DISC-based similarity coefficient: a measure of similarity with respect to feature differences

Authors: May Anne E. Mata

Addresses: Department of Mathematics, Physics and Computer Science, University of the Philippines Mindanao, Mintal, Davao City 8000, Philippines

Abstract: In this paper, I define a similarity coefficient, called the Rao DISC-based similarity coefficient (DbSC), which takes feature differences into account as a factor in measuring similarity. The coefficient is based on the Rao dissimilarity coefficient (DISC) and diversity coefficient (DIVC), which are mostly applied in ecology. The performance of Rao DbSC was compared with that of existing similarity coefficients using three different data sets. Principal coordinate analysis (PCoA) and Spearman's rank correlation were used to demonstrate how Rao DbSC differs from other existing similarity coefficients. The results emphasise the relevance of considering differences among features when comparing samples. Overall, the paper illustrates the possibility of using feature differences, through some notion of distance, as a basis for determining similarity between samples.

Keywords: data analysis; PCoA; Rao dissimilarity coefficient; DISC; Rao diversity coefficient; DIVC; quadratic entropy; similarity coefficients; feature differences; principal coordinate analysis.

DOI: 10.1504/IJDATS.2010.032457

International Journal of Data Analysis Techniques and Strategies, 2010 Vol.2 No.2, pp.181 - 198

Published online: 03 Apr 2010
