CNODE: clustering of set-valued non-ordered discrete data
by Sunil Kumar, Shamik Sural, Alok Watve, Sakti Pramanik
International Journal of Data Mining, Modelling and Management (IJDMMM), Vol. 1, No. 3, 2009

Abstract: This paper introduces a clustering technique named 'Clustering of set-valued Non-Ordered DiscretE data' (CNODE), in which each data item is a vector having a set of non-ordered discrete values per dimension. Since the usual definitions of distance, such as Euclidean and Manhattan, do not hold in a 'non-ordered discrete data space' (NDDS), other measures such as Hamming distance are often used to define the distance between vectors with single-valued discrete dimensions. This type of distance is not meaningful for set-valued dimensions; hence, we propose a similarity measure based on set intersection for clustering set-valued vectors. We also suggest a new measure of clustering quality for this type of data, named 'lines of clustroids' (LOC). In contrast to other existing clustering techniques in NDDS, CNODE does not rely on any kind of pre-processing of the dataset. Experiments with synthetic and real datasets show that CNODE is robust to data variations, scalable to large datasets, and efficient in high dimensions.
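
To illustrate the contrast the abstract draws, the following is a minimal sketch of Hamming distance on single-valued vectors next to one possible intersection-based similarity on set-valued vectors. The abstract does not give CNODE's actual formula, so the per-dimension ratio used here (intersection size over union size, averaged across dimensions) is an assumption chosen for illustration, and the names `hamming_distance` and `set_similarity` are hypothetical.

```python
from typing import FrozenSet, Sequence


def hamming_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the dimensions in which two single-valued vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x != y for x, y in zip(a, b))


def set_similarity(
    a: Sequence[FrozenSet[str]], b: Sequence[FrozenSet[str]]
) -> float:
    """Average per-dimension overlap between two set-valued vectors.

    ASSUMPTION: each dimension contributes |x & y| / |x | y|. This is
    an illustrative choice, not the measure defined in the paper.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    if not a:
        raise ValueError("vectors must have at least one dimension")
    total = 0.0
    for x, y in zip(a, b):
        union = x | y
        # Two empty sets are treated as identical in that dimension.
        total += len(x & y) / len(union) if union else 1.0
    return total / len(a)


# Single-valued vectors: one value per dimension.
d = hamming_distance("ACGT", "AGGT")

# Set-valued vectors: a set of values per dimension.
u = [frozenset({"A", "C"}), frozenset({"G"})]
v = [frozenset({"A"}), frozenset({"G", "T"})]
s = set_similarity(u, v)
```

Hamming distance can only report that a dimension matches or differs, whereas an intersection-based measure can score partial overlap within a dimension, which is the property the abstract relies on for set-valued data.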

Online publication date: Sun, 19-Jul-2009
