Title: A parallel tag affinity computation for social tagging systems using MapReduce

Authors: Hyunwoo Kim; Taewhi Lee; Hyoung-Joo Kim

Addresses: School of Computer Science and Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 151-742, Korea ' School of Computer Science and Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 151-742, Korea ' School of Computer Science and Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 151-742, Korea

Abstract: Tag affinity is the relationship between tags. It is useful information for search and recommendation in social tagging systems. Tag affinity is measured by several types of tag cooccurrence frequency. Computing tag affinity becomes increasingly time-consuming as tagging information accumulates. To alleviate this problem, we propose a parallel tag affinity computation method using MapReduce. We present MapReduce algorithms for computing three types of tag affinity measures: macro, micro, and bigram tag cooccurrence frequency. Our experimental results show that the proposed MapReduce-based approach not only significantly outperforms existing methods based on a relational database but also provides high scalability. To the best of our knowledge, this is the first approach to computing tag affinity on MapReduce.
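
As a rough illustration of the general idea only, the sketch below counts tag cooccurrence in a map/reduce style using plain Python, with no Hadoop involved. It is not the algorithm from the article: the abstract does not define the macro, micro, or bigram measures, so the sketch assumes one simple reading of cooccurrence, namely the number of posts in which two tags appear together. The names map_post, reduce_counts, and cooccurrence, and the sample posts, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_post(tags):
    """Map step (assumed definition): emit ((tag_a, tag_b), 1) once for
    every unordered pair of distinct tags attached to a single post."""
    unique = sorted(set(tags))  # sort so each pair has one canonical key
    for a, b in combinations(unique, 2):
        yield (a, b), 1


def reduce_counts(key, values):
    """Reduce step: sum the partial counts collected for one tag pair."""
    return key, sum(values)


def cooccurrence(posts):
    """Run map, shuffle (group by key), and reduce in a single process."""
    groups = defaultdict(list)
    for tags in posts:
        for key, value in map_post(tags):
            groups[key].append(value)  # shuffle: group values by pair
    return dict(reduce_counts(k, v) for k, v in groups.items())


# Hypothetical sample data: each inner list is the tag set of one post.
posts = [
    ["python", "hadoop", "bigdata"],
    ["hadoop", "bigdata"],
    ["python", "web"],
]
counts = cooccurrence(posts)
```

In a real MapReduce job the grouping would be done by the framework's shuffle phase and the map calls would run on separate input splits; here both are simulated in one process to keep the sketch self-contained.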

Keywords: parallelisation; social tagging; MapReduce; Hadoop; parallel tag affinity; tag cooccurrence frequency; bigram; big data.

DOI: 10.1504/IJBDI.2014.066322

International Journal of Big Data Intelligence, 2014 Vol.1 No.3, pp.141 - 150

Received: 21 Nov 2013
Accepted: 05 Apr 2014

Published online: 30 Dec 2014
