Title: Needle in a haystack: an empirical study on mining tags from Flickr user comments

Authors: Haijun Zhang; Jingxuan Li; Bin Luo; Yan Li

Addresses: Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong Province, China; Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong Province, China; Department of Computer Science, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong Province, China; School of Computer Engineering, Shenzhen Polytechnic, Shenzhen, Guangdong Province, China

Abstract: In the Web 2.0 era, user-generated content has become the main source of information on many popular photo-sharing websites such as Flickr. On Flickr, many photos have very few tags or none at all, because only the uploader can tag a photo. Any user browsing a photo, however, can comment on it. It is therefore possible to recommend new tags, or enrich the existing tag set, based on user comments. The approach in this paper has two phases: candidate tag generation and tag ranking. For candidate tag generation, two methods based on natural language processing (NLP) techniques are introduced, one word-based and one phrase-based. For ranking and recommending tags, we propose an algorithm that jointly models the location information of candidate tags, their statistical information, and the semantic similarity between them. Extensive experimental results demonstrate the effectiveness of our method.
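
The two-phase idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the scoring formula, so the stopword list, the weights, the position score, and the character-trigram similarity (a crude stand-in for real semantic similarity) are all assumptions made for illustration. Only word-based candidates are shown.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "this", "and",
             "what", "i", "love", "so", "very", "at"}


def tokenize(comment):
    """Phase 1 (word-based): lowercase alphabetic tokens minus stopwords."""
    return [t for t in re.findall(r"[a-z]+", comment.lower()) if t not in STOPWORDS]


def trigram_similarity(a, b):
    """Stand-in for semantic similarity: Jaccard overlap of character trigrams."""
    ga = {a[i:i + 3] for i in range(max(1, len(a) - 2))}
    gb = {b[i:i + 3] for i in range(max(1, len(b) - 2))}
    return len(ga & gb) / len(ga | gb)


def rank_tags(comments, weights=(0.3, 0.5, 0.2), similarity=trigram_similarity):
    """Phase 2: rank candidates by a weighted sum of three assumed signals."""
    position = defaultdict(list)  # word -> one position score per occurrence
    doc_freq = defaultdict(int)   # word -> number of comments containing it
    for comment in comments:
        tokens = tokenize(comment)
        for i, tok in enumerate(tokens):
            # Earlier words in a comment score higher (assumed location signal).
            position[tok].append(1.0 - i / len(tokens))
        for tok in set(tokens):
            doc_freq[tok] += 1

    candidates = sorted(doc_freq)
    scores = {}
    for w in candidates:
        loc = max(position[w])                # location information
        stat = doc_freq[w] / len(comments)    # statistical information
        others = [similarity(w, o) for o in candidates if o != w]
        sem = sum(others) / len(others) if others else 0.0  # similarity to other candidates
        scores[w] = weights[0] * loc + weights[1] * stat + weights[2] * sem
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))


comments = [
    "Beautiful sunset over the beach",
    "Love this sunset",
    "What a sunset, amazing colors at the beach",
]
ranked = rank_tags(comments)
```

Passing a different `similarity` function (for example, one backed by word embeddings) would replace the trigram stand-in without changing the rest of the sketch.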

Keywords: tag recommendation; user comment; Flickr; image annotation.

DOI: 10.1504/IJITM.2019.099808

International Journal of Information Technology and Management, 2019 Vol.18 No.2/3, pp.297 - 326

Available online: 10 May 2019
