Title: Ranking images in web documents based on HTML TAGs for image retrieval from WWW

Authors: P. Shanmuga Vadivu; P. Sumathy; A. Vadivel

Addresses: Department of Computer Science and Applications, Gandhigram Rural Institute, Deemed University, Dindigul District, Tamil Nadu 624302, India; Department of Computer Science and Applications, Gandhigram Rural Institute, Deemed University, Dindigul District, Tamil Nadu 624302, India; Department of Computer Applications, National Institute of Technology, Trichy, 620015, India

Abstract: A large number of images are embedded in web pages, and it is difficult to map the semantics of these images using the text available in the documents. Retrieval systems rank and retrieve images using various ranking mechanisms. These mechanisms use the text present in the HTML document, which alone may not be sufficient to improve retrieval precision. In this paper, the text present in the <IMG> TAG is analysed and each attribute in the TAG is categorised into one of four levels. A weight is assigned to the attribute values at each level so that the importance of the level is reflected: top-level attributes receive higher weights and lowest-level attributes receive lower weights. We have compared the performance with the Google image search system and observed that the performance of the proposed approach is encouraging.
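The level-based weighting described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which <IMG> attributes belong to which level or what the weights are, so the mapping in ATTRIBUTE_LEVEL, the values in LEVEL_WEIGHT, and the helper names (ImgCollector, score_image, rank_images) are assumptions made only for illustration, not the authors' method. Matching query terms against attribute values by simple token overlap is likewise an assumed simplification.

```python
import re
from html.parser import HTMLParser

# Hypothetical assignment of <IMG> attributes to four levels (1 = top).
# The abstract does not list the actual categorisation.
ATTRIBUTE_LEVEL = {"alt": 1, "title": 1, "src": 2, "name": 3, "id": 3, "longdesc": 4}

# Hypothetical weights: higher for top levels, lower for the lowest level.
LEVEL_WEIGHT = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}


class ImgCollector(HTMLParser):
    """Collects the attributes of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "img":
            self.images.append({name: value or "" for name, value in attrs})


def score_image(attrs, query_terms):
    """Sum the level weights of attributes whose text contains query terms."""
    score = 0.0
    for name, value in attrs.items():
        level = ATTRIBUTE_LEVEL.get(name)
        if level is None:
            continue  # attribute not assigned to any level in this sketch
        tokens = set(re.split(r"[^a-z0-9]+", value.lower()))
        matches = sum(1 for term in query_terms if term in tokens)
        score += LEVEL_WEIGHT[level] * matches
    return score


def rank_images(html, query):
    """Return (score, src) pairs for all images, highest score first."""
    parser = ImgCollector()
    parser.feed(html)
    terms = query.lower().split()
    scored = [(score_image(a, terms), a.get("src", "")) for a in parser.images]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    page = (
        '<img src="a/tiger.jpg" alt="tiger in forest">'
        '<img src="b/cat.png" title="cat">'
    )
    for score, src in rank_images(page, "tiger"):
        print(f"{score:.2f}  {src}")
```

In this sketch an image whose top-level attributes mention the query term outranks one that matches only in a lower-level attribute, which is the behaviour the abstract attributes to the weighting scheme. A real system would need the paper's actual level definitions and weights.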

Keywords: HTML TAG; image ranking; content based image retrieval; CBIR; text based image retrieval; web pages; embedded images; semantics; Google image search.

DOI: 10.1504/IJCISTUDIES.2014.062730

International Journal of Computational Intelligence Studies, 2014 Vol.3 No.2/3, pp.176 - 195

Received: 23 Jan 2013
Accepted: 10 Jun 2013

Published online: 28 Jun 2014
