Title: An efficient semantic clustering of URLs for web page recommendation

Authors: Sanjeev Kumar Sharma; Ugrasen Suman

Addresses: Faculty of Engineering Sciences, School of Computer Science and Information Technology, Devi Ahilya University, Takshashila Campus, Khandwa Road, Indore (M.P), India; Faculty of Engineering Sciences, School of Computer Science and Information Technology, Devi Ahilya University, Takshashila Campus, Khandwa Road, Indore (M.P), India

Abstract: Document clustering is a text-mining process in which documents with similar content are grouped into one cluster while dissimilar documents are placed in other clusters. The number of text and hypertext documents is growing quickly with the rapid growth of the WWW, and the huge size, high dynamics and large diversity of the web make it very challenging to discover truly relevant content for a given user or purpose. Web browsers retrieve information in the form of images, audio, video and text from web pages through URLs. Some URLs are used frequently by web users. In this paper, an efficient semantic clustering (ESC) algorithm is proposed in which URLs are clustered together to find larger clusters of the most frequent URLs. The ESC algorithm is evaluated on two large datasets for semantic clustering. The proposed approach will be useful for recommending the most appropriate and relevant URLs to web users according to their query.

Keywords: similarity; URLs; recommender systems; efficient semantic clustering; ESC; web page recommendations; recommendation systems; URL clustering; information retrieval.

DOI: 10.1504/IJDATS.2013.058578

International Journal of Data Analysis Techniques and Strategies, 2013 Vol.5 No.4, pp.339 - 358

Published online: 28 Feb 2014
