Title: Cosine similarity-based PageRank calculation

Authors: S. Poomagal; T. Hamsapriya

Addresses: Department of Computer and Information Sciences, PSG College of Technology, Peelamedu, Coimbatore – 641004, India; Department of Information Technology, PSG College of Technology, Peelamedu, Coimbatore – 641004, India

Abstract: This paper introduces a new method for calculating the rank of a web page based on both content similarity and link structure. Several ranking algorithms in the literature calculate the importance score of web pages, and the basis of all of them is the link structure of the web. Because links from similar documents are more important than links from dissimilar documents, combining content similarity with link structure assigns higher ranks to more relevant documents. The cosine similarity measure is used in this paper to calculate the similarity among documents. The proposed technique is compared with existing ranking algorithms using precision, recall and F-measure.
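
The abstract does not give the paper's exact formula, so the following is only a minimal sketch of the general idea it describes: a PageRank-style iteration in which each outgoing link is weighted by the cosine similarity between the source and target documents. The function names, whitespace tokenisation, raw term-frequency vectors, damping factor of 0.85, and uniform fallback when all similarities are zero are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_weighted_pagerank(docs, links, damping=0.85, iterations=50):
    """Hypothetical similarity-weighted PageRank (illustrative only).

    docs:  {page_id: text}
    links: {page_id: [target_page_id, ...]}
    """
    pages = list(docs)
    n = len(pages)
    # Assumption: raw term frequencies over whitespace-separated tokens.
    vectors = {p: Counter(docs[p].lower().split()) for p in pages}

    # Turn each page's outgoing links into normalised similarity weights.
    weights = {}
    for src in pages:
        out = {
            dst: cosine_similarity(vectors[src], vectors[dst])
            for dst in links.get(src, [])
            if dst in docs
        }
        total = sum(out.values())
        if out and total > 0:
            out = {dst: w / total for dst, w in out.items()}
        elif out:
            # Assumption: fall back to uniform weights if nothing is similar.
            out = {dst: 1.0 / len(out) for dst in out}
        weights[src] = out

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not weights[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for src, out in weights.items():
            for dst, w in out.items():
                new_rank[dst] += damping * rank[src] * w
        rank = new_rank
    return rank


if __name__ == "__main__":
    example_docs = {
        "S": "web page ranking",
        "X": "web page ranking links",
        "Y": "pasta recipe",
    }
    example_links = {"S": ["X", "Y"]}
    print(similarity_weighted_pagerank(example_docs, example_links))
```

In this sketch, a link from a page to a textually similar page passes on a larger share of the source's rank than a link to a dissimilar page, which is one way to read the combination of content similarity and link structure described above.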

Keywords: web mining; PageRank calculation; content similarity; link structure; authority score; hub score; web pages; web page ranking; ranking algorithms; cosine similarity measures.

DOI: 10.1504/IJWS.2011.044085

International Journal of Web Science, 2011 Vol.1 No.1/2, pp.142-159

Published online: 28 Mar 2015
