Title: Fast parallel PageRank technique for detecting spam web pages

Authors: Nilay Khare; Hema Dubey

Addresses: Department of CSE, Maulana Azad National Institute of Technology, Bhopal, India; Department of CSE, Maulana Azad National Institute of Technology, Bhopal, India

Abstract: Brin and Page proposed PageRank in 1998, and it remains a prevailing link analysis technique that web search engines use to rank their search results. Computing PageRank values quickly and efficiently for very large web graphs is an essential concern for search engines today. Identifying and dealing with spam web pages is another important concern in web browsing. In this research article, an efficient and faster parallel PageRank algorithm is proposed, which harnesses the power of graphics processing units (GPUs). In the proposed algorithm, the PageRank scores are distributed non-uniformly among the web pages, so it is also capable of coping with spam web pages. The experiments are performed on standard datasets available in the Stanford Large Network Dataset Collection. The proposed parallel PageRank algorithm achieves a speedup of about 1.1 to 1.7 times over an existing parallel PageRank algorithm.
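
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal sequential sketch of standard PageRank by power iteration, not the authors' algorithm: it splits each page's score uniformly across its out-links, whereas the proposed method distributes scores non-uniformly by a rule the abstract does not specify. The function name `pagerank`, its parameters, the damping value 0.85, and the toy graph are assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Baseline PageRank by power iteration on an adjacency-list graph.

    out_links maps each page to the list of pages it links to.
    All names and default values here are illustrative assumptions.
    """
    nodes = sorted(set(out_links) | {v for targets in out_links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iter):
        # Score held by pages without out-links is spread over all pages.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}

        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                # Uniform split: every out-link receives an equal share.
                new_rank[v] += damping * rank[u] / len(targets)

        # Stop once the scores change by less than tol in L1 norm.
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page graph used only to exercise the function.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
```

The work inside each iteration is independent per page, which is the kind of loop a GPU implementation would typically spread across threads; how the authors actually map it to CUDA is not described in the abstract.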

Keywords: graphics processing unit; GPU; compute unified device architecture; CUDA; parallel PageRank technique; spam web pages.

DOI: 10.1504/IJDMMM.2019.102720

International Journal of Data Mining, Modelling and Management, 2019 Vol.11 No.4, pp.350 - 365

Accepted: 11 Aug 2018
Published online: 02 Oct 2019
