Title: Efficient parallelised search engine based on virtual cluster

Authors: Che Lun Hung; Chun-Yuan Lin

Addresses: Department of Computer Science and Communication Engineering, Providence University, Taichung 433, Taiwan; Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan 333, Taiwan

Abstract: A growing body of research indicates that personalised, parallelised search engines can provide users with fast and accurate information from the internet. Hadoop is a software framework for processing huge datasets, including those larger than a petabyte. Virtualisation technology allows the resources of physical machines to be fully utilised. In this paper, we construct a virtual cluster, built from multiple virtual machines and configured as a Hadoop cluster, to run multiple Nutch instances simultaneously. The experimental results show that the proposed virtual cluster architecture for Nutch can retrieve data rapidly and that the performance enhancement is proportional to the number of virtual machines.

Keywords: cloud computing; Hadoop; MapReduce; virtual machines; parallelised search engines; virtual clusters; data retrieval.

DOI: 10.1504/IJCSE.2016.074557

International Journal of Computational Science and Engineering, 2016 Vol.12 No.1, pp.53 - 57

Received: 14 Mar 2012
Accepted: 15 Mar 2012

Published online: 06 Feb 2016
