Title: Efficient parallelised search engine based on virtual cluster
Authors: Che Lun Hung; Chun-Yuan Lin
Addresses: Department of Computer Science and Communication Engineering, Providence University, Taichung 433, Taiwan; Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan 333, Taiwan
Abstract: A growing body of research indicates that personalised and parallelised search engines can provide users with fast and accurate information from the internet. Hadoop is a software framework for processing huge datasets, including those larger than a petabyte. Virtualisation technology can fully utilise the resources of physical machines. In this paper, we construct a virtual cluster, configured as a Hadoop cluster, from multiple virtual machines in order to run multiple Nutch instances simultaneously. The experimental results show that the proposed virtual cluster architecture for Nutch retrieves data rapidly and that the performance improvement is proportional to the number of virtual machines.
Keywords: cloud computing; Hadoop; MapReduce; virtual machines; parallelised search engines; virtual clusters; data retrieval.
DOI: 10.1504/IJCSE.2016.074557
International Journal of Computational Science and Engineering, 2016 Vol.12 No.1, pp.53 - 57
Received: 14 Mar 2012
Accepted: 15 Mar 2012
Published online: 06 Feb 2016