Title: Search engine indexing storage optimisation using Hamming distance

Authors: Anirban Kundu; Siddhartha Sett; Subhajit Kumar; Shruti Sengupta; Srayan Chaudhury

Addresses: Netaji Subhash Engineering College, West Bengal University of Technology, Calcutta 700152, India; Innovation Research Lab (IRL), Capex Technologies, West Bengal 711103, India

Abstract: We propose an indexing algorithm for search engines that aims to reduce both time and space complexity. Existing indexing algorithms have high space requirements because they store every word of a web page except the stop words. In this paper, we present a theory of the indexing mechanism of a search engine. Here, time complexity is the time the search engine takes to retrieve information, and space complexity is the space required to store the indices on the hard disk. Reducing time complexity leads to faster retrieval of information, and reducing space complexity leads to more efficient utilisation of space. We deal only with the textual part of web pages. The approach is framed around the concept of Hamming distance to achieve better space complexity.

Keywords: search engines; forward indexing; inverted indexing; Hamming distance; indexing storage minimisation; indexing storage optimisation; information retrieval; space complexity.

DOI: 10.1504/IJIIDS.2012.045845

International Journal of Intelligent Information and Database Systems, 2012 Vol.6 No.2, pp.113 - 128

Received: 15 Jun 2010
Accepted: 30 Jan 2011

Published online: 16 Aug 2014
