Title: Genetic algorithm-based intelligent multiagent architecture for extracting information from hidden web databases

Authors: D. Weslin; T. Joshva Devadas

Addresses: Department of Computer Science, Research and Development Centre, Bharathiar University, Coimbatore, Tamil Nadu, 641046, India; School of Computer Science and Engineering, Vellore Institute of Technology (VIT), Vellore 632014, Tamil Nadu, India

Abstract: Although an enormous amount of information is available on the web, only a very small portion of it is visible to users. Because so much of this information is not visible, traditional search engines cannot index or access all the information present on the web. The main challenge in mining relevant information from a huge hidden web database is identifying the entry points through which the database can be accessed. Existing web crawlers cannot retrieve all information from hidden web databases. To retrieve all the relevant information from the hidden web, this paper proposes an architecture that uses a genetic algorithm and intelligent agents to access hidden web databases. The proposed architecture is termed the genetic algorithm-based intelligent multi-agent system (GABIAS). The experimental results show that the proposed architecture provides better precision and recall than existing web crawlers.

Keywords: genetic algorithm; GA; hidden web; intelligent agent; web crawler.

DOI: 10.1504/IJBIDM.2018.10008837

International Journal of Business Intelligence and Data Mining, 2020 Vol.16 No.2, pp.204 - 213

Received: 28 Jun 2017
Accepted: 02 Sep 2017

Published online: 30 Jan 2020
