Title: Tuning up FOIL for extracting information from the web

Authors: Pablo Palacios, Inaki Fernandez de Viana

Addresses: Universidad de Huelva, Department of Tecnologias de la Informacion, Carretera Huelva – La Rabida, 21071 Palos de la Frontera, Spain (both authors)

Abstract: Most websites are designed to be easily understood by human users, which poses a problem when this information must be accessed automatically. To resolve this problem, different algorithms have emerged to automatically generate and extract information. One of these algorithms is SRV, which uses a supervised learning technique that is expensive in time. In this paper, we present various optimisations that reduce this cost by up to 40%.

Keywords: information extraction; wrappers; FOIL; inductive programming; information retrieval; supervised learning; web retrieval.

DOI: 10.1504/IJCAT.2008.022423

International Journal of Computer Applications in Technology, 2008 Vol.33 No.4, pp.280 - 284

Published online: 04 Jan 2009
