Int. J. of Web Engineering and Technology   »   2004 Vol.1, No.3

 

 

Title: Web task automation: a standards-based proposal

 

Authors: Vicente Luque Centeno, Carlos Delgado Kloos, Luis Sanchez Fernandez, Norberto Fernandez Garcia

 

Addresses:
Department of Telematics Engineering, Universidad Carlos III de Madrid, 28911 Leganes (Madrid), Spain (all authors).

 

Abstract: Tasks on the web are performed worldwide for many different purposes (banking, shopping, auctions, e-mail, hotel reservations, flight booking, etc.). Until now, using typical HTML-based web browsers has required users to interact mechanically and continually with an on-screen view of remotely retrieved documents (clicking on links or buttons, filling in and submitting forms, scrolling, and visually locating data on the screen, to name a few). When the amount of data within those documents is large, this manual navigation easily becomes overwhelming in cost and effort, even for the simplest tasks. Developing ad hoc wrapper agents that automate these tasks for the user, by intelligently integrating semistructured web data from heterogeneous sources, may considerably reduce these interactivity and effort requirements. Bargain finders or price comparers, among others, might present only the final, valuable results to users, considerably reducing navigation effort. However, ad hoc wrapper agents have traditionally had high development and maintenance costs. Because of the semistructured nature of HTML, any minor unexpected change often causes them to stop working properly. This paper presents several new standards-based techniques for reducing these development and maintenance costs and for making these programs more compact and stable.

 

Keywords: web wrapper agent; screen scraping; web data integration; semistructured information automation; web tasks; web mediator; information retrieval; software maintenance; XPath; Message Sequence Charts.

 

DOI: 10.1504/IJWET.2004.005239

 

Int. J. of Web Engineering and Technology, 2004 Vol.1, No.3, pp.374 - 391

 

Available online: 14 Sep 2004

 

 
