Title: Warehousing complex data from the web

Authors: O. Boussaid, J. Darmont, F. Bentayeb, S. Loudcher

Addresses: ERIC, University of Lyon 2, 5 Avenue Pierre Mendes-France, 69676 Bron Cedex, France (all four authors)

Abstract: Data warehousing and Online Analytical Processing (OLAP) technologies are now moving on to handling complex data, most of which originate from the web. However, integrating such data into a decision-support process requires representing them in a form that OLAP and/or data mining techniques can process. In this paper, we present a complex data warehousing methodology that uses eXtensible Markup Language (XML) as a pivot language. Our approach covers the integration of complex data into an Operational Data Store (ODS) in the form of XML documents; their dimensional modelling and storage in an XML data warehouse; and their analysis with combined OLAP and data mining techniques. We also address the crucial issue of performance in XML warehouses.

Keywords: data warehousing; online analytical processing; OLAP; decision support systems; DSS; complex data; web data; ETL process; XML warehousing; XML cube; X-warehousing; data mining.

DOI: 10.1504/IJWET.2008.019942

International Journal of Web Engineering and Technology, 2008, Vol. 4, No. 4, pp. 408-433

Published online: 17 Aug 2008
