Authors: Hongzhi Wang, Jianzhong Li, Jizhou Luo
Addresses: Department of Computer Science and Technology, Harbin Institute of Technology, P.O. Box 750, Harbin, China (all authors)
Abstract: XML has become an important format for representing and exchanging information in information integration systems. Selecting the data sources that are useful for a query is crucial to efficient query processing in such systems. This paper focuses on data sources selection over XML data sources in an information integration system. For queries with both structural and value constraints, two kinds of indices are presented for data sources selection: a constraint index and a structural index. The former is grouped by values and captures the structure related to each value in a group. The latter summarises all the paths in the XML data sources. To reduce the size of the indices, index compacting and node selection strategies are presented. Based on these index structures, efficient data sources selection methods are designed. Extensive experiments demonstrate the efficiency and effectiveness of the index structures and the data sources selection strategies presented in this paper.
Keywords: XML; data sources selection; information integration; information representation; information exchange; query processing; constraint index; structural index.
International Journal of Intelligent Information and Database Systems, 2008 Vol.2 No.4, pp.422 - 445
Published online: 27 Nov 2008