Title: Scalable information extraction for web queries

Authors: Meichun Hsu, Yuhong Xiong

Addresses: Hewlett Packard Laboratories, 1501 Page Mill Road, Palo Alto, CA 94022, USA; Innovation Works, F16 Tower C, Tsinghua Science Park, Haidian District, Beijing 100084, China

Abstract: The dominant way to find information on the web nowadays is through search. General search engines are very effective, but search phrases and results are unstructured, which limits a user's ability to further automate the processing of the search results. In recent years, we have seen efforts to build systems that support more precise queries on the web for certain content verticals. We describe the general problems in building an extensible web query system and present one of our projects in this area – a vertical search portal for online courses.

Keywords: web mining; parallel computing; classification; information extraction; focused crawling; web queries; information retrieval; web search; vertical search portals; online courses.

DOI: 10.1504/IJCSE.2010.037673

International Journal of Computational Science and Engineering, 2010 Vol.5 No.3/4, pp.176 - 184

Published online: 24 Dec 2010
