Int. J. of Granular Computing, Rough Sets and Intelligent Systems, 2009 Vol.1, No.2

 

 

Title: Hyper-textual language model for web information retrieval

 

Authors: Ying Xie, Vijay V. Raghavan, Andrew Young

 

Addresses:
Department of Computer Science and Information Systems, Kennesaw State University, Kennesaw, GA 30144, USA.
The Centre for Advanced Computer Studies, University of Louisiana at Lafayette, Lafayette, LA 70503, USA.
4Access Communications, Cumming, GA 30040, USA.

 

Abstract: In this paper, we propose a unified retrieval model, called the hyper-textual language model, for web information retrieval. The proposed model integrates information from multiple sources, including web content, hyperlinks and the topology of the web, in a single modelling framework. On the one hand, the model extends the language modelling technique to accommodate the special structural and semantic information carried by the hyperlinks of the web; on the other hand, it provides a formal retrieval model that realises topic-relevant page ranking. An experimental study on a university website shows that this formal retrieval model outperforms several alternative search techniques, including Google and Inktomi, on a group of test queries.

 

Keywords: granular computing; hyper-textual language models; information retrieval; language models; page ranking; web search; web information; internet; hypertext; web content; hyperlinks; web topology; unified modelling; semantics; webpages; university websites.

 

DOI: 10.1504/IJGCRSIS.2009.028009

 

Int. J. of Granular Computing, Rough Sets and Intelligent Systems, 2009 Vol.1, No.2, pp.190-202

 

Submission date: 01 Oct 2008
Date of acceptance: 05 Jan 2009
Available online: 27 Aug 2009

 

 
