Inderscience Publishers
  PUBLISHERS OF DISTINGUISHED ACADEMIC, SCIENTIFIC AND PROFESSIONAL JOURNALS

Article Abstract

Title: Some linguistic methods of improving the quality of document retrieval on the internet
  Authors: Alexander Gelbukh, Grigori Sidorov, Yoel Ledo-Mezquita
  Address: Natural Language and Text Processing Laboratory, Center for Computing Research, National Polytechnic Institute, Av. Juan Dios Batiz, s/n, Zacatenco 07738, Mexico City, Mexico (same affiliation for all three authors)
  Journal: International Journal of Electronic Business, 2005, Vol. 3, No. 3/4, pp. 264-275
  Abstract: One of the problems of e-business is finding the relevant documents needed to make correct decisions. The main difficulty on the internet is the huge number of documents, which makes the relevant ones hard to find; hence the importance of methods that improve the quality of document retrieval. We discuss some linguistic problems of document retrieval on the internet related to the following natural language phenomena: (1) morphological processes: e.g., takes, took, taken are grammatical forms of take; (2) polysemy and homonymy: most words have several senses, e.g., bank can mean a financial institution, a shore, a bench, etc.; (3) non-linearity of syntactic relations: when a query contains word combinations, the words forming a combination can be separated by other words in the documents. We propose some linguistically based methods and strategies for these problems that either improve the quality of document retrieval or show the need to apply linguistic methods.
  Keywords: linguistic methods; document retrieval; internet; morphological processing; polysemy; homonymy; word combinations; information retrieval; e-business; electronic business; retrieval quality; natural language; syntactic relations.
  DOI: 10.1504/IJEB.2005.007271
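
  Illustration: Phenomena (1) and (3) from the abstract can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' method: the lemma table LEMMAS, the helper names lemmatize and matches_combination, and the window parameter are all hypothetical and chosen for illustration. A real system would use a full morphological analyzer, and polysemy (2) is not handled here.

```python
import re

# Hypothetical toy lemma table (assumption): stands in for a real
# morphological analyzer that maps grammatical forms to a base form.
LEMMAS = {
    "takes": "take",
    "took": "take",
    "taken": "take",
    "taking": "take",
    "decisions": "decision",
}


def lemmatize(text):
    """Lowercase the text, split it into alphabetic tokens, and map each to its lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(tok, tok) for tok in tokens]


def matches_combination(query, document, window=4):
    """Return True if every query lemma occurs inside some span of
    `window` consecutive document tokens, in any order."""
    wanted = set(lemmatize(query))
    if not wanted:
        return False
    tokens = lemmatize(document)
    for start in range(len(tokens)):
        if wanted <= set(tokens[start:start + window]):
            return True
    return False


if __name__ == "__main__":
    doc = "She took a difficult decision yesterday."
    # Exact phrase search misses the document; the lemma-and-window match finds it.
    print("take decision" in doc.lower())
    print(matches_combination("take decision", doc))
```

  The window size trades recall against precision: a larger window tolerates more intervening words but also admits pairs that are not syntactically related, which is why a fixed window is only a rough stand-in for actual syntactic analysis.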