Authors: Ali Harb, Michel Beigbeder, Kristine Lund, Jean-Jacques Girardot
Addresses: Ecole Nationale Superieure des Mines de Saint-Etienne, Laboratory for Information Sciences and Technology (LIST), 42023, Saint Etienne, France; Ecole Nationale Superieure des Mines de Saint-Etienne, Laboratory for Information Sciences and Technology (LIST), 42023, Saint Etienne, France; ICAR, CNRS, University of Lyon, ENS-LSH, 15 Parvis Rene Descartes, BP 7000, 69342, Lyon, France; Ecole Nationale Superieure des Mines de Saint-Etienne, Laboratory for Information Sciences and Technology (LIST), 42023, Saint Etienne, France
Abstract: Most question answering systems are based on three research themes: question classification and analysis, document retrieval, and answer extraction. The performance of each stage affects the final result. Responding correctly to a question over a large collection of textual data is not an easy task. The question must be understood at a level that allows detection of the constraints it imposes on possible answers. Question classification is an important task because it determines the type of answer expected. Its purpose is to provide additional information that reduces the gap between question and answer. This paper presents a method for improving question classification performance that combines linguistic analysis with statistical approaches. It also proposes two methods of question expansion. Various question representations, term weighting schemes, and machine learning algorithms are studied. Experiments conducted on real data are presented. Of particular interest is the improvement in the precision of question classification.
Keywords: question classification; feature selection; semantic expansion; mutual information; machine learning; text mining; question and answer systems; Q&A systems; linguistic analysis; statistics; query classification.
International Journal of Internet Technology and Secured Transactions, 2011, Vol. 3, No. 2, pp. 134-148
Available online: 19 Apr 2011