Title: Text independent root word identification in Hindi language using natural language processing
Authors: Leena Jain; Prateek Agrawal
Addresses: Department of Computer Science Engineering, Punjab Technical University, Kapurthala, Punjab, India; Department of Computer Science Engineering, Punjab Technical University, Kapurthala, Punjab, India
Abstract: In this paper, an attempt is made to parse Hindi words in order to identify the root word of an inflected word using natural language processing (NLP) techniques. Stemming is a heuristic process that chops off the ends of words to find the root word and often includes the removal of derivational affixes. It is used to improve retrieval effectiveness and to reduce the size of index files. The proposed approach can stem words that are not already stored in the database. The major application of this work is learning the Hindi language and its grammar in an interactive manner. It is also useful for building natural language translators for Hindi.
Keywords: pattern matching; suffix stripping; prefix stripping; inflected words; root words; stemming; text independent root word identification; Hindi language; natural language processing; NLP; India.
DOI: 10.1504/IJAIP.2015.073705
International Journal of Advanced Intelligence Paradigms, 2015 Vol.7 No.3/4, pp.240 - 249
Received: 28 Oct 2014
Accepted: 10 Mar 2015
Published online: 16 Dec 2015
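
The suffix stripping mentioned in the abstract and keywords can be illustrated with a minimal sketch. The rule table, the function name `find_root`, and the `min_stem_len` guard below are hypothetical choices made for illustration only; they are not the authors' rule set or algorithm, which this record does not describe. The sketch tries the longest matching suffix first and optionally appends a replacement ending, so no dictionary lookup is needed.

```python
# Hypothetical (suffix, replacement) rules for a few common Hindi noun
# inflections. This table is an assumption for illustration, not the
# rule set used in the paper.
RULES = [
    ("ियाँ", "ी"),   # e.g. feminine plural ending -> singular ending
    ("ियों", "ी"),   # e.g. oblique plural ending -> singular ending
    ("ाओं", "ा"),
    ("ें", ""),
    ("ों", ""),
]


def find_root(word, rules=RULES, min_stem_len=2):
    """Return a candidate root by stripping the longest matching suffix.

    min_stem_len (an assumed safeguard) prevents very short words from
    being reduced to almost nothing. Lengths are counted in Unicode
    code points, not visual syllables.
    """
    # Try longer suffixes first so that a long ending is not shadowed
    # by a shorter ending that it also happens to end with.
    for suffix, replacement in sorted(rules, key=lambda r: len(r[0]), reverse=True):
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)]
            if len(stem) >= min_stem_len:
                return stem + replacement
    # No rule applied: treat the word as already being a root.
    return word


if __name__ == "__main__":
    for w in ["किताबें", "किताबों", "कहानियाँ", "मालाओं", "घर"]:
        print(w, "->", find_root(w))
```

Because the rules operate on word endings alone, the function also handles words that were never stored anywhere in advance. A purely rule-based stripper like this one can still over-stem or under-stem some words, so a practical system would need a larger, carefully ordered rule set.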