Title: Text independent root word identification in Hindi language using natural language processing

Authors: Leena Jain; Prateek Agrawal

Addresses: Department of Computer Science Engineering, Punjab Technical University, Kapurthala, Punjab, India (both authors)

Abstract: In this paper, an attempt is made to parse Hindi words and identify the root word of an inflected word using natural language processing (NLP) techniques. Stemming is a heuristic process that chops off the ends of words to find the root word, and often includes the removal of derivational affixes. It is used to improve retrieval effectiveness and to reduce the size of index files. The proposed approach is able to stem words that have not previously been stored in a database. The major application of this work is in learning the Hindi language and its grammar in an interactive manner. It is also useful for building natural language translators for Hindi.
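
The suffix-stripping idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the suffix list, the function name strip_suffix and the min_stem_len parameter are all hypothetical choices made for illustration. The sketch tries the longest matching suffix first and leaves a word unchanged if stripping would leave too short a stem. Because it relies on rules rather than a stored word list, it can be applied to words it has never seen.

```python
# Illustrative only: a tiny, hypothetical list of Hindi plural/oblique
# suffixes. This is NOT the rule set used in the paper.
SUFFIXES = ["ियों", "ियाँ", "ों", "ें"]


def strip_suffix(word, suffixes=SUFFIXES, min_stem_len=2):
    """Return the word with its longest matching suffix removed.

    The word is returned unchanged if no suffix matches, or if removing
    the suffix would leave fewer than min_stem_len code points.
    """
    # Try longer suffixes first so that a short suffix does not
    # pre-empt a longer one that also matches.
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem_len = len(word) - len(suffix)
        if word.endswith(suffix) and stem_len >= min_stem_len:
            return word[:stem_len]
    return word


if __name__ == "__main__":
    for w in ["घरों", "किताबें", "घर", "में"]:
        print(w, "->", strip_suffix(w))
```

With this rule list, घरों and किताबें are reduced to घर and किताब, while घर and the short postposition में are left unchanged. A practical stemmer would need a much larger rule set, prefix handling and a treatment of exceptions.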

Keywords: pattern matching; suffix stripping; prefix stripping; inflected words; root words; stemming; text independent root word identification; Hindi language; natural language processing; NLP; India.

DOI: 10.1504/IJAIP.2015.073705

International Journal of Advanced Intelligence Paradigms, 2015 Vol.7 No.3/4, pp.240 - 249

Received: 28 Oct 2014
Accepted: 10 Mar 2015

Published online: 16 Dec 2015
