Title: Two approaches for incorporating linguistic constraints to improve the usability of Telugu dependency parser

Authors: R. Rajeswara Rao; B. Venkata Seshu Kumari

Addresses: Department of CSE, JNTUK-UCEV, Kakinada, India; Department of CSE, St. Peter's Engineering College, Hyderabad, India

Abstract: Statistical systems with high accuracy are very useful in real-world applications, and they become considerably more useful when they also capture basic linguistic information. This paper attempts to incorporate linguistic constraints into statistical dependency parsing. We consider a simple linguistic constraint: a verb should not have multiple subjects or multiple direct objects as its children in the dependency tree. We first describe the importance of this constraint, using machine translation systems that consume dependency parser output as an example application. We then show how current state-of-the-art dependency parsers violate this constraint, and describe two methods for handling it. We evaluate our methods on the state-of-the-art Telugu dependency parser. Our results show that we can build a statistical parser that handles linguistic constraints, and is thus more useful in real-world applications, without compromising accuracy.
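
To make the constraint concrete, the following is a minimal sketch of how violations could be detected in a parser's output. It is not either of the two methods described in the paper; it only flags heads that have more than one subject or more than one direct object among their children. The arc representation, the function name `find_violations`, and the label names `subj` and `dobj` are assumptions made for illustration, and the sketch assumes that any head of such an arc is a verb.

```python
from collections import Counter

# Hypothetical label names; the actual treebank tagset may differ.
UNIQUE_LABELS = {"subj", "dobj"}


def find_violations(arcs):
    """Return heads that have more than one child with the same unique label.

    arcs: iterable of (dependent_id, head_id, label) triples.
    Result: sorted list of (head_id, label, count) with count > 1.
    """
    # Count how many children of each head carry each constrained label.
    counts = Counter(
        (head, label) for _dep, head, label in arcs if label in UNIQUE_LABELS
    )
    return sorted((h, lab, n) for (h, lab), n in counts.items() if n > 1)


# Toy tree: token 3 is the root verb and has been given two subjects.
arcs = [(1, 3, "subj"), (2, 3, "subj"), (4, 3, "dobj"), (3, 0, "root")]
violations = find_violations(arcs)
```

Running such a check over a parsed corpus would give a rough count of how often a parser breaks the constraint, which is the kind of evidence the abstract refers to when it says that current parsers violate it.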

Keywords: dependency parsing; Telugu language; statistical parsing; linguistic constraints; machine learning; position-based approach; score-based approach; malt parser; Telugu treebank; natural language parsing; usability.

DOI: 10.1504/IJAPR.2016.079049

International Journal of Applied Pattern Recognition, 2016 Vol.3 No.2, pp.135 - 144

Received: 14 Sep 2015
Accepted: 16 Feb 2016

Published online: 10 Sep 2016
