Title: A distributed architecture to integrate ontological knowledge into information extraction

Authors: Anita Alicante; Massimo Benerecetti; Anna Corazza; Stefano Silvestri

Addresses: Department of Electrical Engineering and Information Technologies, Università di Napoli Federico II, via Claudio 21, 80125 Napoli, Italy (all four authors)

Abstract: In this work we propose a solution to the problem of extracting entities and relations from textual documents in order to build an index for a semantically oriented search engine. The approach is based on the integration of statistical classifiers and ontological constraints through Markov random fields. Owing to the high computational complexity of the approach, the architecture of our system is distributed and exploits parallelisation to lower processing time. In the experimental assessment we show how the proposed system can be effectively applied to a large data set, namely BioNLP-ST 2013. While the experimental results provided in the paper refer to a biomedical application, the approach is general and can be ported to different domains.
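
Illustrative sketch (not taken from the paper): the abstract describes combining statistical classifier outputs with ontological constraints through a Markov random field. The toy Python example below shows one way such a combination could look, assuming classifier probabilities act as unary potentials and an ontology's domain/range restrictions act as potentials linking a relation to its two entities. All names, labels, scores, and the brute-force inference are assumptions chosen for illustration; the authors' actual model and inference procedure may differ.

```python
import itertools
import math

# Hypothetical classifier outputs (illustrative numbers, not from the paper).
entity_scores = {
    "e1": {"Bacteria": 0.60, "Habitat": 0.40},
    "e2": {"Bacteria": 0.55, "Habitat": 0.45},
}
relation_scores = {
    ("e1", "e2"): {"Lives_In": 0.70, "None": 0.30},
}

# Hypothetical ontological constraint: relation -> (domain type, range type).
ontology = {"Lives_In": ("Bacteria", "Habitat")}

# Assumed soft penalty for assignments that violate the ontology.
PENALTY = 1e-6


def compatibility(rel, head_type, tail_type):
    """Potential linking a relation label to the types of its two entities."""
    if rel == "None":
        return 1.0
    return 1.0 if ontology[rel] == (head_type, tail_type) else PENALTY


def map_assignment(entity_scores, relation_scores):
    """Exhaustively search for the joint labelling with the highest log-score.

    Brute force is only feasible for tiny graphs; it stands in here for
    whatever inference method a real system would use.
    """
    ents = sorted(entity_scores)
    rels = sorted(relation_scores)
    ent_label_sets = [list(entity_scores[e]) for e in ents]
    rel_label_sets = [list(relation_scores[r]) for r in rels]

    best, best_score = None, -math.inf
    for ent_labels in itertools.product(*ent_label_sets):
        etypes = dict(zip(ents, ent_labels))
        for rel_labels in itertools.product(*rel_label_sets):
            # Unary potentials: classifier confidence for each entity.
            score = sum(math.log(entity_scores[e][etypes[e]]) for e in ents)
            for (head, tail), rel in zip(rels, rel_labels):
                # Unary potential for the relation, plus the ontology potential.
                score += math.log(relation_scores[(head, tail)][rel])
                score += math.log(compatibility(rel, etypes[head], etypes[tail]))
            if score > best_score:
                best_score = score
                best = (etypes, dict(zip(rels, rel_labels)))
    return best


if __name__ == "__main__":
    types, relations = map_assignment(entity_scores, relation_scores)
    print(types)
    print(relations)
```

In this toy setting the entity classifier alone would label e2 as Bacteria, but the joint assignment prefers Habitat because that choice is consistent with the confidently predicted Lives_In relation. Exhaustive search grows exponentially with the number of variables, which hints at why the abstract stresses computational complexity and parallelisation.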

Keywords: support vector machines; SVM; information extraction; graphical models; entity classification; relation extraction; relation classification; knowledge integration; ontological constraints; Markov random fields; distributed architecture; ontology; text documents; indexing; semantically oriented search engines; parallel processing; biomedical applications.

DOI: 10.1504/IJGUC.2016.081011

International Journal of Grid and Utility Computing, 2016 Vol.7 No.4, pp.245 - 256

Received: 07 Feb 2015
Accepted: 29 Aug 2015

Published online: 14 Dec 2016
