Title: A generic and high-performance RDF instance generator

Authors: Tanguy Raynaud; Samir Amir; Rafiqul Haque

Addresses: LIRIS, Computer Science Department, Claude Bernard University Lyon 1, Villeurbanne, France; LIRIS, Computer Science Department, Claude Bernard University Lyon 1, Villeurbanne, France; LIRIS, Computer Science Department, Claude Bernard University Lyon 1, Villeurbanne, France

Abstract: This paper presents the design and implementation of a novel OWL-based RDF instance generator. Technologies are often tested experimentally to verify their behaviour or their suitability for a given use, and such experiments require data. Because real-world data are in many cases not readily available or accessible, synthetic data are used instead. Existing solutions can generate datasets of RDF triples, but they are locked in to specific ontologies, which in our view is a critical limitation because it makes them inflexible. In this paper, we present GAIA, a generic RDF data generator that allows users to generate RDF triples conforming to any ontology. GAIA is built on an in-memory architecture and relies on parallelisation techniques that guarantee high performance. Experimental results show that GAIA performs reasonably well and can handle large-scale ontologies such as NCBI.

Keywords: semantic web; optimisation; big data; ontology; RDF triples; RDF instance generation; resource description framework; OWL; parallelisation.

DOI: 10.1504/IJWET.2016.077342

International Journal of Web Engineering and Technology, 2016 Vol.11 No.2, pp.133 - 152

Published online: 28 Jun 2016
