Title: Why provenance of SPARQL 1.1 queries

Authors: Anastasia Analyti

Addresses: Institute of Computer Science, FORTH-ICS, Greece

Abstract: In this paper, we study, and provide algorithms for, the source provenance of answers to extended SPARQL queries. Extended SPARQL queries extend SPARQL 1.1 queries to support not just a single dataset but multiple datasets, each in a particular context. For example, normal subqueries, aggregate subqueries, and (NOT) EXISTS filter subqueries may (optionally) have their own dataset. Additionally, GRAPH patterns can query multiple RDF graphs from the local FROM NAMED dataset rather than just one. For monotonic queries, each source why provenance set that we derive for an answer mapping is a minimal set of sources appearing in the query such that, if these sources are kept as they are and all other sources are considered empty, the same answer mapping is derived. We show that this property does not hold for non-monotonic queries. Among other uses, knowing the source why provenance is critical for judging confidence in an answer, assessing information quality, supporting accountability, and understanding the temporal and spatial status of information.

Keywords: extended SPARQL queries; query pattern source why sets; source why provenance; algorithms.

DOI: 10.1504/IJWET.2024.142214

International Journal of Web Engineering and Technology, 2024 Vol.19 No.3, pp.232 - 266

Received: 13 Dec 2023
Accepted: 15 Jun 2024

Published online: 14 Oct 2024
