Title: Why provenance of SPARQL 1.1 queries
Authors: Anastasia Analyti
Addresses: Institute of Computer Science, FORTH-ICS, Greece
Abstract: In this paper, we study source-provenance of answers to extended SPARQL queries and provide algorithms for computing it. Extended SPARQL queries extend SPARQL 1.1 queries to support not just a single dataset but multiple datasets, each in a particular context. For example, normal subqueries, aggregate subqueries, and (NOT) EXISTS filter subqueries may (optionally) have their own dataset. Additionally, GRAPH patterns can query multiple RDF graphs from the local FROM NAMED dataset and not just one. For monotonic queries, each source why provenance set that we derive for an answer mapping is a minimal set of sources appearing in the query such that, if these sources are kept as they are while the rest of the sources are considered empty, we derive the same answer mapping. We show that this property does not hold for non-monotonic queries. Among other things, knowing source why provenance is of critical importance for judging confidence in the answer, assessing information quality, supporting accountability, and understanding the temporal and spatial status of information.
Keywords: extended SPARQL queries; query pattern source why sets; source why provenance; algorithms.
DOI: 10.1504/IJWET.2024.142214
International Journal of Web Engineering and Technology, 2024 Vol.19 No.3, pp.232 - 266
Received: 13 Dec 2023
Accepted: 15 Jun 2024
Published online: 14 Oct 2024
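
The minimality property that the abstract states for monotonic queries can be illustrated with a small brute-force sketch. This is not the paper's algorithm. It only enumerates subsets of sources, treats every source outside the subset as empty, and keeps the minimal subsets that still produce the answer. The function `minimal_why_sets`, the toy query `evaluate`, and the source names `g1`, `g2`, `g3` are all hypothetical and introduced here for illustration only.

```python
from itertools import combinations


def minimal_why_sets(sources, evaluate, answer):
    """Return the minimal sets of source names that still yield `answer`.

    sources  : dict mapping a source name to a set of (s, p, o) triples
    evaluate : function taking such a dict and returning a set of answers
    answer   : the answer whose provenance is wanted

    Brute force over subsets, smallest first (illustrative assumption,
    not the algorithm from the paper).
    """
    names = sorted(sources)
    found = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            kept = set(subset)
            # A smaller witness is already contained in this subset,
            # so this subset cannot be minimal.
            if any(prev <= kept for prev in found):
                continue
            # Keep the chosen sources as they are; empty all the others.
            restricted = {
                name: (sources[name] if name in kept else set())
                for name in names
            }
            if answer in evaluate(restricted):
                found.append(frozenset(kept))
    return found


def evaluate(sources):
    """Toy monotonic query: who works at an organisation located in Greece?"""
    triples = set().union(*sources.values())
    return {
        s
        for (s, p, o) in triples
        if p == "worksAt" and (o, "locatedIn", "greece") in triples
    }


# Hypothetical sources: g1 and g3 both state where alice works,
# g2 states where the organisation is located.
toy_sources = {
    "g1": {("alice", "worksAt", "forth")},
    "g2": {("forth", "locatedIn", "greece")},
    "g3": {("alice", "worksAt", "forth")},
}

why_sets = minimal_why_sets(toy_sources, evaluate, "alice")
```

In this toy setting the answer `alice` has two minimal source sets, `{g1, g2}` and `{g2, g3}`: either copy of the employment triple suffices, but `g2` is needed in both. The enumeration is exponential in the number of sources, so it is only meant to make the definition concrete.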