
Fully implement sparql1.1 query/update recommendations

Carlos Garnacho requested to merge wip/carlosg/sparql1.1 into master

This branch fully (to a known extent) implements the query and update syntax described at https://www.w3.org/TR/sparql11-query/ and https://www.w3.org/TR/sparql11-update/. The big features are:

  • Full as-spec graph semantics
    • Support for named graphs
    • FROM / FROM NAMED / USING / GRAPH and other graph specifiers are correctly honored
    • Support for a default graph, currently the union graph of the unnamed graph and all named graphs.
    • Support for CREATE/DROP/ADD/MOVE/COPY/CLEAR
  • Implemented missing builtin functions
    • TZ/TIMEZONE
    • STRUUID/UUID/BNODE
    • LANGMATCHES/STRLANG
    • URI/IRI
    • SAMPLE
    • STRDT, isLiteral, isBlank, ...
  • Implemented SERVICE {} syntax to access remote SPARQL endpoints
  • Implemented DESCRIBE syntax to extract portions of the stored data
  • Implemented CONSTRUCT syntax to build arbitrary RDF triples out of stored data
  • Implemented LOAD to load RDF from external resources
  • Implemented VALUES syntax to provide specific tuples
  • rdf:langString is now allowed as a basic type
  • BASE is honored now
  • time data now has microsecond precision

This involved the following changes to our code:

  • The journal was dropped. This is perhaps untimely, since we may lose user data in the transition, but as tracker is mostly a cache we shouldn't lose any important data (the worst I can think of is gnome-music playlists...)
  • The database format changed considerably, and database version was thus bumped.
    • Graphs are now handled as separate databases we attach. All databases have the same on-disk format, with the exception of meta.db (which serves for the unnamed graph) that has some additional accounting tables
    • Temporary views are created to union all databases, and serve as the "default graph"
    • Date/time format changed on disk: it is now a single column that can hold either a unix timestamp or an ISO 8601 datetime string
    • Langstrings are stored as BLOBs in "text\0encoding" format
    • Some metadata that we used to store in separate files in ~/.cache/tracker is now stored in meta.db, in an attempt to make the database more self-contained.
    • The SERVICE {} SPARQL syntax is implemented through a virtual table
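
To make the attached-database layout above more concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It is not the code in this branch: the single `triples` table, the `graph_N` schema aliases, the `union_graph` view name, the use of in-memory databases and the UTF-8 byte encoding are all assumptions made for illustration. It only shows the general idea of one database per graph, a temporary view that unions them, and a "text\0encoding" BLOB for langstrings.

```python
import sqlite3


def encode_langstring(text, lang):
    # "text\0encoding" layout from the list above; UTF-8 bytes are an assumption.
    return text.encode("utf-8") + b"\0" + lang.encode("utf-8")


def decode_langstring(blob):
    # Split at the first NUL byte: text before it, language part after it.
    text, _, lang = blob.partition(b"\0")
    return text.decode("utf-8"), lang.decode("utf-8")


def open_store(graph_names):
    """Open a main database (unnamed graph) and attach one database per named graph.

    Table, alias and view names are hypothetical, chosen for this sketch only.
    """
    # Autocommit mode, since ATTACH cannot run inside an open transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE main.triples (subject TEXT, predicate TEXT, object)")
    selects = [
        "SELECT NULL AS graph_name, subject, predicate, object FROM main.triples"
    ]
    for i, name in enumerate(graph_names):
        schema = f"graph_{i}"
        # Each ':memory:' attach creates a separate database with the same layout.
        conn.execute(f"ATTACH DATABASE ':memory:' AS {schema}")
        conn.execute(
            f"CREATE TABLE {schema}.triples (subject TEXT, predicate TEXT, object)"
        )
        literal = name.replace("'", "''")  # escape quotes for the SQL string literal
        selects.append(
            f"SELECT '{literal}' AS graph_name, subject, predicate, object "
            f"FROM {schema}.triples"
        )
    # Only TEMP views may reference tables in other attached databases.
    conn.execute("CREATE TEMP VIEW union_graph AS " + " UNION ALL ".join(selects))
    return conn


# Usage: one unnamed graph plus one named graph, queried through the view.
store = open_store(["http://example.org/g1"])
store.execute(
    "INSERT INTO main.triples VALUES (?, ?, ?)",
    ("urn:a", "rdfs:label", encode_langstring("hola", "es")),
)
store.execute(
    "INSERT INTO graph_0.triples VALUES (?, ?, ?)",
    ("urn:b", "rdfs:label", encode_langstring("hello", "en")),
)
all_rows = store.execute(
    "SELECT graph_name, subject, object FROM union_graph ORDER BY subject"
).fetchall()
```

Querying `union_graph` returns rows from every database, which is roughly the role the "default graph" plays here, while querying `graph_0.triples` directly restricts the result to a single named graph.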

As unfortunate as database version bumps are, this proved the best way to handle some of the syntax we were missing. Over the development and testing of this branch I've grown fairly confident in these changes. There are some things we could be stricter about (e.g. blank node semantics), but all of those could be addressed at the query generation level.
