Closed
Created Mar 30, 2022 by Carlos Garnacho (@carlosg, Maintainer) · 5 of 9 tasks completed

Support deserialization of formats

It would be great to have a:

void tracker_sparql_connection_deserialize_async (TrackerSparqlConnection *conn,
                                                  TrackerSerializeFlags    flags,
                                                  TrackerRdfFormat         format,
                                                  const char              *default_graph,
                                                  GInputStream            *stream,
                                                  GCancellable            *cancellable,
                                                  GAsyncReadyCallback      callback,
                                                  gpointer                 user_data);

gboolean tracker_sparql_connection_deserialize_finish (TrackerSparqlConnection  *connection,
                                                       GAsyncResult             *result,
                                                       GError                  **error);

API to complement tracker_sparql_connection_serialize_async(). This call would do the inverse step: read a GInputStream and incorporate its data into the connection database.

Internally, a TrackerDeserializer could be a subclass of TrackerSparqlCursor that takes an input stream, so that its data can be traversed just as if it came directly from a DESCRIBE query. This is nicely symmetrical with serializers, which do the inverse step (getting an input stream from a cursor), and may allow e.g. daisy chaining the two to convert between formats.
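The cursor/stream symmetry can be sketched in plain C without GLib. This is only an illustration under stated assumptions: `Cursor`, `LineDeserializer`, `serialize_tsv()` and `convert()` are hypothetical stand-ins rather than Tracker API, and the one-triple-per-line input grammar is a toy, not a real RDF format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for TrackerSparqlCursor: a base "class" with a
 * single virtual method that advances to the next row. */
typedef struct Cursor Cursor;
struct Cursor {
    /* Returns 1 and fills the fields below if a row was read, 0 at end. */
    int (*next)(Cursor *self);
    char subject[64];
    char predicate[64];
    char object[64];
};

/* Hypothetical stand-in for a TrackerDeserializer subclass: a cursor whose
 * rows come from an input stream instead of a query. */
typedef struct {
    Cursor parent; /* first member, so a LineDeserializer* is also a Cursor* */
    FILE *stream;
} LineDeserializer;

static int line_deserializer_next(Cursor *self)
{
    LineDeserializer *d = (LineDeserializer *)self;
    char line[256];

    while (fgets(line, sizeof line, d->stream) != NULL) {
        /* Toy grammar (an assumption): "subject predicate object ." */
        if (sscanf(line, "%63s %63s %63s",
                   self->subject, self->predicate, self->object) == 3)
            return 1;
        /* Blank or malformed lines are skipped. */
    }
    return 0;
}

static void line_deserializer_init(LineDeserializer *d, FILE *stream)
{
    assert(stream != NULL);
    d->parent.next = line_deserializer_next;
    d->stream = stream;
}

/* Serializer side: consumes any cursor and writes one tab-separated row per
 * line. Returns the number of rows written. */
static int serialize_tsv(Cursor *cursor, FILE *out)
{
    int rows = 0;

    while (cursor->next(cursor)) {
        fprintf(out, "%s\t%s\t%s\n",
                cursor->subject, cursor->predicate, cursor->object);
        rows++;
    }
    return rows;
}

/* Daisy chain: input stream -> deserializer (a cursor) -> serializer. */
static int convert(FILE *in, FILE *out)
{
    LineDeserializer d;

    line_deserializer_init(&d, in);
    return serialize_tsv(&d.parent, out);
}
```

Because `serialize_tsv()` only sees the `Cursor` interface, the same function would work whether the rows come from a stream, as here, or from a query result, which is the point of making the deserializer a cursor subclass.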

To unify all format parsing in this infrastructure, while staying as conservative about exposed API as we are with deserializers, we would need closer interaction between libtracker-data and libtracker-sparql. A possible roadmap I have in mind is:

  • Fold libtracker-fts into libtracker-data, and libtracker-data into libtracker-sparql/core. This avoids the intermediate libtracker-data static library, along with its access being limited to the libtracker-sparql public API only. (!504 (merged))
  • Add TrackerDeserializer base class (!506 (merged))
  • Refactor TrackerTurtleReader to be a subclass of it, and change the places in libtracker-data (ontology parsing, RDF loading) to use the new API (!506 (merged))
  • Add further subclasses for the remaining TrackerSerializerFormat values (!507 (merged))
  • Add and document the API (!516 (merged))

Bonus points:

  • Add internal TrackerDeserializer implementation to convert a tree of TrackerResource into a cursor.
  • Reimplement tracker_resource_print_jsonld() and tracker_resource_print_turtle() over TrackerDeserializer/TrackerSerializer.
  • Implement https://www.w3.org/TR/2013/REC-sparql11-service-description-20130321/ for HTTP endpoints by serializing a fixed TrackerResource+TrackerNamespaceManager into the requested RDF format.
  • Improve the CLI for backups, and maybe for conversion between formats.
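As a rough idea of what the first bonus item (a resource tree exposed as a cursor) might look like, here is a minimal plain-C sketch. Everything in it is an assumption for illustration: `Resource`, `Property` and `ResourceCursor` are hypothetical toy types, not TrackerResource or TrackerSparqlCursor, and the depth-first row order is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 16 /* arbitrary nesting limit for this sketch */

typedef struct Resource Resource;

/* A property value is either a literal or a nested resource. */
typedef struct {
    const char *predicate;
    const char *literal;   /* set for plain values, else NULL */
    const Resource *child; /* set for nested resources, else NULL */
} Property;

struct Resource {
    const char *id;
    const Property *props;
    size_t n_props;
};

/* Cursor over the tree: an explicit stack of (resource, next property). */
typedef struct {
    const Resource *res[MAX_DEPTH];
    size_t idx[MAX_DEPTH];
    int depth; /* number of frames currently on the stack */
    const char *subject, *predicate, *object; /* current row */
} ResourceCursor;

static void resource_cursor_init(ResourceCursor *c, const Resource *root)
{
    assert(root != NULL);
    c->res[0] = root;
    c->idx[0] = 0;
    c->depth = 1;
    c->subject = c->predicate = c->object = NULL;
}

/* Advances to the next (subject, predicate, object) row.
 * Returns 1 if a row is available, 0 once the whole tree was visited. */
static int resource_cursor_next(ResourceCursor *c)
{
    while (c->depth > 0) {
        const Resource *r = c->res[c->depth - 1];
        size_t i = c->idx[c->depth - 1];
        const Property *p;

        if (i == r->n_props) { /* resource exhausted, pop its frame */
            c->depth--;
            continue;
        }
        c->idx[c->depth - 1] = i + 1;
        p = &r->props[i];
        c->subject = r->id;
        c->predicate = p->predicate;
        if (p->child != NULL) {
            /* Emit the link to the child now; its own rows come next. */
            c->object = p->child->id;
            assert(c->depth < MAX_DEPTH);
            c->res[c->depth] = p->child;
            c->idx[c->depth] = 0;
            c->depth++;
        } else {
            c->object = p->literal;
        }
        return 1;
    }
    return 0;
}
```

A caller would initialize the cursor with the root resource and loop on `resource_cursor_next()`, handing each row to whichever serializer produces the requested format.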

With this API in place, all the tools are there to implement anything from backup recovery to data migrations over a network. It would be a nice byproduct if things became more self-contained and testable, and we could neatly implement missing bits of the SPARQL spec :).

Edited Jul 01, 2022 by Carlos Garnacho