Support deserialization of formats
It would be great to have the following API:

    void tracker_sparql_connection_deserialize_async (TrackerSparqlConnection *conn,
                                                      TrackerSerializeFlags    flags,
                                                      TrackerRdfFormat         format,
                                                      const char              *default_graph,
                                                      GInputStream            *stream,
                                                      GCancellable            *cancellable,
                                                      GAsyncReadyCallback      callback,
                                                      gpointer                 user_data);

    gboolean tracker_sparql_connection_deserialize_finish (TrackerSparqlConnection  *connection,
                                                           GAsyncResult             *result,
                                                           GError                  **error);

It would complement tracker_sparql_connection_serialize_async() by doing the inverse step: reading a GInputStream and incorporating the data into the connection database.
TrackerDeserializer could be a subclass of TrackerSparqlCursor that takes an input stream, so that its data can be traversed just as if it came directly from a DESCRIBE query. This is nicely symmetrical with serializers, which do the inverse step (getting an input stream from a cursor), and may allow e.g. daisy chaining to convert between formats.
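To make the symmetry concrete, here is a minimal toy model in plain C (no GLib, no Tracker). Every name in it (ToyCursor, ToyDeserializer, toy_serialize_tsv, toy_convert) is hypothetical and only stands in for the idea; the input format is a simplified "subject predicate object ." line with whitespace-separated tokens, not real Turtle, and a string buffer stands in for GInputStream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a cursor: each successful next() exposes one triple. */
typedef struct ToyCursor ToyCursor;
struct ToyCursor {
    int (*next)(ToyCursor *self);   /* returns 1 while a row is available */
    char subject[64];
    char predicate[64];
    char object[64];
};

/* Toy "deserializer": a cursor subclass that reads from an input buffer
 * (hypothetical stand-in for a GInputStream). */
typedef struct {
    ToyCursor base;     /* must be first so the cursor pointer can be cast */
    const char *pos;    /* current read position in the input text */
} ToyDeserializer;

static int toy_deserializer_next(ToyCursor *self)
{
    ToyDeserializer *d = (ToyDeserializer *) self;
    int consumed = 0;

    /* Parse one "subject predicate object ." statement; %n records how far
     * we read and stays 0 if the terminating dot is missing. */
    if (sscanf(d->pos, " %63s %63s %63s .%n",
               self->subject, self->predicate, self->object, &consumed) != 3
        || consumed == 0)
        return 0;

    d->pos += consumed;
    return 1;
}

/* Toy "serializer": drains any cursor into an output buffer as
 * tab-separated rows. Returns the number of rows written, or -1 if the
 * buffer is too small. */
static int toy_serialize_tsv(ToyCursor *cursor, char *out, size_t out_size)
{
    size_t used = 0;
    int rows = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    while (cursor->next(cursor)) {
        int n = snprintf(out + used, out_size - used, "%s\t%s\t%s\n",
                         cursor->subject, cursor->predicate, cursor->object);
        if (n < 0 || (size_t) n >= out_size - used)
            return -1;
        used += (size_t) n;
        rows++;
    }
    return rows;
}

/* Daisy chain: input text -> deserializer cursor -> serializer -> output. */
int toy_convert(const char *input, char *out, size_t out_size)
{
    ToyDeserializer d = { .base = { .next = toy_deserializer_next }, .pos = input };

    return toy_serialize_tsv(&d.base, out, out_size);
}
```

The serializer only sees the cursor interface, so in this sketch it would not matter whether the rows come from a parsed stream or from a query; that is the property that would make format conversion a matter of chaining the two.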
In order to unify all format parsing in this infrastructure, while staying as conservative about exposed API as we are with deserializers, we would need increased interaction between libtracker-data and libtracker-sparql. A possible roadmap I have in mind is:
- Fold libtracker-fts into libtracker-data, and libtracker-data into libtracker-sparql/core. This avoids the intermediate libtracker-data static library, and avoids its access being limited to the libtracker-sparql public API only. (!504 (merged))
- Add TrackerDeserializer base class (!506 (merged))
- Refactor TrackerTurtleReader to be a subclass of it, and change the places in libtracker-data (ontology parsing, RDF loading) to use the new API (!506 (merged))
- Add further subclasses for the remaining
- Add and document the API (!516 (merged))
- Add an internal TrackerDeserializer implementation to convert a tree of TrackerResource into a cursor. (!522 (merged))
- Implement tracker_resource_print_turtle() over TrackerDeserializer/TrackerSerializer. (!522 (merged))
- Implement SPARQL 1.1 Service Description (https://www.w3.org/TR/2013/REC-sparql11-service-description-20130321/) for HTTP endpoints by serializing a fixed TrackerResource+TrackerNamespaceManager into the requested RDF format. (!522 (merged))
- Improve the CLI for backups, and maybe for conversion between formats.
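The roadmap item about converting a tree of TrackerResource into a cursor can be pictured with another toy sketch in plain C. Nothing here is Tracker API: ToyResource, ToyProperty, ToyTriple, ToyRowCursor and both functions are hypothetical names, and the sketch assumes the resources form a tree (no cycles) with a small fixed number of properties each.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PROPS 4

typedef struct ToyResource ToyResource;

/* A property value is either a literal string or a nested resource. */
typedef struct {
    const char *predicate;
    const char *literal;        /* used when child is NULL */
    const ToyResource *child;   /* nested resource, or NULL */
} ToyProperty;

struct ToyResource {
    const char *id;
    ToyProperty props[TOY_MAX_PROPS];
    size_t n_props;
};

typedef struct {
    const char *subject;
    const char *predicate;
    const char *object;
} ToyTriple;

/* Depth-first walk: emit the triple that links to a value, then the nested
 * resource's own triples. Starts writing at rows[used] and returns the new
 * number of used rows; stops silently once capacity is reached. */
size_t toy_resource_flatten(const ToyResource *res, ToyTriple *rows,
                            size_t capacity, size_t used)
{
    for (size_t i = 0; i < res->n_props && used < capacity; i++) {
        const ToyProperty *p = &res->props[i];

        rows[used].subject = res->id;
        rows[used].predicate = p->predicate;
        rows[used].object = p->child ? p->child->id : p->literal;
        used++;

        if (p->child)
            used = toy_resource_flatten(p->child, rows, capacity, used);
    }
    return used;
}

/* Cursor over the flattened rows: returns the next triple, or NULL at end. */
typedef struct {
    const ToyTriple *rows;
    size_t n_rows;
    size_t index;
} ToyRowCursor;

const ToyTriple *toy_row_cursor_next(ToyRowCursor *cursor)
{
    return cursor->index < cursor->n_rows ? &cursor->rows[cursor->index++] : NULL;
}
```

Once a resource tree is exposed row by row like this, any consumer written against the cursor interface (a serializer, for instance) could print it in whatever format it supports, which is roughly the reuse the roadmap is after.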
With this API in place, we would have all the tools to implement anything from backup recovery to data migrations over a network. It is a nice byproduct that things become more self-contained and testable, and that we can neatly implement missing bits of the SPARQL spec :).