libtracker-sparql/direct: Use TrackerBatch beneath update_array_async (!388)

Carlos Garnacho requested to merge wip/carlosg/update-array-over-batches into master Mar 30, 2021

When using update_array_async(), we attempt to process the entire set of updates as a single transaction. That involves handling it as one SPARQL string, i.e. all updates concatenated into a separate copy in memory.

Since this single huge string may be duplicated again in libtracker-data internals (e.g. when unescaping \u and \U sequences), the memory cost of handling it as one string can grow further.

Use TrackerBatch underneath instead. Queries are then treated individually for parsing purposes, any such string duplication happens over these smaller chunks, and memory does not peak as much with large sets of updates.
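As a rough illustration of why per-update handling lowers the peak, here is a minimal sketch in plain C. It is not Tracker code: the helper names `peak_concatenated` and `peak_per_update` are hypothetical, and the model assumes exactly one extra duplicate of whatever string is being processed (standing in for something like the unescaping pass), on top of the caller's own array of updates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not part of libtracker-sparql.
 * Extra bytes alive at the peak when all updates are first joined into
 * one string and that joined string is then duplicated once. */
size_t peak_concatenated(const char *const *updates, size_t n)
{
    size_t total = 0;

    assert(updates != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += strlen(updates[i]) + 1; /* +1: separator or final NUL */

    return 2 * total; /* joined copy + one duplicate of it */
}

/* Extra bytes alive at the peak when each update is processed on its
 * own: only one update is duplicated at a time, so the largest one
 * sets the peak. */
size_t peak_per_update(const char *const *updates, size_t n)
{
    size_t largest = 0;

    assert(updates != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(updates[i]) + 1; /* +1: final NUL */
        if (len > largest)
            largest = len;
    }

    return largest;
}
```

Under this simplified model the concatenated path scales with the sum of all update lengths, while the per-update path scales only with the longest single update. Real heap behaviour also depends on allocator and parser details not modelled here.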

Locally, this reduced the peak heap usage from 95MB to 85MB in tracker-miner-fs-3 when dealing with document metadata coming from tracker-extract-3.

