libtracker-sparql/direct: Use TrackerBatch beneath update_array_async

Carlos Garnacho requested to merge wip/carlosg/update-array-over-batches into master

When using update_array_async(), we attempt to process the entire set of updates as one transaction. That involves handling it as a single SPARQL string, i.e. all updates concatenated into a separate copy in memory.

Since this single huge string may be duplicated again inside libtracker-data internals (e.g. when unescaping \u and \U sequences), the memory cost of handling it as one string can get even worse.

Use TrackerBatch underneath instead. This means queries are treated individually for parsing purposes, so any such string duplication happens over these smaller chunks, and memory does not peak as much with large sets of updates.
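As a rough illustration of why this helps, here is a simplified model of the extra memory alive at peak in each approach. It is only a sketch and not Tracker code: the function names `peak_concatenated` and `peak_batched` are hypothetical, and it assumes exactly one working duplicate (e.g. for unescaping) per parsed string, with nothing else allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not libtracker-data code.
 * Old approach: all updates are joined into one string, and that
 * joined string is assumed to be duplicated once more while parsing. */
static size_t
peak_concatenated (const char *const *updates, size_t n)
{
    size_t total = 0;

    assert (updates != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        total += strlen (updates[i]) + 1; /* +1: separator or terminator */

    /* Joined copy plus one working duplicate of the joined copy. */
    return 2 * total;
}

/* New approach: each update is parsed on its own, so only the working
 * duplicate of a single update is assumed to be alive at any time. */
static size_t
peak_batched (const char *const *updates, size_t n)
{
    size_t largest = 0;

    assert (updates != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen (updates[i]) + 1; /* +1: terminator */

        if (len > largest)
            largest = len;
    }

    return largest;
}
```

Under these assumptions the concatenated path scales with the sum of all update lengths, while the batched path scales with the longest single update. The real saving depends on what else is allocated at the time.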

Locally, this reduced peak heap usage in tracker-miner-fs-3 from 95 MB to 85 MB when dealing with document metadata coming from tracker-extract-3.
