Optimizations to tracker-miner-fs-3 file handling
(This branch depends on !370 (merged))
With a scheme in place for stable URNs, we can apply some optimizations to the way we process files:
- Since the URNs are known without querying the database after insertion, we no longer need to block until a TrackerBatch completes before continuing with file processing. To avoid piling things up in memory while they are flushed, we still keep at most 2 batches (the one being executed, and the next one being built up).
- To lower peak memory usage, TrackerFileNotifier is involved in this flow control, so it progressively feeds files to be processed instead of racing across them unchecked. This has the nice effect of keeping memory usage stable across big filesystem layouts.
- Since peak memory is lower and stable, we can afford to increase the batch size.
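For illustration only, the flow control described above can be sketched in a few lines of Python. This is not the miner's C code: `stable_urn`, `crawl`, `flush` and `index_files` are hypothetical names, the lazy generator merely stands in for TrackerFileNotifier, and a single worker thread stands in for the asynchronous batch execution. It assumes only what the list states: URNs are known up front, and at most two batches exist at once.

```python
from concurrent.futures import ThreadPoolExecutor


def stable_urn(path):
    """Hypothetical stand-in for the stable URN scheme (no database lookup)."""
    return "urn:example:" + path


def index_files(paths, batch_size=3):
    """Index `paths` lazily, keeping at most two batches alive at any time."""
    stats = {"produced": 0, "flushed": 0, "max_backlog": 0}
    flushed_batches = []

    def crawl():
        # Stand-in for the notifier: hands out a file only when asked for one.
        for path in paths:
            stats["produced"] += 1
            yield path

    def flush(batch):
        # Stand-in for executing a batch against the database.
        backlog = stats["produced"] - stats["flushed"]
        stats["max_backlog"] = max(stats["max_backlog"], backlog)
        flushed_batches.append(batch)
        stats["flushed"] += len(batch)

    executing = None  # future for the batch being flushed, if any
    building = []     # batch currently being filled
    with ThreadPoolExecutor(max_workers=1) as pool:
        for path in crawl():
            # The URN is known up front, so nothing has to be read back
            # from the database before moving on to the next file.
            building.append((stable_urn(path), path))
            if len(building) >= batch_size:
                if executing is not None:
                    # Block only when a third batch would otherwise be needed.
                    executing.result()
                executing = pool.submit(flush, building)
                building = []
        if executing is not None:
            executing.result()
        if building:
            pool.submit(flush, building).result()
    return flushed_batches, stats


if __name__ == "__main__":
    batches, stats = index_files(["/f%d" % i for i in range(10)], batch_size=3)
    print([len(b) for b in batches], stats)
```

Because the crawler is only pulled from while there is room in the batch being built, the number of files discovered but not yet flushed never exceeds two batches' worth, no matter how large the tree is. That bound is what makes a larger batch size affordable.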
The end result is both faster and shows fewer memory usage fluctuations while tracker-miner-fs-3 is working across a set of files.
Freshly indexing a set of 111735 elements (incl. 18222 folders), with this branch:
--------------------------------------------------------------------------------
Command: /home/carlos/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n
Massif arguments: (none)
ms_print arguments: massif.out.1313319
--------------------------------------------------------------------------------
[Massif heap graph: y-axis in MB, peak 67.73 MB; x-axis in instructions executed, 0 to 164.0 Gi]
$ time ~/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n >/dev/null
real 0m46,519s
user 1m10,400s
sys 0m5,623s
Without this branch:
--------------------------------------------------------------------------------
Command: /home/carlos/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n
Massif arguments: (none)
ms_print arguments: massif.out.1312258
--------------------------------------------------------------------------------
[Massif heap graph: y-axis in MB, peak 131.7 MB; x-axis in instructions executed, 0 to 168.0 Gi]
$ time ~/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n >/dev/null
real 1m14,147s
user 1m5,653s
sys 0m6,468s
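To put the two runs side by side, the relative change can be worked out from the figures quoted above (Massif peak heap and `real` time). The helper names are illustrative only, and the comparison rests on a single run per branch, so it is a rough indication and not a benchmark.

```python
def to_seconds(minutes, seconds):
    """Convert the 'XmY,ZZZs' figures printed by `time` into seconds."""
    return minutes * 60 + seconds


def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Figures copied from the two runs quoted above.
peak_mb = {"without": 131.7, "with": 67.73}
real_s = {"without": to_seconds(1, 14.147), "with": to_seconds(0, 46.519)}

peak_drop = reduction_pct(peak_mb["without"], peak_mb["with"])
time_drop = reduction_pct(real_s["without"], real_s["with"])

if __name__ == "__main__":
    print("peak heap: %.1f%% lower" % peak_drop)
    print("wall clock: %.1f%% lower" % time_drop)
```

This works out to roughly 49% less peak heap and about 37% less wall-clock time for this data set.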
Combined with !371 (merged), this brings the first index down to ~40MB of peak memory and ~35s when fanotify is enabled.
P.S.: This is the last thing I've got in queue, I promise