Optimizations to tracker-miner-fs-3 file handling

Carlos Garnacho requested to merge wip/carlosg/miner-files-queues into master

(This branch depends on !370 (merged))

With a scheme in place for stable URNs, we can apply some optimizations to the way we process files:

  • Since the URNs are known without querying the database after insertion, we no longer need to block until a TrackerBatch completes before continuing with file processing. To avoid piling things up in memory while batches are flushed, we still keep at most 2 batches (the one being executed and the next one being built).

  • To lower peak memory usage, involve TrackerFileNotifier in this flow control, so that it feeds files to be processed progressively instead of racing through them unchecked. This has the nice effect of keeping memory usage stable across big filesystem layouts.

  • Since the peak memory is lower and stable, we can afford to increase the batch size.
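
The flow control described in the bullets above can be sketched roughly as follows. This is only an illustrative model in Python, not the actual tracker-miner-fs-3 code (which is C/GLib): the `Pipeline` class, its `feed`/`complete` methods and the `batch_size` value are hypothetical names chosen for the example. The file source stands in for TrackerFileNotifier, and `complete()` stands in for the asynchronous completion of a TrackerBatch.

```python
class Pipeline:
    """Toy model: at most two batches alive, source pulled only on demand."""

    def __init__(self, source, batch_size):
        self.source = iter(source)    # stands in for TrackerFileNotifier
        self.batch_size = batch_size  # hypothetical value, not Tracker's
        self.building = []            # batch currently being filled
        self.executing = None         # batch currently being flushed
        self.committed = []           # items already written out
        self.exhausted = False
        self.peak = 0                 # most items ever held in memory

    def feed(self):
        """Pull items until both batch slots are full or the source ends."""
        while not self.exhausted:
            if len(self.building) == self.batch_size:
                if self.executing is not None:
                    return            # both slots full: stop pulling files
                self._start()
            item = next(self.source, None)
            if item is None:
                self.exhausted = True
                break
            self.building.append(item)
            held = len(self.building) + len(self.executing or [])
            self.peak = max(self.peak, held)
        # Source is drained: flush the partial batch once the slot is free.
        if self.executing is None and self.building:
            self._start()

    def _start(self):
        # Hand the built batch over for execution without waiting for it.
        self.executing, self.building = self.building, []

    def complete(self):
        """Called when the executing batch has been flushed."""
        self.committed.extend(self.executing)
        self.executing = None
        self.feed()                   # room again: resume pulling files

    def run(self):
        self.feed()
        while self.executing is not None:
            self.complete()


if __name__ == "__main__":
    p = Pipeline(range(10), batch_size=3)
    p.run()
    print(len(p.committed), p.peak)
```

In this model the number of items held in memory never exceeds twice the batch size, regardless of how many files the source could produce, which is the property that makes a larger batch size affordable.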

The end result is both faster and shows smaller fluctuations in memory usage while tracker-miner-fs-3 is working across a set of files.

Freshly indexing a set of 111735 elements (incl. 18222 folders), with this branch:

--------------------------------------------------------------------------------
Command:            /home/carlos/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n
Massif arguments:   (none)
ms_print arguments: massif.out.1313319
--------------------------------------------------------------------------------


    MB
67.73^                                                                     #  
     |                                                         @:   : :@:::#::
     |                                                    @ @@:@::::::@@:::#::
     |                                             ::: ::@@:@@:@::::::@@:::#::
     |                                     @:  :: ::: :::@@:@@:@::::::@@:::#::
     |                   :     :   @: @::::@::::::::: :::@@:@@:@::::::@@:::#::
     |              :::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |              :::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |            :::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |           ::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |          :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |          :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |  :     @@:::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |  ::::: @ :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     |  ::::::@ :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     | @::::::@ :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     | @::::::@ :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     | @::::::@ :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     | @::::::@ :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
     | @::::::@ :::::::::::::::::::@::@::::@::::::::: :::@@:@@:@::::::@@:::#::
   0 +----------------------------------------------------------------------->Gi
     0                                                                   164.0
$ time ~/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n >/dev/null

real	0m46,519s
user	1m10,400s
sys	0m5,623s

Without this branch:

--------------------------------------------------------------------------------
Command:            /home/carlos/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n
Massif arguments:   (none)
ms_print arguments: massif.out.1312258
--------------------------------------------------------------------------------


    MB
131.7^                         #                                              
     |                         #::::@@:@@::                                   
     |                       ::#:: :@ :@ ::::@@                               
     |                     @@: #:: :@ :@ ::: @ ::::                           
     |                   @@@@: #:: :@ :@ ::: @ ::: ::::                       
     |                  :@ @@: #:: :@ :@ ::: @ ::: :::::::                    
     |                  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::::                
     |               ::::@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :::@            
     |              @:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @:::::       
     |            ::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::     
     |           :::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     |          ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     |        @@::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     |        @ ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     |       @@ ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     |     ::@@ ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     | :  :: @@ ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     | ::::: @@ ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     | :: :: @@ ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
     | :: :: @@ ::::@:  :@ @@: #:: :@ :@ ::: @ ::: :::::: ::: :: @: :: ::::::@
   0 +----------------------------------------------------------------------->Gi
     0                                                                   168.0
$ time ~/Build/gnome/libexec/tracker-miner-fs-3 -s 0 -n >/dev/null

real	1m14,147s
user	1m5,653s
sys	0m6,468s
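
For a quick comparison of the two runs, the figures above can be reduced to a speedup and a peak memory reduction. The snippet below is just a convenience sketch: the `parse_time` helper is a hypothetical name, and the inputs are the `real` times and Massif peaks copied from the two reports above (note the comma decimal separator in the `time` output).

```python
def parse_time(text):
    """Convert a `time` value such as '1m14,147s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


# Values copied from the reports above.
real_with = parse_time("0m46,519s")     # with this branch
real_without = parse_time("1m14,147s")  # without this branch
peak_with_mb = 67.73
peak_without_mb = 131.7

speedup = real_without / real_with
memory_reduction = 1 - peak_with_mb / peak_without_mb

if __name__ == "__main__":
    print(f"wall-clock speedup: {speedup:.2f}x")
    print(f"peak memory reduction: {memory_reduction:.1%}")
```

This compares a single run of each configuration, so it should be read as an indication rather than a benchmark.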

Combined with !371 (merged), this brings peak memory down to ~40MB and the first index down to ~35s when fanotify is enabled.

P.S.: This is the last thing I've got in queue, I promise 😅