High CPU usage of tracker-miner-fs (3.0) in `notify_roots_finished()`
Recently I tried configuring my external HD for indexing, and to my surprise this already works.
After unplugging and replugging the HD I spotted tracker-miner-fs-3
burning a lot of CPU, so I decided to investigate.
The folder enqueued for indexing has 3298 subfolders. t-m-fs seems to be doing this:
- miner_handle_next_item()
- -> item_queue_get_next_file()
- -> -> tracker_file_notifier_is_active() -> returns TRUE (there are ~60 dirs in `priv->pending_index_roots`)
- -> item_queue_get_next_file() returns FALSE
- -> -> tracker_sparql_buffer_flush() -> priv->tasks is empty
- -> -> notify_roots_finished()
The CPU bottleneck appears to be in `notify_roots_finished()`.
(gdb) call g_hash_table_size(fs->priv->roots_to_notify)
$23 = 2245
We iterate through all 2245 items in the hash table, and since `check_queues` is TRUE we call `tracker_priority_queue_find()` for each one.
(gdb) p (fs->priv->items)->queue->length
$27 = 30134
...so the `tracker_priority_queue_find()` call there is likely what's triggering the high CPU usage, iterating a 30k item linked list 2245 times.
The crawl is ongoing, and each call to `file_notifier_directory_finished()` grows the `roots_to_notify` hash table, so this is a rather pathological case.
This is happening with 3.0.1; I will re-test with master.