1. 07 Jul, 2020 4 commits
  2. 06 Jul, 2020 6 commits
  3. 05 Jul, 2020 3 commits
    • libtracker-miner: Gather as many SPARQL updates as possible for every batch · 760cc6b2
      Carlos Garnacho authored
      We currently block the processing queue if the parent is seen in any stage
      of processing. The situation is unblocked by flushing early, so processing
      can resume after the SPARQL updates are performed.
      
      This may lead to suboptimal buffer occupation, ultimately dependent on
      the filesystem layout.
      
      To improve this situation, rely on blank node labels being stable across
      the whole SPARQL update string, and add a blank node labeling scheme that
      lets files within the same SPARQL batch reference each other through these
      blank node labels instead of IRIs.
      
      This allows maximum buffer occupation regardless of the filesystem layout.
      We still have to wait after a SPARQL update if a file being processed
      references (i.e. through a child/parent relationship) another file added
      in the SPARQL update currently in progress, but that happens once per
      batch instead of once per folder.
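
      The labeling idea described above can be sketched roughly as follows. This
      is an illustrative Python sketch, not the code from this commit: the
      helper names (`bnode_label`, `build_batch`), the hash-based label format
      and the `stored_iris` lookup are all assumptions made for the example, the
      ontology terms are used only to make the output readable, and URIs are
      assumed to need no escaping.

```python
import hashlib


def bnode_label(uri):
    # Hypothetical labeling scheme: a deterministic label derived from the URI,
    # so any file in the batch can compute the label of any other file.
    return "_:f" + hashlib.sha1(uri.encode("utf-8")).hexdigest()[:12]


def build_batch(files, stored_iris):
    """Build one SPARQL update string for a batch of files.

    files: list of (uri, parent_uri) pairs, parents listed before children.
    stored_iris: dict mapping URIs already in the store to their IRIs
                 (a stand-in for whatever lookup the real miner performs).
    """
    in_batch = set()
    triples = []
    for uri, parent_uri in files:
        subject = bnode_label(uri)
        triples.append(f'{subject} a nfo:FileDataObject ; nie:url "{uri}" .')
        if parent_uri in in_batch:
            # Parent is part of this same batch: refer to it by its blank
            # node label, so no early flush is needed.
            triples.append(
                f"{subject} nfo:belongsToContainer {bnode_label(parent_uri)} ."
            )
        elif parent_uri in stored_iris:
            # Parent was inserted by an earlier batch: refer to it by IRI.
            triples.append(
                f"{subject} nfo:belongsToContainer <{stored_iris[parent_uri]}> ."
            )
        in_batch.add(uri)
    # All statements go into a single update, where the labels stay stable.
    return "INSERT DATA {\n  " + "\n  ".join(triples) + "\n}"


if __name__ == "__main__":
    batch = [("file:///a", "file:///"), ("file:///a/b.txt", "file:///a")]
    print(build_batch(batch, {"file:///": "urn:example:root"}))
```

      In this sketch a folder and its children can share one batch because the
      child refers to the folder by label. Only a parent from an earlier batch
      needs an IRI, which is why the wait happens once per batch.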
    • tracker-miner-fs: Process only one item at a time · f8dffcc8
      Carlos Garnacho authored
      Although we spawn a number of tasks (10 currently) to process
      files in parallel, disk access is a bottleneck for them. Reduce
      this to 1. It doesn't seem to impact performance that much, and
      it helps avoid processing files together that may not go
      in the same SPARQL update batch.
      
      This is irrelevant now, since the operation would be blocked by
      the parent still being processed, but we want to lighten that
      block in future commits.
    • tracker-miner-fs: Preempt graph creation · 08a5cc5b
      Carlos Garnacho authored
      This is a busy operation that we can squeeze into the initial delay,
      or into the startup phase if there was no delay. In initial indexing,
      graph creation is a significant chunk of indexing time, so it's
      something we can prepare in advance.
      
      The graph creation operation is still not *that* I/O heavy, so
      it's unlikely to influence session startup negatively.
  4. 04 Jul, 2020 13 commits
  5. 03 Jul, 2020 6 commits
  6. 29 Jun, 2020 2 commits
  7. 28 Jun, 2020 2 commits
  8. 27 Jun, 2020 4 commits