1. 26 May, 2022 1 commit
    • tracker-miner-fs: Always delete graph nie:InformationElement on create/update · 96143ae5
      Carlos Garnacho authored
      There are some situations where the file monitors cannot distinguish between
      a file being created where none existed before and a newly created file
      replacing a previously existing one.
      
      Treat all create/update events the same WRT trimming the previously
      existing nie:InformationElements, in order to ensure these updates that
      pass as creates also result in the file being reindexed by the
      metadata extractor. This only applies to files that would have metadata
      extracted.
      
      While at it, simplify the SPARQL and move the code so that it is not
      scattered across the function.
  2. 30 Dec, 2021 2 commits
    • libtracker-miner: Change function arguments · b74127dc
      Carlos Garnacho authored
      We no longer need that many details for
      tracker_miner_fs_get_identifier(), so remove them, and make it
      return a const string.
    • tracker-miner-fs: Raise batch size · 39b01fa3
      Carlos Garnacho authored
      Now that crawling is throttled by the amount of items left to process
      and the main culprits of high memory usage on large filesystems are
      gone, we can raise the batch size a bit. We can definitely afford a
      couple extra megabytes in memory now, so raise the batch size to also
      optimize the throughput.
  3. 29 Dec, 2021 3 commits
  4. 19 Dec, 2021 1 commit
    • Add indexing roots to content specific graphs for availability info · 179bbb80
      Pekka Vuorela authored and Carlos Garnacho committed
      With external storage devices indexed but then unmounted, the graphs
      will list content that is known but currently not available.
      Until now, filtering those out required access to tracker:FileSystem
      to check tracker:available on the data source.
      
      Add the indexing roots to the content specific graphs as well, though
      only for newly indexed ones. External storage devices might be a rare
      enough configuration to handle where it's needed.
      
      Related: tracker#304
  5. 05 Dec, 2021 1 commit
  6. 29 Oct, 2021 1 commit
    • Fix duplicate entries on files created and instantly modified · d1b44392
      Pekka Vuorela authored
      On "touch newfile.jpg; cp oldfile.jpg newfile.jpg" the mime type
      was first detected as text/plain after the touch call and afterwards
      as the proper type. With the file added to two type specific graphs, the
      tracker extractor query for files without extractorHash listed the
      same file twice and the extracted content was also added twice.
      
      A side effect, of course, is that empty files are no longer available
      outside tracker:FileSystem.
      
      Relates to #200
  7. 24 Oct, 2021 1 commit
  8. 23 Oct, 2021 3 commits
  9. 19 Oct, 2021 1 commit
  10. 17 Oct, 2021 1 commit
  11. 21 Sep, 2021 1 commit
  12. 05 Jul, 2021 1 commit
  13. 03 Jul, 2021 1 commit
    • tracker-miner-files: save file creation time · bb96cec3
      Nishit Patel authored
      Add support for storing the creation time in the database,
      as GLib version 2.70 (glib!2017) will provide
      `g_file_info_set_creation_date_time`. Also make `GDateTime`
      the standard way of storing time in tracker-miners instead
      of storing time as a string.
      
      Closes: #158
  14. 08 Jun, 2021 1 commit
  15. 27 Mar, 2021 1 commit
    • tracker-miner-fs: Fall back if no modification date is found · c6beebb1
      Carlos Garnacho authored
      There are paths in GIO (mainly EACCES errors during stat) where we
      get no error returned, but a GFileInfo that does not contain
      all the requested information. Cases that may trigger this are:
      
        - Odd permission patterns. I was able to reproduce with a
          directory from another user with 744 permissions; stat would
          then fail with EACCES on the directory contents.
        - Other kernel reasons to deny access (SELinux, AppArmor, etc).
      
      This used not to be a problem, as only missing modification/access
      times were relevant to us, and we handled them as int64_t, so we
      silently accepted the returned 0 and set those times to the "epoch".
      Now we try to get a GDateTime, and get a NULL pointer instead,
      causing crashes in its manipulation.
      
      As we need files to have a modification time for our comparisons
      during crawling, give these files a fictional date set at the epoch
      again to avoid the crash and make the machinery work with
      these files.
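
The fallback described above can be sketched in plain C. This is only an illustration and not the code from the commit: `FileTimes` and `file_times_get_mtime_or_epoch()` are hypothetical names standing in for a GFileInfo whose modification time may be absent, and the real change works with GDateTime rather than raw integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for file info whose attributes may be missing. */
typedef struct {
    int has_mtime;   /* 0 when the attribute was not filled in (e.g. EACCES) */
    int64_t mtime;   /* seconds since the epoch, valid only if has_mtime != 0 */
} FileTimes;

/* Return the modification time, or the epoch (0) when it is unavailable,
 * so that later comparisons always have a value to work with. */
static int64_t
file_times_get_mtime_or_epoch (const FileTimes *times)
{
    assert (times != NULL);

    if (!times->has_mtime)
        return 0;

    return times->mtime;
}
```

A caller comparing stored and on-disk times can then treat such files like any other, at the cost of them all appearing to date from 1970.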
  16. 22 Mar, 2021 1 commit
  17. 14 Mar, 2021 1 commit
  18. 05 Mar, 2021 1 commit
  19. 05 Jan, 2021 1 commit
    • tracker-miner-fs: Interpret GFileInfo uint64 times as time_t · ebec9310
      Carlos Garnacho authored
      This is a time_t underneath, and forcibly interpreting it as unsigned
      will break with negative times on platforms that internally define that
      type as signed (Linux and the GNU C library among them).
      
      Deal with it as a time_t on our side, and leave the interpretation of
      negative timestamps up to the underlying implementation. This relies
      on undocumented implementation details and is thus a hack.
      
      Fixes: #155
  20. 28 Dec, 2020 1 commit
    • tracker-miner-fs: Use g_file_info_get_modification_date_time() · e04b360e
      Carlos Garnacho authored
      The interpretation of the uint64 G_FILE_ATTRIBUTE_TIME_MODIFIED value
      with mtime < 0 is fairly undefined. We interpret it literally as a
      uint64_t, but in practice the (signed) time_t is simply cast to it
      internally in GIO. This makes negative dates seem far, far in the
      future.
      
      Use the GDateTime helper instead. This will a) leave the mtime
      interpretation (bugs included) up to GLib, and b) implicitly
      keep mtimes within the expected range, given that Tracker and
      GDateTime limits match.
      
      Fixes: #155
  21. 25 Dec, 2020 2 commits
    • tracker-miner-fs: Speed up delete of individual files · f88e926b
      Carlos Garnacho authored
      Querying the graphs that contain some data for the given file may be a bit
      expensive. This is perhaps not very noticeable during individual deletes,
      but it shows when many updates are batched together.
      
      It is noticeably faster to provide all known graphs to the query, and let
      the individual DELETE operations do nothing in those graphs that don't have
      any data for the file.
      
      Fixes: tracker#182
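
As a rough sketch of the approach described above, the following plain C function emits one DELETE per known graph instead of first querying which graphs hold data for the file. It is not the SPARQL used by the commit: `build_delete_for_graphs()`, the graph names passed to it and the `?p ?o` pattern are assumptions made for illustration, and the IRIs are inserted without any escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write one DELETE WHERE operation per graph into buf.
 * Returns the number of bytes written, or -1 if buf is too small.
 * A DELETE WHERE that matches nothing in a graph is simply a no-op. */
static int
build_delete_for_graphs (char *buf, size_t size, const char *file_iri,
                         const char *const *graphs, size_t n_graphs)
{
    size_t used = 0;
    size_t i;

    assert (buf != NULL && size > 0);
    buf[0] = '\0';

    for (i = 0; i < n_graphs; i++) {
        int n = snprintf (buf + used, size - used,
                          "DELETE WHERE { GRAPH <%s> { <%s> ?p ?o } };\n",
                          graphs[i], file_iri);

        /* snprintf reports the length it wanted; treat truncation as failure. */
        if (n < 0 || (size_t) n >= size - used)
            return -1;

        used += (size_t) n;
    }

    return (int) used;
}
```

The trade-off is a slightly longer update string in exchange for skipping a lookup query per file, which matters most when many deletes are batched.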
    • libtracker-miner: Propagate is_dir through TrackerMinerFS::remove-file · 4dd4b6d9
      Carlos Garnacho authored
      And hook it to the delete_children argument when issuing the corresponding
      SPARQL deletes. Before this, we used to delete recursively on everything
      (files and directories), with a performance hit.
  22. 11 Dec, 2020 6 commits
  23. 03 Nov, 2020 1 commit
  24. 02 Nov, 2020 2 commits
  25. 23 Oct, 2020 2 commits
  26. 22 Oct, 2020 2 commits