    libtracker-miner: Gather as many SPARQL updates as possible for every batch · 760cc6b2
    Carlos Garnacho authored
    We currently block the processing queue if the parent of a file is seen in
    any stage of processing. The situation is unblocked by flushing early, so
    that processing can resume once the SPARQL updates have been performed.
    
    This may lead to suboptimal buffer occupation, ultimately dependent on
    the filesystem layout.
    
    To improve this situation, rely on blank node labels being stable across
    the whole SPARQL update string, and add a blank node labeling scheme that
    allows files within the same SPARQL batch to reference each other through these
    blank node labels instead of IRIs.
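
The labeling idea above can be sketched roughly as follows. This is an illustration only, not the libtracker-miner code: the function name `build_batch_update`, the `_:fileN` label format, and the choice of `nfo:FileDataObject`, `nie:url` and `nfo:belongsToContainer` are assumptions made for the example, and the `nfo:`/`nie:` prefixes are assumed to be known to the store. Everything goes into a single `INSERT DATA` block so that the labels stay within one operation.

```python
def build_batch_update(files, known_iris):
    """Build one SPARQL update string for a batch of files.

    files: list of (url, parent_url or None) tuples, all part of this batch.
    known_iris: dict mapping url -> IRI for files already in the store.
    """
    # Hypothetical labeling scheme: one blank node label per file in the batch.
    labels = {url: f"_:file{i}" for i, (url, _parent) in enumerate(files)}

    triples = []
    for url, parent in files:
        subject = labels[url]
        triples.append(f'{subject} a nfo:FileDataObject ; nie:url "{url}" .')
        if parent is None:
            continue
        if parent in labels:
            # Parent is inserted in this same batch: refer to it by label.
            target = labels[parent]
        elif parent in known_iris:
            # Parent was stored by an earlier update: refer to it by IRI.
            target = f"<{known_iris[parent]}>"
        else:
            raise KeyError(f"parent {parent!r} is neither batched nor stored")
        triples.append(f"{subject} nfo:belongsToContainer {target} .")

    body = "\n".join("  " + t for t in triples)
    return "INSERT DATA {\n" + body + "\n}"
```

With this shape, a folder and its contents can sit in the same batch regardless of how deep the tree is: calling `build_batch_update([("file:///a", None), ("file:///a/b", "file:///a")], {})` links the second file to the first through `_:file0` without needing the IRI that the store will assign.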
    
    This allows maximum buffer occupation regardless of the filesystem layout.
    We still have to wait after a SPARQL update if a file being processed
    references (i.e. through a child/parent relationship) another file added
    in the SPARQL update currently in progress, but that happens once per
    batch instead of once per folder.