
Dirdiff mmap - improve dirdiff performance using mmap

Hugo Sena Ribeiro requested to merge hugosenari/meld:dirdiff-mmap into master

Changes:

  • Use mmap for files larger than CHUNK_SIZE (currently set to 4096 bytes);
  • When "ignore blank lines" is enabled, retain the content so it can be compared again after blank lines are removed;
  • Removing blank lines now also ignores lines that contain only spaces;
  • all_same accepts an iterator and runs faster;
  • Convert large data to generators and small data to lists;
  • Use a regex to normalize line endings;
  • Apply regex.sub instead of apply_text_filters;
  • Add tests for _files_same, all_same and remove_blank_lines.
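The changes above could be sketched roughly as follows. This is an illustration of the technique only, not the code in this merge request: the function bodies, the two regex patterns, and the early size check are assumptions, and only the names CHUNK_SIZE, all_same and remove_blank_lines come from the description.

```python
import mmap
import os
import re

CHUNK_SIZE = 4096  # threshold from the description, assumed to be in bytes

# Hypothetical patterns, not taken from the merge request.
LINE_ENDINGS = re.compile(rb"\r\n?")
BLANK_LINES = re.compile(rb"^[ \t]*\n", re.MULTILINE)


def all_same(items):
    """Return True if every item of an iterable equals the first one."""
    iterator = iter(items)
    try:
        first = next(iterator)
    except StopIteration:
        return True  # an empty iterable has no differing items
    return all(item == first for item in iterator)


def remove_blank_lines(data):
    """Normalize line endings, then drop empty and whitespace-only lines."""
    return BLANK_LINES.sub(b"", LINE_ENDINGS.sub(b"\n", data))


def files_same(path_a, path_b):
    """Compare two files byte for byte, using mmap above CHUNK_SIZE."""
    size_a = os.path.getsize(path_a)
    size_b = os.path.getsize(path_b)
    if size_a != size_b:
        return False  # assumed shortcut: different sizes cannot be equal
    if size_a == 0:
        return True  # both empty, nothing to read (mmap rejects empty files)
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        if size_a <= CHUNK_SIZE:
            return file_a.read() == file_b.read()
        with mmap.mmap(file_a.fileno(), 0, access=mmap.ACCESS_READ) as map_a, \
                mmap.mmap(file_b.fileno(), 0, access=mmap.ACCESS_READ) as map_b:
            # The generator stops at the first differing chunk.
            chunks = (
                map_a[i:i + CHUNK_SIZE] == map_b[i:i + CHUNK_SIZE]
                for i in range(0, size_a, CHUNK_SIZE)
            )
            return all(chunks)
```

Passing a generator to all() means a difference in the first chunk ends the comparison without touching the rest of either file, while equal files are still read to the end.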

Tests executed:

  • empty files (fastest equal, won't read files)
  • 1 B vs 1 B file (fast equal, read both until end)
  • 4 MB vs 4 MB file (slow equal, read both until end)
  • empty vs 1 B file (fast different, first chunk diff)
  • 1 B vs 4 MB file (fast different, first chunk diff)
  • 4 MB vs 4 MB file (slow different, read both until end)

cProfile Results:

  • master branch: 34291 function calls in 0.337 seconds
  • this branch: 1115 function calls in 0.069 seconds
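For anyone who wants to collect comparable numbers, a minimal sketch of profiling a single call with the standard cProfile and pstats modules is shown below. The helper name profile_call is hypothetical, and the description does not say how the figures above were gathered.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # The report header has the form "N function calls in X seconds".
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()
```

Running the same call on both branches and comparing the header line of each report gives call counts and timings in the same form as the results listed above.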
Edited by Hugo Sena Ribeiro
