Improve performance of database updates
This branch contains many optimizations to the update machinery, ranging from micro-optimizations such as avoiding frequent memory allocations to moderate refactors such as improved buffering of changes and caching/lookup of prepared statements for inserts and updates. No stone has been left unturned, with the goal that the bulk of the CPU time during updates is spent in SQLite itself.
To test this, a few additional cases have been added to the benchmark utility, covering other common scenarios (resources being updated, and deleted). Three runs on a master-ish branch produce the following output:
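To illustrate the two refactors named above (buffering of changes and reuse of prepared statements), here is a minimal sketch in Python. It is not the code in this branch: `BufferedWriter`, its methods and the `resource` table are hypothetical names made up for the example, and Python's `sqlite3` module is used only for brevity. The sketch relies on the fact that `sqlite3` keeps an internal cache of compiled statements keyed by SQL text, so reusing the same SQL string avoids recompiling it.

```python
import sqlite3
from collections import defaultdict


class BufferedWriter:
    """Hypothetical helper: buffers inserts and flushes them in batches."""

    def __init__(self, conn, batch_size=5000):
        self.conn = conn
        self.batch_size = batch_size
        self._sql_cache = {}                # (table, columns) -> SQL text
        self._pending = defaultdict(list)   # SQL text -> buffered parameter rows
        self._count = 0

    def _insert_sql(self, table, columns):
        # Build the SQL text once per (table, columns) and reuse it afterwards.
        # Table and column names are assumed to be trusted identifiers.
        key = (table, columns)
        sql = self._sql_cache.get(key)
        if sql is None:
            cols = ", ".join(columns)
            marks = ", ".join("?" for _ in columns)
            sql = f"INSERT INTO {table} ({cols}) VALUES ({marks})"
            self._sql_cache[key] = sql
        return sql

    def insert(self, table, columns, values):
        sql = self._insert_sql(table, tuple(columns))
        self._pending[sql].append(tuple(values))
        self._count += 1
        if self._count >= self.batch_size:
            self.flush()

    def flush(self):
        # One transaction per flush; each distinct statement runs over all its rows.
        with self.conn:
            for sql, rows in self._pending.items():
                self.conn.executemany(sql, rows)
        self._pending.clear()
        self._count = 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource (id INTEGER PRIMARY KEY, uri TEXT)")

writer = BufferedWriter(conn, batch_size=2)
for i, uri in enumerate(["urn:a", "urn:b", "urn:c"], start=1):
    writer.insert("resource", ("id", "uri"), (i, uri))

rows_before_flush = conn.execute("SELECT COUNT(*) FROM resource").fetchone()[0]
writer.flush()
rows_after_flush = conn.execute("SELECT COUNT(*) FROM resource").fetchone()[0]
```

With a batch size of 2, the first two rows are written when the buffer fills and the third one only on the explicit `flush()`. The design point is that the per-row cost shrinks to appending a tuple to a list, while statement compilation and transaction overhead are paid once per batch.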
[carlos@gotera build]$ ./utils/benchmark/tracker-benchmark
Batch size: 5000, Individual test duration: 30 sec
Opening in-memory database…
Test                             Elements     Elems/sec   Min          Max           Avg
Resource batch update (sync)     1744492,876  58149,763   15,564 usec  22,935 usec   17,197 usec
SPARQL batch update (sync)       829438,000   27647,933   34,482 usec  49,075 usec   36,169 usec
Resource modification (sync)     1168280,369  38942,679   23,892 usec  30,113 usec   25,679 usec
Resource insert+delete (sync)    483065,452   16102,182   59,739 usec  68,100 usec   62,103 usec
Prepared statement query (sync)  3188442,681  106281,423  8,000 usec   920,000 usec  9,409 usec
SPARQL query (sync)              467243,361   15574,779   60,000 usec  1,201 msec    64,206 usec
[carlos@gotera build]$ ./utils/benchmark/tracker-benchmark
Batch size: 5000, Individual test duration: 30 sec
Opening in-memory database…
Test                             Elements     Elems/sec   Min          Max           Avg
Resource batch update (sync)     1650610,257  55020,342   16,438 usec  45,125 usec   18,175 usec
SPARQL batch update (sync)       824167,207   27472,240   34,915 usec  51,269 usec   36,400 usec
Resource modification (sync)     1158085,993  38602,866   24,254 usec  33,005 usec   25,905 usec
Resource insert+delete (sync)    476794,732   15893,158   60,505 usec  70,450 usec   62,920 usec
Prepared statement query (sync)  3299845,780  109994,859  8,000 usec   1,067 msec    9,091 usec
SPARQL query (sync)              466000,270   15533,342   61,000 usec  827,000 usec  64,378 usec
[carlos@gotera build]$ ./utils/benchmark/tracker-benchmark
Batch size: 5000, Individual test duration: 30 sec
Opening in-memory database…
Test                             Elements     Elems/sec   Min          Max           Avg
Resource batch update (sync)     1752330,733  58411,024   15,531 usec  26,183 usec   17,120 usec
SPARQL batch update (sync)       807359,799   26911,993   35,254 usec  50,398 usec   37,158 usec
Resource modification (sync)     1175350,002  39178,333   24,298 usec  30,534 usec   25,524 usec
Resource insert+delete (sync)    474601,050   15820,035   59,429 usec  90,752 usec   63,211 usec
Prepared statement query (sync)  3046379,695  101545,990  8,000 usec   1,502 msec    9,848 usec
SPARQL query (sync)              468526,672   15617,556   60,000 usec  805,000 usec  64,031 usec
And on top of this branch:
[carlos@gotera build]$ ./utils/benchmark/tracker-benchmark
Batch size: 5000, Individual test duration: 30 sec
Opening in-memory database…
Test                             Elements     Elems/sec   Min          Max           Avg
Resource batch update (sync)     2786198,140  92873,271   9,785 usec   14,579 usec   10,767 usec
SPARQL batch update (sync)       1016639,532  33887,984   28,263 usec  45,691 usec   29,509 usec
Resource modification (sync)     1946191,368  64873,046   14,564 usec  23,329 usec   15,415 usec
Resource insert+delete (sync)    1239900,825  41330,028   23,224 usec  28,889 usec   24,195 usec
Prepared statement query (sync)  3239441,244  107981,375  8,000 usec   970,000 usec  9,261 usec
SPARQL query (sync)              473613,353   15787,112   59,000 usec  2,615 msec    63,343 usec
[carlos@gotera build]$ ./utils/benchmark/tracker-benchmark
Batch size: 5000, Individual test duration: 30 sec
Opening in-memory database…
Test                             Elements     Elems/sec   Min          Max           Avg
Resource batch update (sync)     2808449,642  93614,988   9,463 usec   17,203 usec   10,682 usec
SPARQL batch update (sync)       1004343,862  33478,129   28,553 usec  51,018 usec   29,870 usec
Resource modification (sync)     1953141,846  65104,728   14,483 usec  23,614 usec   15,360 usec
Resource insert+delete (sync)    1217449,281  40581,643   23,412 usec  29,943 usec   24,642 usec
Prepared statement query (sync)  3231219,354  107707,312  8,000 usec   1,017 msec    9,284 usec
SPARQL query (sync)              469455,609   15648,520   60,000 usec  1,831 msec    63,904 usec
[carlos@gotera build]$ ./utils/benchmark/tracker-benchmark
Batch size: 5000, Individual test duration: 30 sec
Opening in-memory database…
Test                             Elements     Elems/sec   Min          Max           Avg
Resource batch update (sync)     2768928,978  92297,633   9,402 usec   14,492 usec   10,835 usec
SPARQL batch update (sync)       1006464,592  33548,820   28,591 usec  40,461 usec   29,807 usec
Resource modification (sync)     1934488,844  64482,961   14,750 usec  22,681 usec   15,508 usec
Resource insert+delete (sync)    1231734,098  41057,803   23,346 usec  28,333 usec   24,356 usec
Prepared statement query (sync)  3197596,147  106586,538  8,000 usec   990,000 usec  9,382 usec
SPARQL query (sync)              470088,702   15669,623   60,000 usec  1,199 msec    63,818 usec
SELECT queries stay the same (minus noise), while all update operations benefit from this branch: averaged over the three runs, throughput gains range from roughly 1.2x for SPARQL batch updates to more than 2.5x for resource insert+delete. This brings us much closer to the same ballpark as raw SQLite, especially with APIs that update the database from TrackerResource.
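For reference, the per-test speedup can be derived from the Elems/sec column of the six runs above. The sketch below averages the three runs on each side and divides branch by master. The figures are copied from the benchmark output with decimal commas converted to decimal points; the variable names are chosen just for this example.

```python
from statistics import mean

# Elems/sec for the three runs on the master-ish branch.
master = {
    "Resource batch update": [58149.763, 55020.342, 58411.024],
    "SPARQL batch update": [27647.933, 27472.240, 26911.993],
    "Resource modification": [38942.679, 38602.866, 39178.333],
    "Resource insert+delete": [16102.182, 15893.158, 15820.035],
    "Prepared statement query": [106281.423, 109994.859, 101545.990],
    "SPARQL query": [15574.779, 15533.342, 15617.556],
}

# Elems/sec for the three runs on top of this branch.
branch = {
    "Resource batch update": [92873.271, 93614.988, 92297.633],
    "SPARQL batch update": [33887.984, 33478.129, 33548.820],
    "Resource modification": [64873.046, 65104.728, 64482.961],
    "Resource insert+delete": [41330.028, 40581.643, 41057.803],
    "Prepared statement query": [107981.375, 107707.312, 106586.538],
    "SPARQL query": [15787.112, 15648.520, 15669.623],
}

# Ratio of mean throughput: values above 1.0 mean the branch is faster.
speedup = {name: mean(branch[name]) / mean(master[name]) for name in master}

for name, ratio in speedup.items():
    print(f"{name:<26} {ratio:.2f}x")
```

Averaging before dividing keeps a single noisy run from dominating the ratio; with only three samples per side, the result is still a rough indication rather than a statistically robust figure.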