GSequence pessimizes itself and slows down
Background: A couple weeks ago I worked on code that resized GtkListStore
to add a column to the model. This involves creating a bigger model, copying data and releasing old model. I then spent a whole day trying to understand why some innocent changes in the code will sometimes dramatically increase timings in subsequent resizes.
Run the snippet below and you will observe that every next iteration is slower.
This happens because GLib implements GSequence
as a treap, but does it incorrectly. A correct treap should have random "priorities" saved on every node, but GLib doesn't save anything and merely derives the priority from node pointer.
I understand (in theory) what happens next as: when GSequence
is freed, the nodes are released to memory allocator in anti-optimal order. When GSequence
is allocated again, the nodes are inserted in worse and worse order on every next iteration. Eventually it saturates on something very close to worst possible case.
Example output from the snippet:
0000 Populate_GSequence=0.48 height=55
0001 Populate_GSequence=0.90 height=88
0002 Populate_GSequence=1.45 height=122
0003 Populate_GSequence=1.73 height=140
0004 Populate_GSequence=2.88 height=180
<...>
0164 Populate_GSequence=75.52 height=2707
0165 Populate_GSequence=74.43 height=2690
0166 Populate_GSequence=75.18 height=2734
0167 Populate_GSequence=76.77 height=2748
0168 Populate_GSequence=77.06 height=2763
There, the first number is the iteration number, second is time in seconds, and third is the height of produced tree. For 1'000'000
items inserted in snippet, the expected height for properly balanced tree is ceil(log2(1'000'000+1))
= 20, but it's easy to see how the tree becomes more and more like a list.
For 1024 items, the output is:
0000 Populate_GSequence=0.00 maxHeight=15
0001 Populate_GSequence=0.00 maxHeight=34
0002 Populate_GSequence=0.00 maxHeight=45
0003 Populate_GSequence=0.00 maxHeight=54
0004 Populate_GSequence=0.00 maxHeight=62
<...>
0401 Populate_GSequence=0.02 maxHeight=954
0402 Populate_GSequence=0.02 maxHeight=955
0403 Populate_GSequence=0.02 maxHeight=956
0404 Populate_GSequence=0.02 maxHeight=956
0405 Populate_GSequence=0.02 maxHeight=956
As it can be seen, it saturates on a case which is very close to worst case of list-like tree (height 956 with 1024 items).
The suggested solution is to "follow the book" and actually create and store random priorities.
Here's the short testing snippet, a more advanced one is in the comments:
#include <glib.h>
#include <chrono>
#include <stdio.h>
using namespace std::chrono;
time_point<system_clock> getTime()
{
return system_clock::now();
}
int main (int argc, char **argv)
{
// Reproducible with low item counts as well. However, with low item
// counts it's harder to see the difference in timings.
const int NUM_ITEMS = 1'000'000;
for (int j = 0; j < 1024; j++)
{
GSequence* sequence = g_sequence_new(NULL);
auto timeBegin = getTime();
{
for (int i = 0; i < NUM_ITEMS; i++)
{
g_sequence_insert_before(g_sequence_get_end_iter(sequence), NULL);
}
}
duration<double> timePopulate = getTime() - timeBegin;
printf(
"%04d Populate_GSequence=%.2f\n",
j,
timePopulate.count()
);
g_sequence_free(sequence);
}
return 0;
}