glib/test/thread-pool-slow: Ensure all unused threads are really stopped

In this test we wanted to ensure that all the unused threads were stopped. However, while we were calling g_thread_pool_stop_unused_threads(), some threads could still be in the process of being recycled even though the pool's num_threads value was 0.

In fact, stopping unused threads also implies resetting the max unused threads to its previous value. In this test that made it go from -1 to 0 and back to -1 after killing the unused threads we knew about; thus any about-to-be-unused thread that is not killed during this call is just left around as a waiting unused thread afterwards.

So, if this function was called while a thread was between calling the user function and the moment it was recycled (and so when the pool's num_threads was updated), but that thread was not yet counted in unused_threads, we ended up with a race: all the threads were consumed from our POV, but some were actually not yet unused, and so were kept waiting forever for a new job.

To avoid this in the test, we can keep stopping the unused threads until their number is really 0.

Sadly we need to repeat this, as we don't have a clear point at which we can be sure that our threads are done, and it would be wrong to stop a thread that is technically not yet marked as unused.
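
To make the race and the retry easier to follow, here is a minimal, single-threaded sketch of the bookkeeping described above. It is a toy model, not GLib code: `ToyPool` and all the `toy_*` functions are hypothetical names made up for this illustration, and `toy_finish_recycling()` simply stands in for "some time passes and the in-flight threads get recycled".

```c
#include <assert.h>
#include <stddef.h>

/* Toy, single-threaded model of the pool bookkeeping. Not GLib code:
 * every name here is hypothetical and exists only for illustration. */
typedef struct {
  int in_transit;      /* done with the user function, not yet counted as unused */
  int unused_threads;  /* parked threads, waiting for a new job */
  int max_unused;      /* -1 means "no limit", as in the test */
} ToyPool;

/* Models the stop call as described above: set the limit to 0, kill the
 * unused threads known right now, then restore the previous limit.
 * Returns the number of threads killed by this call. */
static int
toy_stop_unused_threads (ToyPool *pool)
{
  assert (pool != NULL);
  int old_max = pool->max_unused;
  int killed = pool->unused_threads;

  pool->max_unused = 0;
  pool->unused_threads = 0;
  pool->max_unused = old_max;
  return killed;
}

/* Models time passing: in-transit threads finish being recycled. With the
 * limit back at -1 (or not yet reached) they park instead of exiting. */
static void
toy_finish_recycling (ToyPool *pool)
{
  assert (pool != NULL);
  while (pool->in_transit > 0)
    {
      pool->in_transit--;
      if (pool->max_unused < 0 || pool->unused_threads < pool->max_unused)
        pool->unused_threads++;
    }
}

/* The workaround: keep stopping until no unused thread is left.
 * Returns how many stop calls were needed. */
static int
toy_stop_until_none_left (ToyPool *pool)
{
  int calls = 0;

  do
    {
      toy_stop_unused_threads (pool);
      toy_finish_recycling (pool);   /* stands in for a short wait */
      calls++;
    }
  while (pool->unused_threads != 0);

  return calls;
}
```

Starting from `{ .in_transit = 1, .unused_threads = 2, .max_unused = -1 }`, a single stop call kills the two known threads, but the in-transit one then parks itself and stays around; the retrying variant needs a second call to get the count down to 0.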

We could also do this in g_thread_pool_stop_unused_threads() itself, but that would make the function wait for threads to complete, which is probably not what the initial API intended.

Fixes: #2685

FYI: I was able to reproduce this randomly with: meson test -C ../_BUILD/glib thread-pool-slow --repeat=500 --num-process=100 --timeout 0.25 --print-errorlogs

Edited by Marco Trevisan
