Skip to content

gdbusconnection: Fix double unref on timeout/cancel sending a message

Philip Withnall requested to merge pwithnall/glib:1264-gdbus-double-unref into main

This appears to fix an intermittent failure seen when sending a D-Bus message with either of a cancellable or a timeout set.

In particular, I can reliably reproduce it with:

meson test gdbus-test-codegen-min-required-2-64 --repeat 10000

It can be caught easily with asan when reproduced. Tracking down the location of the refcount mismatch was a little tricky, but was simplified by replacing a load of g_object_ref (message) calls with g_dbus_message_copy (message, NULL) to switch GDBusMessage handling to using copy semantics. This allowed asan to home in on where the refcount mismatch was happening.

The problem was that send_message_data_deliver_error() takes ownership of the GTask passed to it, but the send_message_with_replace_cancelled_idle_cb() and send_message_with_reply_timeout_cb() functions which were calling it, were not passing in a strong reference as they should have.

Another approach to fixing this would have been to change the transfer semantics of send_message_data_deliver_error() so it was (transfer none) on its GTask. That would probably have resulted in cleaner code, but would have been a lot harder to verify/review the fix, and easier to inadvertently introduce new bugs. Edit: I found more bugs in this area, so eventually did change the transfer semantics of send_message_data_deliver_error(), in commit f01adb3d.

The fact that the bug was only triggered by the cancellation and timeout callbacks explains why it was intermittent: these code paths are typically never hit, but the timeout path may sometimes be hit on a very slow test run.

Signed-off-by: Philip Withnall

Fixes: #1264 (closed)

Closes #1264 (closed)

Edited by Philip Withnall

Merge request reports