
Draft: [th/gobject-no-object-locks-1] gobject: replace object locks by GData lock (part2)

Thomas Haller requested to merge th/gobject-no-object-locks-1 into main

STATUS: waiting on !3858 (closed), because this branch shares patches with it.


Originally, we had various global locks in GObject.

Those got replaced in !3774 (merged), by per-object locking. GObject now uses a per-object bitlock in optional_flags and the object_bit_lock() functions.

Note that g_weak_ref_set() previously had a global GRWLock. That lock was replaced by a per-object lock in !3834 (merged).

At various places we can do even better than object_bit_lock(). The places that take that lock also store their data in the GObject's GData, and every GData access already takes a per-GData bitlock. We can run the whole critical section under that lock instead. For that, use the internal g_datalist_id_update_atomic() function, which takes a callback that can perform non-trivial, arbitrary operations while the lock is held. This way we can drop object_bit_lock() completely and use only the GData lock, which we need anyway to access the data.
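To illustrate the pattern (this is not the GLib implementation), here is a minimal self-contained sketch. All `Demo*`/`demo_*` names are hypothetical, a `pthread_mutex_t` stands in for the GData bitlock, and the callback signature is an assumption that does not match the real internal g_datalist_id_update_atomic(). The point is only that lookup, search and reallocation all happen inside one callback under one lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one GData slot: a pointer guarded by one lock. */
typedef struct {
    pthread_mutex_t lock; /* stands in for the per-GData bitlock */
    void *data;
} DemoDataList;

/* Hypothetical callback type: may read, replace or clear *data. */
typedef void *(*DemoUpdateFunc)(void **data, void *user_data);

/* Run the callback while holding the lock, so the whole update is atomic. */
static void *demo_update_atomic(DemoDataList *dl, DemoUpdateFunc cb, void *user_data)
{
    void *result;

    assert(dl != NULL && cb != NULL);
    pthread_mutex_lock(&dl->lock);
    result = cb(&dl->data, user_data);
    pthread_mutex_unlock(&dl->lock);
    return result;
}

/* A small list of tracked pointers, loosely modelled on toggle references. */
typedef struct {
    size_t n;
    void *items[];
} DemoRefs;

/* Append user_data; the (re)allocation happens under the lock. */
static void *demo_refs_add_cb(void **data, void *user_data)
{
    DemoRefs *refs = *data;
    size_t n = refs ? refs->n : 0;
    DemoRefs *grown = realloc(refs, sizeof(DemoRefs) + (n + 1) * sizeof(void *));

    if (grown == NULL)
        return NULL; /* the old list is left untouched */
    grown->items[n] = user_data;
    grown->n = n + 1;
    *data = grown;
    return grown;
}

/* Linear search and removal; frees the list when it becomes empty. */
static void *demo_refs_remove_cb(void **data, void *user_data)
{
    DemoRefs *refs = *data;

    if (refs == NULL)
        return NULL;
    for (size_t i = 0; i < refs->n; i++) {
        if (refs->items[i] == user_data) {
            refs->items[i] = refs->items[refs->n - 1]; /* swap in the last entry */
            refs->n--;
            if (refs->n == 0) {
                free(refs);
                *data = NULL;
            }
            return user_data;
        }
    }
    return NULL; /* not found */
}
```

A caller would initialize `DemoDataList dl = { PTHREAD_MUTEX_INITIALIZER, NULL };` and then call `demo_update_atomic(&dl, demo_refs_add_cb, ptr)`. No second lock is needed, because the only lock taken is the one that guards the data itself.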

This branch is on top of branch th/gobject-notify-queue (!3858 (closed)), which implements part1. This is part2.

This also fixes the issues #743 (closed) and #599 (closed), which have some overlapping intent (but different approaches that were never completed).


I think this is great, because in the single-threaded case (or when distinct objects are used on multiple threads) this is clearly faster: it achieves the same with fewer operations (no object_bit_lock(), fewer GData accesses).

Note that we may now perform more work while holding a GData lock. For example, toggle_refs_unref_cb() now also performs a linear search and a malloc()/realloc()/free() operation under the lock. This means that if you use one object from multiple threads, there might previously have been access patterns that achieved higher parallelism and that now contend on the GData lock.

However:

  • most objects are used by one thread at a time, so the approach benefits those.
  • the "more work while holding a GData lock" is very moderate.
    • Note that the linear search that several of these functions perform only works if the number of tracked things is small. E.g. if you have hundreds of toggle-references, closures or g_object_weak_ref() registrations, then performance breaks down for other reasons too. If you have few entries, it works well and is fast.
    • there is at worst only one additional malloc()/realloc()/free() while holding the GData lock. Note that g_datalist_id_set_data() already sometimes allocates memory while holding the lock. So this only makes it worse by a constant factor. Maybe we can improve that a bit (for example, the buffer growth strategy in toggle_refs_ref_cb() seems bad).
  • I think you'd be hard pressed to find a realistic benchmark that shows that doing more work (taking more locks) ends up being faster. Maybe it's possible. But if you hammer an object from multiple threads so hard that you can measure this, you probably need to rethink your approach.
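On the buffer growth remark above: one common way to reduce allocations under a lock is geometric growth, so that most appends do not call realloc() at all. The sketch below is only an illustration of that general technique. The `DemoPtrArray` type, the initial capacity of 4 and the doubling factor are assumptions, and it is not taken from toggle_refs_ref_cb().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable pointer array with separate length and capacity. */
typedef struct {
    size_t len;
    size_t cap;
    void **items;
} DemoPtrArray;

/* Append item, doubling the capacity only when the buffer is full.
 * Returns 1 on success, 0 if the allocation failed (array unchanged). */
static int demo_ptr_array_append(DemoPtrArray *arr, void *item)
{
    if (arr->len == arr->cap) {
        size_t new_cap = arr->cap ? arr->cap * 2 : 4;
        void **grown = realloc(arr->items, new_cap * sizeof(void *));

        if (grown == NULL)
            return 0;
        arr->items = grown;
        arr->cap = new_cap;
    }
    arr->items[arr->len++] = item;
    return 1;
}
```

With this scheme, appending n entries costs O(log n) reallocations instead of n, at the price of some unused capacity per object. Whether that memory trade-off is acceptable for every GObject is a separate question.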
Edited by Thomas Haller
