
gthread: Move thread _impl functions to static inlines for speed

Philip Withnall requested to merge pwithnall/glib:3417-mutex-speedup into main

The changes made in commit bc59e28b (issue #3399 (closed)) fixed introspection of the GThread API. However, they introduced a trampoline in every threading function. So with those changes applied, the disassembly of g_mutex_lock() (for example) was:

0x7ffff7f038b0 <g_mutex_lock>    jmp 0x7ffff7f2f440 <g_mutex_lock_impl>
0x7ffff7f038b5                   data16 cs nopw 0x0(%rax,%rax,1)

i.e. it jumps straight to the _impl function, even in an optimised build. Since g_mutex_lock() (and various other GThread functions) are on frequently run hot paths, this additional jmp to a function which has ended up in a different code page is a slowdown which we’d rather avoid.

So, this commit reworks things to define all the _impl functions as G_ALWAYS_INLINE static inline (which typically expands to __attribute__((__always_inline__)) static inline), and to move them into the same compilation unit as gthread.c so that they can be inlined without the need for link-time optimisation to be enabled.
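As a rough illustration of the pattern described above (not the actual GLib code), here is a minimal sketch in C11. All of the `demo_`/`DEMO_` names are hypothetical, `DEMO_ALWAYS_INLINE` is only a stand-in for G_ALWAYS_INLINE, and the spinning slow path is a placeholder assumption; it is not how GLib waits on a contended mutex.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for G_ALWAYS_INLINE (GCC/Clang spelling assumed). */
#if defined(__GNUC__)
#define DEMO_ALWAYS_INLINE __attribute__((__always_inline__))
#else
#define DEMO_ALWAYS_INLINE
#endif

/* Hypothetical mutex: state is 0 when unlocked, 1 when locked. */
typedef struct { atomic_int state; } DemoMutex;

static void
demo_mutex_init (DemoMutex *m)
{
  atomic_init (&m->state, 0);
}

/* Out-of-line slow path. Placeholder: it just spins until the lock is free. */
static void
demo_mutex_lock_slowpath (DemoMutex *m)
{
  int expected = 0;

  while (!atomic_compare_exchange_weak (&m->state, &expected, 1))
    expected = 0;  /* a failed exchange overwrites expected, so reset it */
}

/* Platform-specific implementation, defined in the same compilation unit
 * as its caller so it can be inlined without link-time optimisation. */
DEMO_ALWAYS_INLINE static inline void
demo_mutex_lock_impl (DemoMutex *m)
{
  int expected = 0;

  /* Fast path: a single compare-and-swap from 0 to 1. */
  if (!atomic_compare_exchange_strong (&m->state, &expected, 1))
    demo_mutex_lock_slowpath (m);
}

DEMO_ALWAYS_INLINE static inline void
demo_mutex_unlock_impl (DemoMutex *m)
{
  int prev = atomic_exchange (&m->state, 0);

  assert (prev == 1);  /* unlocking a mutex which is not locked is a bug */
  (void) prev;
}

/* Public entry points: thin wrappers whose bodies become the inlined
 * _impl code, rather than a jmp to a separate function. */
void demo_mutex_lock (DemoMutex *m);
void demo_mutex_unlock (DemoMutex *m);

void
demo_mutex_lock (DemoMutex *m)
{
  demo_mutex_lock_impl (m);
}

void
demo_mutex_unlock (DemoMutex *m)
{
  demo_mutex_unlock_impl (m);
}
```

With this layout the compiler sees the body of `demo_mutex_lock_impl()` while compiling `demo_mutex_lock()`, so only the slow path remains as a separate call.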

It makes the code a little less readable, but not much worse than what commit bc59e28b already did. And perhaps the addition of the inline decorations to all the _impl functions will make it a bit clearer what their intended purpose is (platform-specific implementations).

After applying this commit, the disassembly of g_mutex_lock() shows the _impl code inlined for me:

=> 0x00007ffff7f03d80 <+0>:	xor    %eax,%eax
   0x00007ffff7f03d82 <+2>:	mov    $0x1,%edx
   0x00007ffff7f03d87 <+7>:	lock cmpxchg %edx,(%rdi)
   0x00007ffff7f03d8b <+11>:	jne    0x7ffff7f03d8e <g_mutex_lock+14>
   0x00007ffff7f03d8d <+13>:	ret
   0x00007ffff7f03d8e <+14>:	jmp    0x7ffff7f03610 <g_mutex_lock_slowpath>

I considered making a similar change to the other APIs touched in #3399 (closed) (GContentType, GAppInfo, GSpawn), but they are all much less performance critical, so it’s probably not worth making their code more complex for the sake of it.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>

Fixes: #3417 (closed)

Closes #3417 (closed)
