Racy NULL-ptr segfault in vte::terminal::update_repeat_timeout()
Hi!
I'm a co-maintainer of Guake; we hit a crash in libvte 0.60.3.
The backtrace looks like this:
#0 0x00007fffef34cf5c in vte::terminal::Terminal::emit_adjustment_changed() () at /usr/lib/libvte-2.91.so.0
#1 0x00007fffef35d7c0 in vte::terminal::Terminal::process(bool) () at /usr/lib/libvte-2.91.so.0
#2 0x00007fffef35dae1 in vte::terminal::update_repeat_timeout(void*) () at /usr/lib/libvte-2.91.so.0
#3 0x00007ffff6d10764 in () at /usr/lib/libglib-2.0.so.0
#4 0x00007ffff6d10340 in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0
#5 0x00007ffff6d5e1d9 in () at /usr/lib/libglib-2.0.so.0
#6 0x00007ffff6d0ec03 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#7 0x00007ffff565608f in gtk_main () at /usr/lib/libgtk-3.so.0
[...]
I was able to confirm that vte::terminal::update_repeat_timeout(void*) is called with NULL as its argument. This line then runs with that == NULL:
https://gitlab.gnome.org/GNOME/vte/-/blob/vte-0-60/src/vte.cc#L10426
Here's the Guake-side issue, as originally reported: https://github.com/Guake/guake/issues/1749
It may show a bit of context necessary to reproduce this:
- use spawn_sync 15-20 times in quick succession;
- invoke gtk_main(), get SIGSEGV.
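For illustration, the sequence above corresponds roughly to the PyGObject sketch below. It is not Guake's actual code and it is not known to trigger the crash on its own. The file name, the notebook layout, the spawn_shell helper and the TAB_COUNT argument are all assumptions made for the example.

```python
#!/usr/bin/env python3
"""Sketch: call spawn_sync on many VTE terminals back to back, then run gtk_main()."""
import os
import sys


def spawn_shell(terminal, pty_flags, spawn_flags):
    """Start a shell in `terminal` and return whatever spawn_sync returns."""
    return terminal.spawn_sync(
        pty_flags,
        os.environ.get("HOME", "/"),           # working directory
        [os.environ.get("SHELL", "/bin/sh")],  # argv
        [],                                    # no extra environment variables
        spawn_flags,
        None,                                  # child_setup
        None,                                  # child_setup_data
    )


def main(tab_count):
    # Imported lazily so the helper above can be loaded without GTK/VTE installed.
    import gi
    gi.require_version("Gtk", "3.0")
    gi.require_version("Vte", "2.91")
    from gi.repository import GLib, Gtk, Vte

    window = Gtk.Window(title="vte spawn_sync stress")
    window.connect("destroy", Gtk.main_quit)
    notebook = Gtk.Notebook()
    window.add(notebook)

    # The spawns happen in quick succession, before the main loop starts.
    for i in range(tab_count):
        terminal = Vte.Terminal()
        notebook.append_page(terminal, Gtk.Label(label=f"tab {i}"))
        spawn_shell(terminal, Vte.PtyFlags.DEFAULT, GLib.SpawnFlags.DEFAULT)

    window.show_all()
    Gtk.main()


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: vte_spawn_stress.py TAB_COUNT", file=sys.stderr)
    else:
        main(int(sys.argv[1]))
```

Running it as `python3 vte_spawn_stress.py 20` mirrors the 15-20 spawns mentioned above.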
Other contributors and I have tried several times to minimize the repro (to get a small standalone script that triggers the segfault), but that has proven difficult; the best current repro steps are:
- install Guake, run it
- check that the "Automatically save session ..." preferences item is enabled
- spawn 20 tabs
- exit Guake
- run Guake again
Sometimes it doesn't crash at all. At other times it crashes with GTK assertion violations. Most of the time, though, it's just SIGSEGV. Given how much the outcome varies between runs, I suspect that concurrency is involved and that there is a data race.
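To put rough numbers on "sometimes" and "most of the time", a small harness along these lines could launch the same command repeatedly and tally how each run ends. This is only a sketch: the function names, the 20 runs and the 15-second timeout are assumptions, and it presumes a POSIX system, a session with 20 tabs already saved, and no other Guake instance running.

```python
#!/usr/bin/env python3
"""Sketch: launch a command repeatedly and tally how each run ends."""
import collections
import signal
import subprocess
import sys


def classify(returncode):
    """Map a subprocess return code to a short label."""
    if returncode is None:
        return "survived"  # still running when the timeout expired
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def run_once(cmd, timeout):
    """Run cmd; return its return code, or None if it outlived `timeout` seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return None  # subprocess.run has already killed the child


def tally(cmd, runs=20, timeout=15.0):
    """Run cmd `runs` times and count the outcomes by label."""
    return collections.Counter(
        classify(run_once(cmd, timeout)) for _ in range(runs)
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    for outcome, count in tally(sys.argv[1:]).most_common():
        print(f"{count:3d}  {outcome}")
```

Invoked as `python3 crash_rate.py guake`, it prints one line per distinct outcome. If the launcher is a shell wrapper that does not exec the real process, a segfault may show up as "exit 139" rather than "SIGSEGV".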