Spurious crashes and warnings when creating shadows for StIcons
In our (heavily) modified version of the Shell at Endless, I have come across a situation where CRITICAL warnings spill into the output and the shell can even crash, all related to shadows for StIcons.
When that happens, this is what I usually get in the output:
(gnome-shell:11125): St-CRITICAL **: _st_paint_shadow_with_opacity: assertion 'shadow_spec != NULL' failed
(gnome-shell:11125): St-CRITICAL **: _st_paint_shadow_with_opacity: assertion 'shadow_spec != NULL' failed
(gnome-shell:11125): GLib-ERROR **: ../../../../glib/gmem.c:130: failed to allocate 18446744072098939136 bytes
== Stack trace for context 0x5585f67a6170 ==
#0 0x5585f6cc4508 i resource:///org/gnome/shell/ui/panel.js:887 (0x7fd80dd11890 @ 27)
#1 0x7fffceacf370 I self-hosted:915 (0x7fd8283ee5e8 @ 367)
#2 0x7fffceacf3f0 I resource:///org/gnome/gjs/modules/signals.js:126 (0x7fd8283e2b38 @ 386)
#3 0x5585f6cc4468 i resource:///org/gnome/shell/ui/overview.js:815 (0x7fd828157c48 @ 88)
#4 0x7fffceacffc0 I resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7fd8283c2bc0 @ 71)
#5 0x5585f6cc43c8 i resource:///org/gnome/shell/ui/overview.js:804 (0x7fd828157bc0 @ 278)
#6 0x7fffcead0b90 I resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7fd8283c2bc0 @ 71)
#7 0x5585f6cc42f0 i resource:///org/gnome/shell/ui/windowManager.js:2026 (0x7fd80ddd7670 @ 829)
#8 0x7fffcead1770 I resource:///org/gnome/gjs/modules/_legacy.js:82 (0x7fd8283c2bc0 @ 71)
#9 0x7fffcead17d0 I self-hosted:917 (0x7fd8283ee5e8 @ 394)
Trace/breakpoint trap (core dumped)
The "failed to allocate 18446744072098939136 bytes" message suggests something is trying to allocate far too much memory, and the backtrace from gdb looks like this:
#0 0x00007ffff70770f1 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1 0x00007ffff7078128 in g_log_default_handler () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x0000555555556845 in default_log_handler (log_domain=0x7ffff70ba50e "GLib", log_level=6, message=0x7fffc0004f40 "../../../../glib/gmem.c:130: failed to allocate 18446744072098939136 bytes", data=0x0) at ../src/main.c:315
#3 0x00007ffff7078344 in g_logv () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4 0x00007ffff707854f in g_log () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5 0x00007ffff7076d24 in g_malloc0 () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#6 0x00007ffff4afaf1b in blur_pixels (pixels_in=0x555559207650 "", width_in=16, height_in=16, rowstride_in=16, blur=33554432.000000253, width_out=0x7fffffff99d4, height_out=0x7fffffff99d0, rowstride_out=0x7fffffff99cc)
at ../src/st/st-private.c:280
#7 0x00007ffff4afb2fe in _st_create_shadow_pipeline (shadow_spec=0x7fffc82b3930, src_texture=0x5555566feec0) at ../src/st/st-private.c:372
#8 0x00007ffff4afb5d3 in _st_create_shadow_pipeline_from_actor (shadow_spec=0x7fffc82b3930, actor=0x555559299ca0) at ../src/st/st-private.c:434
#9 0x00007ffff4af7740 in st_icon_update_shadow_pipeline (icon=0x555556be3af0) at ../src/st/st-icon.c:278
#10 0x00007ffff4af7844 in st_icon_finish_update (icon=0x555556be3af0) at ../src/st/st-icon.c:310
#11 0x00007ffff4af7a93 in st_icon_update (icon=0x555556be3af0) at ../src/st/st-icon.c:381
#12 0x00007ffff4af747f in st_icon_style_changed (widget=0x555556be3af0) at ../src/st/st-icon.c:206
[...]
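As a side note on the size in the GLib-ERROR message: it is exactly what you get when a negative 32-bit integer is widened to an unsigned 64-bit size, which fits a size computation overflowing somewhere before g_malloc0(). The sketch below only illustrates that arithmetic; it is not the code in blur_pixels(), the helper names are hypothetical, and a 64-bit size type is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen a signed 32-bit byte count to an unsigned
 * 64-bit one, mimicking what happens when a negative int ends up in a
 * 64-bit size parameter. */
uint64_t
widen_to_size (int32_t n_bytes)
{
  return (uint64_t) (int64_t) n_bytes;
}

/* 2^64 - 1610612480 is the value printed in the GLib-ERROR line. */
void
check_against_log (void)
{
  assert (widen_to_size (-1610612480) == UINT64_C (18446744072098939136));
}
```

So the request is most likely a garbage (negative) size rather than a genuine attempt to allocate that much memory.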
After a debugging session, I found that the value of shadow_spec->blur passed to blur_pixels() looks completely wrong (33554432.000000253 in this particular case), which is no surprise since the shadow_spec object appears to be technically dead (ref_count == 0) at that point:
(gdb) p *shadow_spec
$3 = {color = {red = 0 '\000', green = 0 '\000', blue = 0 '\000', alpha = 0 '\000'}, xoffset = 0, yoffset = 0, blur = 33554432.000000253, spread = 33554440.1875, inset = 1098907648, ref_count = 0}
After some investigation, I think all we need to silence the warnings is an extra NULL check for priv->shadow_spec in st-icon.c (as is done everywhere else in that file).
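To make the idea concrete, here is a minimal, self-contained sketch of that kind of guard. It does not use the real St types: MockShadow, MockIconPrivate, mock_paint_shadow() and mock_icon_paint() are hypothetical stand-ins, with mock_paint_shadow() playing the role of a paint helper that asserts on a NULL spec.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the shadow spec; only one field is needed. */
typedef struct {
  double blur;
} MockShadow;

/* Hypothetical stand-in for the icon's private data. */
typedef struct {
  MockShadow *shadow_spec;   /* may legitimately be NULL */
  int         paint_calls;   /* counts how often the shadow was painted */
} MockIconPrivate;

/* Stand-in for a paint helper that requires a non-NULL spec. */
static void
mock_paint_shadow (MockIconPrivate *priv)
{
  assert (priv->shadow_spec != NULL);
  priv->paint_calls++;
}

/* Paint path with the extra NULL check: the shadow is only painted
 * when a spec is actually set. */
void
mock_icon_paint (MockIconPrivate *priv)
{
  if (priv->shadow_spec != NULL)
    mock_paint_shadow (priv);
}
```

With the check in place, an icon without a shadow spec simply skips the shadow instead of tripping the assertion.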
For the crash, I think the problem is that the call to clutter_actor_get_allocation_box() in _st_create_shadow_pipeline_from_actor() causes a re-layout that invalidates the shadow_spec at that precise moment. I base this theory on debug prints I added there: shadow_spec is consistently valid right up until the call to clutter_actor_get_allocation_box(), and whenever the problem happens clutter_actor_has_allocation() consistently returns FALSE for that actor.
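If that theory is right, the failure mode is a callee dropping the last reference to an object the caller is still using. The sketch below models this with hypothetical mock types (MockShadow, MockIcon, mock_relayout() and friends are not St or Clutter API) and shows one possible guard, where the caller holds its own reference across the risky call. It is an illustration of the pattern under those assumptions, not a tested fix for the Shell.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ref-counted spec, loosely modelled on the gdb dump. */
typedef struct {
  double blur;
  int    ref_count;
} MockShadow;

typedef struct {
  MockShadow *shadow_spec;
} MockIcon;

MockShadow *
mock_shadow_new (double blur)
{
  MockShadow *s = malloc (sizeof *s);
  if (s == NULL)
    abort ();
  s->blur = blur;
  s->ref_count = 1;
  return s;
}

MockShadow *
mock_shadow_ref (MockShadow *s)
{
  s->ref_count++;
  return s;
}

void
mock_shadow_unref (MockShadow *s)
{
  if (--s->ref_count == 0)
    free (s);
}

/* Stand-in for a call that triggers a re-layout, during which the icon
 * drops its reference to the current spec. */
void
mock_relayout (MockIcon *icon)
{
  mock_shadow_unref (icon->shadow_spec);
  icon->shadow_spec = NULL;
}

/* The caller takes its own reference first, so the spec stays alive
 * even though the icon lets go of it during the re-layout. */
double
mock_create_pipeline (MockIcon *icon)
{
  assert (icon->shadow_spec != NULL);
  MockShadow *spec = mock_shadow_ref (icon->shadow_spec);
  mock_relayout (icon);
  double blur = spec->blur;   /* still valid: we hold a reference */
  mock_shadow_unref (spec);
  return blur;
}
```

Without the extra reference, the read of spec->blur after mock_relayout() would touch freed memory, which matches the garbage blur value seen in gdb.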