Hybrid GPU: plugging in *2* external monitors immediately crashes
The crash occurs while running in hybrid GPU mode with the following stack:
- mesa: master at a5053ba27e
- mutter: master at
- gnome-shell: master at 0b51ead00
- gnome-shell-extensions: master at ae65a82 (not strictly necessary but to keep dpkg happy)
with the following basic setup:
-[0000:00]-+-00.0 Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5910]
+-01.0-[01]----00.0 NVIDIA Corporation GP106M [GeForce GTX 1060]
Linux uini 4.14.0-3-amd64 #1 SMP Debian 4.14.13-1 (2018-01-14) x86_64 GNU/Linux (buster/sid)
with the dGPU powered up (echo ON > /proc/acpi/bbswitch) and started prior to opening a session (systemctl stop gdm3.service ; modprobe nouveau), and with one external monitor plugged into each of the dGPU's HDMI and DisplayPort outputs (both screens initially blank but not complaining of a lack of signal)
Action:
- start gdm3
systemctl start gdm3.service
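The setup and action steps above can be collected into one script. This is only a sketch built from the commands quoted in this report: it assumes root privileges, a loaded bbswitch module, and both external monitors already connected. The DRY_RUN variable and the run/repro helpers are hypothetical additions for illustration, and the command order simply follows the order in which the report lists them.

```shell
#!/bin/sh
# Reproduction sketch assembled from the commands quoted in this report.
# DRY_RUN, run() and repro() are illustrative helpers, not part of the
# original steps. With DRY_RUN=1 (the default) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Print the command instead of executing it.
        printf '%s\n' "$*"
    else
        # Execute through sh -c so that redirections are honoured.
        sh -c "$*"
    fi
}

repro() {
    # Power up the dGPU through bbswitch.
    run "echo ON > /proc/acpi/bbswitch"
    # Stop the display manager so that no session is open.
    run "systemctl stop gdm3.service"
    # Load the nouveau driver for the dGPU.
    run "modprobe nouveau"
    # Action: start GDM; the crash is expected shortly afterwards.
    run "systemctl start gdm3.service"
}

repro
```

Run as is, the script only prints the four commands. Setting DRY_RUN=0 as root executes them for real, which will end any running graphical session.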
Expected:
- the GDM3 login screen appears on one monitor, and the GDM background pattern is shown on the other monitors
Obtained:
- the GDM background pattern appears on all monitors (correct)
- a faint rendering of the Debian logo appears on the internal monitor (as if it were an early frame of the initial animation)
- gdm3 dumps core
The following backtrace was obtained from the core dump:
#0 0x00007f97a3fc6000 in raise (sig=5) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x000056427e3829fb in dump_gjs_stack_on_signal_handler (signo=5) at ../src/main.c:367
#2 0x00007f97a3fc6160 in <signal handler called> () at /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007f97a5d5d6f1 in _g_log_abort (breakpoint=1) at ../../../../glib/gmessages.c:583
#4 0x00007f97a5d5e72c in g_log_default_handler (log_domain=<optimized out>, log_level=<optimized out>, message=<optimized out>, unused_data=<optimized out>) at ../../../../glib/gmessages.c:3104
#5 0x000056427e382ad5 in default_log_handler (log_domain=log_domain@entry=0x7f97a42e010d "mutter", log_level=log_level@entry=6, message=message@entry=0x56427ff4aca0 "Connection to xwayland lost", data=data@entry=0x0) at ../src/main.c:310
#6 0x00007f97a5d5e9bd in g_logv (log_domain=0x7f97a42e010d "mutter", log_level=G_LOG_LEVEL_ERROR, format=<optimized out>, args=args@entry=0x7ffe6ad870e0) at ../../../../glib/gmessages.c:1370
#7 0x00007f97a5d5eb2f in g_log (log_domain=log_domain@entry=0x7f97a42e010d "mutter", log_level=log_level@entry=G_LOG_LEVEL_ERROR, format=format@entry=0x7f97a42f1110 "Connection to xwayland lost") at ../../../../glib/gmessages.c:1432
#8 0x00007f97a42a54fe in x_io_error (display=<optimized out>) at wayland/meta-xwayland.c:418
#9 0x00007f97a29c42de in _XIOError () at /usr/lib/x86_64-linux-gnu/libX11.so.6
#10 0x00007f97a29c2323 in _XReply () at /usr/lib/x86_64-linux-gnu/libX11.so.6
#11 0x00007f97a29bdb1d in XSync () at /usr/lib/x86_64-linux-gnu/libX11.so.6
#12 0x00007f97a350c2a6 in () at /usr/lib/x86_64-linux-gnu/libgdk-3.so.0
#13 0x00007f97a34de6cf in () at /usr/lib/x86_64-linux-gnu/libgdk-3.so.0
#14 0x00007f97a427abb9 in meta_screen_free (screen=0x56427fbd6ea0 [MetaScreen], timestamp=0) at core/screen.c:858
#15 0x00007f97a426ba56 in meta_display_close (display=0x56427fdbce00 [MetaDisplay], timestamp=0) at core/display.c:1133
#16 0x00007f97a42764a0 in meta_finalize () at core/main.c:296
#17 0x00007f97a42764a0 in meta_run () at core/main.c:650
#18 0x000056427e38241c in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.c:525
The following traces are visible in dmesg:
[157113.588426] nouveau 0000:01:00.0: gr: TRAP ch 2 [017fb82000 systemd-logind[3020]]
[157113.588433] nouveau 0000:01:00.0: gr: GPC0/TPC0/TEX: 80000041
[157113.588436] nouveau 0000:01:00.0: gr: GPC0/TPC1/TEX: 80000041
[157113.588439] nouveau 0000:01:00.0: gr: GPC0/TPC2/TEX: 80000041
[157113.588442] nouveau 0000:01:00.0: gr: GPC0/TPC3/TEX: 80000041
[157113.588445] nouveau 0000:01:00.0: gr: GPC0/TPC4/TEX: 80000041
[157113.588449] nouveau 0000:01:00.0: gr: GPC1/TPC0/TEX: 80000041
[157113.588451] nouveau 0000:01:00.0: gr: GPC1/TPC1/TEX: 80000041
[157113.588454] nouveau 0000:01:00.0: gr: GPC1/TPC2/TEX: 80000041
[157113.588457] nouveau 0000:01:00.0: gr: GPC1/TPC3/TEX: 80000041
[157113.588460] nouveau 0000:01:00.0: gr: GPC1/TPC4/TEX: 80000041
[157113.588467] nouveau 0000:01:00.0: fifo: read fault at 0008000000 engine 00 [GR] client 01 [GPC0/T1_0] reason 02 [PTE] on channel 2 [017fb82000 systemd-logind[3020]]
[157113.588472] nouveau 0000:01:00.0: fifo: channel 2: killed
[157113.588473] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
[157113.588476] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
[157113.588480] nouveau 0000:01:00.0: systemd-logind[3020]: channel 2 killed!
Then, once the 120-second hung-task timeout has expired (about 213 seconds later by the dmesg timestamps):
[157326.368056] INFO: task kworker/u16:5:378 blocked for more than 120 seconds.
[157326.368059] Tainted: G U O 4.14.0-3-amd64 #1 Debian 4.14.13-1
[157326.368060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[157326.368061] kworker/u16:5 D 0 378 2 0x80000000
[157326.368103] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
[157326.368104] Call Trace:
[157326.368109] ? __schedule+0x28e/0x880
[157326.368110] schedule+0x28/0x80
[157326.368111] schedule_timeout+0x1f3/0x360
[157326.368115] ? ttm_bo_mem_compat+0x23/0x60 [ttm]
[157326.368117] ? dma_fence_default_wait+0x1f6/0x280
[157326.368117] dma_fence_default_wait+0x1f6/0x280
[157326.368119] ? dma_fence_release+0x90/0x90
[157326.368120] dma_fence_wait_timeout+0x33/0xe0
[157326.368124] drm_atomic_helper_wait_for_fences+0x5d/0xc0 [drm_kms_helper]
[157326.368142] nv50_disp_atomic_commit_tail+0x55/0x3a80 [nouveau]
[157326.368145] process_one_work+0x185/0x380
[157326.368146] worker_thread+0x2e/0x390
[157326.368147] ? process_one_work+0x380/0x380
[157326.368148] kthread+0x118/0x130
[157326.368149] ? kthread_create_on_node+0x70/0x70
[157326.368150] ret_from_fork+0x1f/0x30
[157326.368153] INFO: task kworker/u16:12:1551 blocked for more than 120 seconds.
[157326.368154] Tainted: G U O 4.14.0-3-amd64 #1 Debian 4.14.13-1
[157326.368155] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[157326.368155] kworker/u16:12 D 0 1551 2 0x80000000
[157326.368173] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
[157326.368173] Call Trace:
[157326.368175] ? __schedule+0x28e/0x880
[157326.368176] schedule+0x28/0x80
[157326.368177] schedule_timeout+0x1f3/0x360
[157326.368187] ? nvkm_ioctl+0x100/0x240 [nouveau]
[157326.368196] ? nvif_notify_get+0x93/0xa0 [nouveau]
[157326.368197] ? dma_fence_default_wait+0x1f6/0x280
[157326.368198] dma_fence_default_wait+0x1f6/0x280
[157326.368198] ? dma_fence_release+0x90/0x90
[157326.368199] dma_fence_wait_timeout+0x33/0xe0
[157326.368203] drm_atomic_helper_wait_for_fences+0x5d/0xc0 [drm_kms_helper]
[157326.368219] nv50_disp_atomic_commit_tail+0x55/0x3a80 [nouveau]
[157326.368220] process_one_work+0x185/0x380
[157326.368221] worker_thread+0x2e/0x390
[157326.368222] ? process_one_work+0x380/0x380
[157326.368223] kthread+0x118/0x130
[157326.368223] ? kthread_create_on_node+0x70/0x70
[157326.368224] ret_from_fork+0x1f/0x30
(which repeats)
/tmp/mutter--debug-log shows a couple of interesting bits:
VERBOSE: Binding monitor 0x55da6cee4280/0x049a (0, 0, 2560, 1440) x 60,004940
VERBOSE: Binding monitor 0x55da6cee42d0/227E4LH (2560, 0, 1920, 1080) x 60,000496
VERBOSE: Binding monitor 0x55da6cee4320/227E4LH (4480, 0, 1920, 1080) x 60,000496 # (correct, two monitors of the same make/model)
......
VERBOSE: Setting _NET_WM_STATE with 2 atoms
VERBOSE: Setting _GTK_EDGE_CONSTRAINTS to 170
WORKAREA: Running work area hint computation function
WORKAREA: Computed work area for workspace 1: 0,0 6400 x 1440
WORKAREA: Computed work area for workspace 1 monitor 0: 0,0 2560 x 1440
WORKAREA: Computed work area for workspace 1 monitor 1: 2560,0 1920 x 1080
WORKAREA: Computed work area for workspace 1 monitor 2: 4480,0 1920 x 1080
WORKAREA: Computed work area for workspace 2: 0,0 6400 x 1440
WORKAREA: Computed work area for workspace 2 monitor 0: 0,0 2560 x 1440
WORKAREA: Computed work area for workspace 2 monitor 1: 2560,0 1920 x 1080
WORKAREA: Computed work area for workspace 2 monitor 2: 4480,0 1920 x 1080
WORKAREA: Computed work area for workspace 3: 0,0 6400 x 1440
WORKAREA: Computed work area for workspace 3 monitor 0: 0,0 2560 x 1440
WORKAREA: Computed work area for workspace 3 monitor 1: 2560,0 1920 x 1080
WORKAREA: Computed work area for workspace 3 monitor 2: 4480,0 1920 x 1080
......
# last line received:
......
STACK: Stack op event received: RAISE_ABOVE(0xe00001, 0xa00001; 474)
STACK: MetaStackTracker state
xserver_serial: 347
verified_stack: 0x200013 0x200001 0x200002 0x200003 0x200004 0x200005 0x200006 0x200007 0x200008 0x20000d 0x20000e (gnome-shel) 0x200012 0x200014 0xc00001 0x1000001 0x1400001 0x800001 0x200017 0xa00001 0xe00001 0x800002 0x1200001 0xc00002 0x600001
unverified_predictions: []