Hi,
I am experiencing a kernel/GPU crash once every few days on an Nvidia
Optimus system with a secondary display connected to an Nvidia output.
The secondary display turns off suddenly, X freezes, and in most cases
the kernel hangs.
The module parameter `nouveau.config=NvClkMode=15` is in use, but I get
the same behavior without it.
I have captured a variety of log data, and these three messages appear
consistently:
- Asynchronous wait on fence nouveau:systemd-logind
- nouveau 0000:01:00.0: tmr: stalled at ffffffffffffffff
- Fixing recursive fault but reboot is needed!
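
For anyone who wants to check their own logs for the same three
signatures, here is a minimal sketch in Python. The function name
`count_signatures` and the regular expressions are illustrative
assumptions built only from the message text quoted above; they are not
part of nouveau or any existing tool.

```python
import re

# Patterns derived from the three messages quoted above (assumed wording;
# the PCI address and the stalled value are matched loosely).
SIGNATURES = {
    "fence_wait": re.compile(r"Asynchronous wait on fence nouveau:"),
    "tmr_stalled": re.compile(r"nouveau [0-9a-f:.]+: tmr: stalled at [0-9a-f]+"),
    "recursive_fault": re.compile(r"Fixing recursive fault but reboot is needed!"),
}


def count_signatures(log_text):
    """Return how many lines of log_text match each signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

The function takes plain text, so it could be fed saved kernel log
output, for example from `journalctl -k -b -1` (kernel messages from the
previous boot) on a system with a persistent journal.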
This is a regression; nouveau was stable with Debian 10 buster (Linux
v4.19). The crashes started after upgrading to Debian 11 bullseye. I
have tested Linux v4.14.290, v4.19.255, and the latest nouveau-next
commit 9622bcb7c72b230d64b7f7d2f9505e17214f3597; all exhibit the same
behavior (with some variation in log output).
Since the older kernels I tested now crash as well, could a userspace
change be triggering the kernel crash? Should I try different versions
of the libdrm and xf86-video-nouveau userspace components?
I posted more information and log data at:
<https://gitlab.freedesktop.org/drm/nouveau/-/issues/180>
Thanks,
Owen