On 2023-12-16 18:36, Holger Hoffstätte wrote: <snip>
The affected machine is an older SandyBridge desktop with a fanless r600 Redwood GPU, using the radeon driver. "Recently" - some time after the last few 6.6.x stable updates - it started to die with GPU lockups. I first blamed this on standby/resume - because why not? - but this turned out to be wrong; the real culprit is DPMS. I use xfce-power-manager as a "screensaver" to turn off the display after inactivity. This can be configured in two ways: "suspend" and "poweroff". I've been using "poweroff" since forever without problems, until now. The symptom is that everything works fine until the screensaver kicks in and tries to turn the monitor off, which sends the radeon driver and the GPU into a complete tailspin.
<snip>
Eventually the screensaver tries to switch off the monitor via the DPMS "poweroff" method and this greatly upsets the GPU:

Dec 12 20:39:59 ragnarok kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10140msec
Dec 12 20:39:59 ragnarok kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000000002 last fence id 0x0000000000000003 on ring 0)
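
For anyone trying to check their own logs for the same symptom, here is a minimal sketch that scans log lines for the two radeon messages quoted above. The regular expressions are only inferred from those two lines, and the names find_radeon_hangs and SAMPLE are made up for illustration; other kernel versions may format these messages differently.

```python
import re

# Patterns inferred from the two log lines quoted above (an assumption,
# not taken from the radeon driver source).
STALL_RE = re.compile(
    r"radeon (?P<dev>[0-9a-f:.]+): ring (?P<ring>\d+) "
    r"stalled for more than (?P<ms>\d+)msec"
)
LOCKUP_RE = re.compile(
    r"radeon (?P<dev>[0-9a-f:.]+): GPU lockup "
    r"\(current fence id (?P<cur>0x[0-9a-f]+) "
    r"last fence id (?P<last>0x[0-9a-f]+) on ring (?P<ring>\d+)\)"
)


def find_radeon_hangs(lines):
    """Return a list of tuples describing stall/lockup events.

    ("stall", ring, milliseconds) or ("lockup", ring, current_fence, last_fence)
    """
    events = []
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            events.append(("stall", int(m.group("ring")), int(m.group("ms"))))
            continue
        m = LOCKUP_RE.search(line)
        if m:
            events.append((
                "lockup",
                int(m.group("ring")),
                int(m.group("cur"), 16),
                int(m.group("last"), 16),
            ))
    return events


# The two lines from the report, used as sample input.
SAMPLE = [
    "Dec 12 20:39:59 ragnarok kernel: radeon 0000:01:00.0: "
    "ring 0 stalled for more than 10140msec",
    "Dec 12 20:39:59 ragnarok kernel: radeon 0000:01:00.0: "
    "GPU lockup (current fence id 0x0000000000000002 "
    "last fence id 0x0000000000000003 on ring 0)",
]

if __name__ == "__main__":
    for event in find_radeon_hangs(SAMPLE):
        print(event)
```

The same function could be fed the lines of a saved kernel log to see whether lockups line up with the times the display was switched off.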
In the meantime I have confirmed that all this is still more complicated: even using the "suspend" method only works after boot, not after a system suspend cycle. Yes, weird but reproducible.

I have tried to chase down the problematic release, and as suspected this started to happen with 6.6.5; 6.6.4 is fine. Based on this information I found the offending commits and reverted them in order from 6.6.7, which fixes everything for me:

b0399e22ada0 "drm/amd/display: Remove power sequencing check"
45f98fccb1f6 "drm/amd/display: Refactor edp power control"

Suggestions on how to proceed would be appreciated. I can report this to -stable and request reverts, but wanted to check with the list first.

Thanks,
Holger