On Thu, Nov 11, 2021 at 12:07:19AM +0100, Javier Martinez Canillas wrote:
> [ adding dri-devel mailing list as Cc ]
>
> Hello Ilya,
>
> On 11/10/21 21:02, Ilya Trukhanov wrote:
> > Suspend-to-RAM with elogind under Wayland stopped working in 5.15.
> >
> > This occurs with 5.15, 5.15.1 and latest master at
> > 89d714ab6043bca7356b5c823f5335f5dce1f930. 5.14 and earlier releases work
> > fine.
> >
> > git bisect gives d391c58271072d0b0fad93c82018d495b2633448.
> >
>
> That's strange because this patch is just moving code around, there shouldn't
> be any functional changes...
>
> > To reproduce:
> > - Use elogind and Linux 5.15.1 with CONFIG_SYSFB_SIMPLEFB=n.
> > - Start a Wayland session. I tested sway and weston, neither worked.
> > - In a terminal emulator (I used alacritty) execute `loginctl suspend`.
> >
> > Normally after the last step the system would suspend, but it no longer
> > does so after I upgraded to Linux 5.15. After running `loginctl suspend`
> > in dmesg I get the following:
> > [ 103.098782] elogind-daemon[2357]: Suspending system...
> > [ 103.098794] PM: suspend entry (deep)
> > [ 103.124621] Filesystems sync: 0.025 seconds
> >
> > But nothing happens afterwards.
> >
> > Suspend works as expected if I do any of the following:
> > - Revert d391c58271072d0b0fad93c82018d495b2633448.
> > - Build with CONFIG_SYSFB_SIMPLEFB=y.
>
> Can you please share the kernel boot log for any of these cases too ?

revert dmesg: https://pastebin.com/BpnMvV2u
CONFIG_SYSFB_SIMPLEFB=y dmesg: https://pastebin.com/qSUdQygt

>
> > - Suspend from tty, even if a Wayland session is running in parallel.
> > - Suspend from under an X11 session.
> > - Suspend with `echo mem > /sys/power/state`.
> >
> > If I attach strace to the elogind-daemon process after running
> > `loginctl suspend` then the system immediately suspends. However, if
> > I attach strace *prior* to running `loginctl suspend` then no suspend,
> > and the process gets stuck on a write syscall to `/sys/power/state`.
> >
> > I "traced" a little bit with printk (sorry, I don't know of a better
> > way) and the call chain is as follows:
> > state_store -> pm_suspend -> enter_state -> suspend_prepare
> > -> pm_prepare_console -> vt_move_to_console -> vt_waitactive
> > -> __vt_event_wait
> >
> > __vt_event_wait just waits until wait_event_interruptible completes, but
> > it never does (not until I attach to elogind-daemon with strace, at
> > least). I did not follow the chain further.
> >
> > - Linux version 5.15.1 (lahvuun@lahvuun) (gcc (Gentoo 11.2.0 p1) 11.2.0,
> >   GNU ld (Gentoo 2.37_p1 p0) 2.37) #51 SMP PREEMPT Tue Nov 9 23:39:25
> >   EET 2021
> > - Gentoo Linux 2.8
> > - x86_64 AuthenticAMD
> > - dmesg: https://pastebin.com/duj33bY8
> > - .config: https://pastebin.com/7Hew1g0T
> >
>
> Looking at your .config and dmesg output, my guess is that it is related to
> the fact that you have both CONFIG_FB_EFI=y and CONFIG_DRM_AMDGPU=y.
>
> The code that adds the "efi-framebuffer" platform device used to be in the
> arch/x86/kernel/sysfb.c file but now is in drivers/firmware/sysfb.c, and it
> could affect the order in which the device <--> driver matching happens.
>
> From your kernel boot log:
>
> ...
> [ 0.375796] [drm] amdgpu kernel modesetting enabled.
> [ 0.375819] amdgpu: CRAT table disabled by module option
> [ 0.375823] amdgpu: Virtual CRAT table created for CPU
> [ 0.375831] amdgpu: Topology: Add CPU node
> [ 0.375865] amdgpu 0000:0a:00.0: vgaarb: deactivate vga console
> [ 0.375911] [drm] initializing kernel modesetting (VEGA10 0x1002:0x687F 0x1DA2:0xE376 0xC3).
> ...
> [ 0.868997] fbcon: amdgpu (fb0) is primary device
> [ 1.004397] Console: switching to colour frame buffer device 240x67
> [ 1.017815] amdgpu 0000:0a:00.0: [drm] fb0: amdgpu frame buffer device
> ...
> [ 1.133997] efifb: probing for efifb
> [ 1.134716] efifb: framebuffer at 0xe0000000, using 8100k, total 8100k
> [ 1.135438] efifb: mode is 1920x1080x32, linelength=7680, pages=1
> [ 1.136180] efifb: scrolling: redraw
> [ 1.136891] efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
> [ 1.137638] fb1: EFI VGA frame buffer device
>
> Usually efifb is there to provide early framebuffer output before the native
> DRM driver probes, but in your case it is the opposite. This wouldn't happen
> if the amdgpu driver was built as a module.
>
> Probably before the mentioned commit, the efifb driver was probed earlier and
> then the amdgpu driver would have removed the conflicting efifb framebuffer
> before registering its DRM device. But that doesn't happen here and the efifb
> framebuffer is still around since it is registered after the one for amdgpu.
>
> Which would explain why it also works with CONFIG_SYSFB_SIMPLEFB=y for you,
> since in that case a "simple-framebuffer" platform device is added instead of
> an "efi-framebuffer". But since neither CONFIG_FB_SIMPLE nor
> CONFIG_DRM_SIMPLEDRM are enabled in your kernel config, no device driver will
> match that device.
>
> This is just a guess though. Would be good if you could test the following
> cases:
>
> 1) CONFIG_FB_EFI not set

/proc/fb:
0 amdgpu

dmesg: https://pastebin.com/c1BcWLEh

Suspend-to-RAM works.

> 2) CONFIG_FB_EFI=y and CONFIG_DRM_AMDGPU=m

/proc/fb before `modprobe amdgpu`:
0 EFI VGA

after:
0 amdgpu

dmesg: https://pastebin.com/vSsTw2Km

Suspend-to-RAM works.

> 3) CONFIG_SYSFB_SIMPLEFB=y and CONFIG_FB_SIMPLE=y

/proc/fb:
0 amdgpu
1 simple

dmesg: https://pastebin.com/ZSXnpLqQ

Suspend-to-RAM fails.

>
> And for each check /proc/fb, the kernel boot log, and if Suspend-to-RAM works.
>
> If the explanation above is correct, then I would expect (1) and (2) to work
> and (3) to also fail.
>
> Best regards,
> --
> Javier Martinez Canillas
> Linux Engineering
> Red Hat
>
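
In case it helps anyone repeating the three test cases, here is a rough
sketch of how the /proc/fb check could be scripted. It only parses the
"index name" lines shown above and flags the situation where a second
framebuffer is registered next to the native one. The function names are
made up for illustration, and the "amdgpu" default is just the name that
appears in my /proc/fb output; it is not a diagnosis of the bug itself.

```python
from pathlib import Path


def parse_proc_fb(text):
    """Parse /proc/fb content into a list of (index, name) tuples.

    Each non-empty line is expected to look like "0 amdgpu" or "1 EFI VGA".
    """
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split only on the first space: names such as "EFI VGA" contain spaces.
        index, _, name = line.partition(" ")
        devices.append((int(index), name))
    return devices


def has_extra_framebuffer(devices, native="amdgpu"):
    """Return True if the native driver's fb coexists with any other fb.

    `native` is an assumed substring of the native driver's fb name.
    """
    names = [name for _, name in devices]
    native_present = any(native in name for name in names)
    other_present = any(native not in name for name in names)
    return native_present and other_present


if __name__ == "__main__":
    fb_path = Path("/proc/fb")
    if fb_path.exists():
        found = parse_proc_fb(fb_path.read_text())
        for index, name in found:
            print(f"fb{index}: {name}")
        if has_extra_framebuffer(found):
            print("a second framebuffer is registered next to the native one")
```

With the outputs above, case (3) ("0 amdgpu" plus "1 simple") is flagged,
while cases (1) and (2) are not.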