resume from suspend to RAM not working properly with / on btrfs

Hi,

TL;DR: I have problems resuming from suspend to RAM on my new Ryzen computer. I have only seen this happen with root on a btrfs subvolume, never with root on ext4. The proprietary nvidia driver seems to be one additional factor, but I have also seen the problem with the nvidia driver removed. Has anyone experienced something similar?


I have upgraded my PC with a Ryzen 7 3700X, an Asus ROG STRIX x570-E Gaming and a Samsung 970 EVO PLUS NVMe SSD. The existing Fedora 32 system on my old SATA SSD worked flawlessly, including suspend to RAM. For the last few years I have always used ext4 (previously ext3) on monolithic root partitions (no separate /boot or /home, but separate data partitions) on MBR-partitioned SATA disks, booting in legacy BIOS mode.

With the new drive I wanted to switch to GPT, UEFI boot and btrfs (with an ext4 /boot). I didn't want to install a new system (and then spend months discovering programs I have not yet installed) but opted to just copy over my old F32 system. So I used gparted to set up the GPT with a 200MiB EFI System partition, three 500MiB boot partitions (for different distributions or Fedora versions), a 16GiB swap partition and the rest of the drive as btrfs. In the btrfs volume I created a fedora32 subvolume with a nested home subvolume.

I then mounted everything (/, /boot, /boot/efi) on my old F30 system and copied over everything from my F32 partition. After bind mounting /sys, /proc and /dev I chrooted into the new copy, adjusted the fstab, installed all EFI-related packages, ran grub2-mkconfig and made sure the kernel paths in /boot/loader/entries were correct. I then switched to the rescue mode of an F32 netinstall USB drive booted in UEFI mode (to get access to the efivars) to install grub with target x86_64-efi and regenerate the initrds.

After that everything booted up and seemed to work until I tried suspend to RAM. The machine went to sleep properly, but resume did not complete. After waking up it just kept displaying the last four kernel messages of the suspend action (suspending processes, ..., suspending terminal). It reacted to the emergency sync SysRq (HDD LED blinking), but the other SysRq keys did not seem to work ("u" also sometimes made the LED blink). This happened from within KDE as well as from a text terminal with systemctl suspend.
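
For reference, the bind-mount and regeneration steps described above can be sketched roughly as follows. This is only an outline under assumptions, not a record of the exact commands used: the target directory, the grub.cfg path and the helper names (run, prepare_chroot, regenerate_boot_files) are made up for illustration, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/usr/bin/env bash
# Sketch of the chroot preparation; prints commands by default (DRY_RUN=1).
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# prepare_chroot TARGET: bind mount /sys, /proc and /dev into the copy.
# TARGET (e.g. /mnt/new) is a hypothetical mount point of the new root.
prepare_chroot() {
    local target="$1" fs
    for fs in sys proc dev; do
        run mount --bind "/$fs" "$target/$fs"
    done
}

# regenerate_boot_files TARGET GRUB_CFG: rebuild grub.cfg and all initrds
# inside the copy. GRUB_CFG is passed in because its location differs
# between BIOS and UEFI installs; the value to use is an assumption.
regenerate_boot_files() {
    local target="$1" grub_cfg="$2"
    run chroot "$target" grub2-mkconfig -o "$grub_cfg"
    run chroot "$target" dracut --force --regenerate-all
}
```

Calling "prepare_chroot /mnt/new" in dry-run mode prints the three mount commands, so they can be reviewed before running them for real as root.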
Log files after reboot only had entries until shortly before suspend (processes suspended, all except the last CPU core disabled, unneeded drives stopped) but nothing from the attempt to resume. I assumed the NVMe SSD was the cause and unsuccessfully tried some suggested solutions that have worked for others with suspend problems on NVMe SSDs (disabling acpiphp, disabling d3cold_allowed).

Since I had too many variables I trashed the content of the new SSD and started anew with an MBR partition table to boot in legacy BIOS mode. I simply cloned the original F32 partition to the NVMe SSD, adjusted the fstab, updated grub.cfg, recreated the initrds and installed grub to the MBR, and everything worked, including suspend. I then made another copy with btrfs root (and ext4 /boot), this time on MBR with BIOS boot, and it again showed the previous suspend problem. No swap space this time.

I also did a new install of F32 (from the Everything netinstall with the Plasma Workspace profile) with btrfs root and ext4 /boot. It suspended and resumed correctly at first but failed to resume after I installed the proprietary nvidia driver for my graphics card. Removing the nvidia driver (and updating grub.cfg and the initrds) returned that install to a working state. I then also removed the nvidia driver on the second non-working copy of my old system (and checked that "lsmod | grep nvidia" shows nothing), but suspend still did not work. This time it did not show the kernel messages, just a black screen with a frozen mouse pointer. So the nvidia driver seems to be one way to trigger the problem, but there apparently are other ways to reach the non-working state.
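
As an illustration of the d3cold_allowed workaround mentioned above, here is a small sketch that reads the current setting for each NVMe controller from sysfs. The function name and the optional sysfs-root argument are hypothetical (the argument exists only so the function can be tried against a test directory), and it assumes the controller is a PCI device that exposes a d3cold_allowed attribute.

```shell
#!/usr/bin/env bash
# show_nvme_d3cold [SYSFS_ROOT]: print "<controller> <d3cold_allowed>" for
# every NVMe controller found. SYSFS_ROOT defaults to /sys.
show_nvme_d3cold() {
    local root="${1:-/sys}" ctrl attr
    for ctrl in "$root"/class/nvme/nvme*; do
        attr="$ctrl/device/d3cold_allowed"
        if [ -r "$attr" ]; then
            printf '%s %s\n' "${ctrl##*/}" "$(cat "$attr")"
        fi
    done
    return 0
}

# Disabling D3cold for one controller (as root, not persistent across
# reboots); nvme0 is an assumed controller name:
#   echo 0 > /sys/class/nvme/nvme0/device/d3cold_allowed
#
# acpiphp is usually disabled via a kernel command line parameter
# (commonly written acpiphp.disable=1); how it was done in the tests
# described in this mail is not stated.
```

A value of 1 means the device is allowed to enter D3cold, 0 means it is not.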

I have now trashed everything again and settled for GPT, UEFI and root on ext4 (no separate /boot) with /home on a btrfs subvolume as a compromise. This seems to be working fine. As I now have a btrfs /home my problem is also likely not caused by having files open on a btrfs partition.

The problems occurred with kernels 5.7.9-200.fc32 and 5.7.10-201.fc32. I should probably also have tried an older kernel, but have not yet done so (I might try to set up a new non-working test system tomorrow). The nvidia driver packages were version 440.100 from rpmfusion on the new install and a rebuild of the F33 packages of 450.57 on the existing install.

My hardware:
- AMD Ryzen 7 3700X
- Asus ROG STRIX x570-E Gaming (latest BIOS version 2407)
- Samsung 970 EVO PLUS NVMe SSD
- GeForce GTX 960

Tested setups:
- old system, ext4 on MBR, SATA: working
- copy of old system, ext4 on MBR, NVMe: working
- copy on btrfs (ext4 /boot, with nvidia), GPT, UEFI, NVMe: not working
- copy on btrfs (ext4 /boot, with nvidia), MBR, BIOS, NVMe: not working
- copy on btrfs (ext4 /boot, nvidia removed), MBR, BIOS, NVMe: not working
- new install on btrfs (ext4 /boot, without nvidia), MBR, BIOS, NVMe: working
- new install on btrfs (ext4 /boot, with nvidia), MBR, BIOS, NVMe: not working
- copy on ext4 (btrfs /home, with nvidia), MBR, BIOS, NVMe: working
- copy on ext4 (btrfs /home, with nvidia), GPT, UEFI, NVMe: working

So this seems to be unrelated to the partition table type and the boot mode. If it is related to NVMe, that is only one factor. I have only observed it with / on btrfs. On a new install the proprietary nvidia driver is also needed to trigger it, but on my old install it also occurred with the nvidia driver removed.
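
To make that reasoning easier to check, the tested setups can be tallied per factor. The following is just a sketch: the table is a hand transcription of the list of tested setups, the column layout and the function names (setups, tally) are made up for this purpose, and "?" marks a value the list does not state explicitly.

```shell
#!/usr/bin/env bash
# One row per tested setup.
# Columns: root-fs  table  boot  nvidia  origin  result
# "?" = not stated in the list of tested setups.
setups() {
    cat <<'EOF'
ext4  mbr bios ?   old  working
ext4  mbr bios ?   copy working
btrfs gpt uefi yes copy failing
btrfs mbr bios yes copy failing
btrfs mbr bios no  copy failing
btrfs mbr bios no  new  working
btrfs mbr bios yes new  failing
ext4  mbr bios yes copy working
ext4  gpt uefi yes copy working
EOF
}

# tally COLUMN: count working/failing setups per value of that column.
tally() {
    setups \
        | awk -v c="$1" '{ n[$c " " $NF]++ } END { for (k in n) print k, n[k] }' \
        | sort
}
```

"tally 1" (root filesystem) gives 4 failing and 1 working for btrfs and 4 working for ext4, whereas "tally 2" (partition table) shows both working and failing setups for GPT as well as for MBR. With only nine data points this is a hint, not proof.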

Things I have not tried yet (might try when I find the time again):
- older kernel version
- ext4 root but with separate boot partition (unlikely cause)
- non-nvidia graphics card (don't have one)
- logging kernel messages to a different device using some serial output
  (there is a way, right?) to see what really is failing
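
For the last item, one possible starting point is the kernel's suspend test facility in /sys/power/pm_test, which (when the kernel is built with the relevant PM debugging options) runs only part of the suspend sequence and then resumes on its own. The sketch below just reads the currently selected test level. The function name and its optional directory argument are hypothetical, and whether pm_test exists on a given Fedora kernel is an assumption to verify.

```shell
#!/usr/bin/env bash
# current_pm_test [POWER_DIR]: print the active pm_test level, i.e. the
# entry shown in square brackets. POWER_DIR defaults to /sys/power.
current_pm_test() {
    local file="${1:-/sys/power}/pm_test"
    if [ ! -r "$file" ]; then
        echo "pm_test not available (kernel built without PM debugging?)" >&2
        return 1
    fi
    # The file looks like: [none] core processors platform devices freezer
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$file"
}

# Possible test sequence (as root), stepping from the lightest level up:
#   echo freezer > /sys/power/pm_test   # then: systemctl suspend
#   echo devices > /sys/power/pm_test   # then: systemctl suspend
#   echo none    > /sys/power/pm_test   # back to normal suspend
#
# To keep kernel messages flowing during suspend/resume, the kernel
# parameter no_console_suspend can be combined with a serial console
# (e.g. console=ttyS0,115200 where a serial port exists) or with the
# netconsole module to log to another machine over the network.
```

The level at which resume first hangs would narrow down whether tasks, devices or the platform firmware are involved.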

Has anybody else experienced something similar? Is there something I might have missed in the btrfs conversion process?
This might become more relevant with F33, when there will be lots of new btrfs systems.

Best regards,

Lukas
_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx


