Hi,
TL;DR: I have problems with resuming from suspend to RAM on my new Ryzen
computer. I have only seen this happen if I put root onto a btrfs
subvolume, not on ext4. The proprietary nvidia driver seems to be one
additional factor, but I have also seen this with the nvidia driver
removed. Has anyone experienced something similar?
I have upgraded my PC with a Ryzen 7 3700X, Asus ROG STRIX X570-E Gaming
and a Samsung 970 EVO Plus NVMe SSD.
The existing Fedora 32 system from my old SATA SSD worked flawlessly
(with suspend to RAM).
For the last few years I have always used ext4 (previously ext3) on
monolithic root partitions (no separate /boot or /home, but separate
data partitions) on MBR-partitioned SATA disks, booting in legacy BIOS mode.
With the new drive I wanted to make the switch to GPT, UEFI boot and
btrfs (ext4 /boot). I didn't want to install a new system (with those
months of finding programs you have not yet installed) but opted to just
copy over my old F32 system.
So I used gparted to set up the GPT and a 200MiB EFI System partition,
three 500MiB boot partitions (for different distributions or Fedora
versions), a 16GiB swap partition and the rest of the drive as btrfs. In
the btrfs volume I created a fedora32 subvolume with a nested home
subvolume. I then mounted everything (/, /boot, /boot/efi) on my old f30
system and copied over everything from my f32 partition. After bind
mounting /sys, /proc and /dev I chrooted into the new copy, adjusted the
fstab, installed all EFI-related packages, ran grub2-mkconfig and made
sure the kernel paths in /boot/loader/entries were correct. I then
switched to the system rescue mode of a f32 netinstall USB drive booted
in UEFI mode (to get access to the efivars) to install grub with target
x86_64-efi and regenerate the initrds.
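
For reference, the bind-mount and chroot part of the steps above can be
sketched roughly as follows. This is only an illustration under assumptions:
TARGET (where the new root, /boot and /boot/efi are already mounted) and
GRUB_CFG are hypothetical values that are not taken from my setup, the
initrds are regenerated inside the chroot here rather than from the rescue
system, and the grub installation step is left out. With DRY_RUN=1 (the
default) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: TARGET and GRUB_CFG are assumed values, adjust before use.
TARGET="${TARGET:-/mnt/newroot}"              # hypothetical mount point of the new root
GRUB_CFG="${GRUB_CFG:-/boot/grub2/grub.cfg}"  # assumed grub.cfg path inside the chroot
DRY_RUN="${DRY_RUN:-1}"                       # 1 = print the commands instead of running them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Make /sys, /proc and /dev of the running system visible inside the new root.
bind_mounts() {
    for fs in sys proc dev; do
        run mount --bind "/$fs" "$TARGET/$fs"
    done
}

# Regenerate grub.cfg and all initrds from inside the chroot.
rebuild_boot_files() {
    run chroot "$TARGET" grub2-mkconfig -o "$GRUB_CFG"
    run chroot "$TARGET" dracut --regenerate-all --force
}

bind_mounts
rebuild_boot_files
```

Running it unchanged just lists the five commands; setting DRY_RUN=0 as
root would actually execute them against TARGET.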
After that everything booted up and seemed to work until I tried suspend
to RAM. It went to sleep properly, but resuming did not complete. After
waking up it just continued to display the last four kernel messages of
the suspend action (suspending processes, ..., suspending terminal). It
reacted to emergency sync sysrq (HDD LED blinking) but the other sysrq
keys did not seem to work ("u" also provoked a blinking LED sometimes).
This happened from within KDE as well as from a text terminal with
systemctl suspend. Log files after reboot just had entries until shortly
before suspend (processes suspended, all except the last CPU core
disabled, unneeded drives stopped) but not from the attempt to resume.
I assumed this to be caused by the NVMe-SSD and unsuccessfully tried
some suggested solutions that have worked for others with suspend
problems with NVMe-SSDs (disabling acpiphp, disabling d3cold_allowed).
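
As an illustration of the d3cold_allowed workaround mentioned above, here is
a minimal sketch. It assumes that each NVMe controller shows up as
/sys/class/nvme/nvmeN and that its "device" link points to the PCI device
carrying the d3cold_allowed attribute. NVME_CLASS_DIR and the function names
are made up for this example. The acpiphp part is not covered, since that is
a boot-time setting rather than a runtime one.

```shell
#!/bin/sh
# Sketch only: NVME_CLASS_DIR can be overridden to point at a test directory.
NVME_CLASS_DIR="${NVME_CLASS_DIR:-/sys/class/nvme}"

# Print the current d3cold_allowed value of every NVMe controller found.
show_d3cold() {
    for ctrl in "$NVME_CLASS_DIR"/nvme*; do
        attr="$ctrl/device/d3cold_allowed"
        [ -r "$attr" ] || continue
        echo "$(basename "$ctrl"): d3cold_allowed=$(cat "$attr")"
    done
}

# Write 0 to d3cold_allowed for every controller (needs root on a real system).
disable_d3cold() {
    for ctrl in "$NVME_CLASS_DIR"/nvme*; do
        attr="$ctrl/device/d3cold_allowed"
        [ -w "$attr" ] || continue
        echo 0 > "$attr"
    done
}

# Read-only by default; call disable_d3cold explicitly to change the setting.
show_d3cold
```

The setting is not persistent, so it would have to be reapplied after every
boot (and before suspending) to have any effect.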
Since I had too many variables I trashed the content of the new SSD and
started anew with an MBR partition table to boot in legacy BIOS mode. I
just plain cloned the original f32 partition to the NVMe SSD, adjusted
the fstab, updated grub.cfg, recreated the initrds, installed grub to the
MBR and everything worked, including suspend.
I then again did another copy with btrfs root (and ext4 /boot), this
time on MBR with BIOS boot and it again showed the previous suspend
problem. No swap space this time.
I also did a new install of F32 (from Everything Netinstall with Plasma
Workspace profile) with btrfs root and ext4 /boot, which suspended
correctly at the beginning but failed to resume after I installed the
proprietary nvidia driver for my graphics card. Removing the nvidia
driver (and updating grub.cfg and the initrds) returned that install to
a working state.
I then also removed the nvidia driver from the second non-working copy of
my old system (checked that "lsmod | grep nvidia" does not show
anything), but suspend still did not work. This time the resume did not
show the kernel messages but just a black screen with a frozen mouse
pointer. So the
nvidia driver seems to be one way to trigger it but there apparently are
other ways to reach the non-working state.
I have now trashed everything again and settled for GPT, UEFI and root
on ext4 (no separate /boot) with /home on a btrfs subvolume as a
compromise. This seems to be working fine. As I now have a btrfs /home
my problem is also likely not caused by having files open on a btrfs
partition.
The problems were with kernels 5.7.9-200.fc32 and 5.7.10-201.fc32. I
should likely also have tried an older kernel, but have not yet done so
(might try to get a new non-working test setup tomorrow).
Nvidia driver packages were version 440.100 from rpmfusion on the new
install and a rebuild of the f33 packages of 450.57 for the existing
install.
My hardware:
- AMD Ryzen 7 3700X
- Asus ROG STRIX X570-E Gaming (latest BIOS version 2407)
- Samsung 970 EVO Plus NVMe SSD
- GeForce GTX 960
Tested setups:
- old on ext4 on MBR, SATA: working
- copy of old on ext4 on MBR, NVMe: working
- copy on btrfs (ext4 /boot, with nvidia) on GPT, UEFI, NVMe: not working
- copy on btrfs (ext4 /boot, with nvidia) on MBR, BIOS, NVMe: not working
- copy on btrfs (ext4 /boot, nvidia removed) on MBR, BIOS, NVMe: not working
- new on btrfs (ext4 /boot, without nvidia) on MBR, BIOS, NVMe: working
- new on btrfs (ext4 /boot, with nvidia) on MBR, BIOS, NVMe: not working
- copy on ext4 (btrfs /home, with nvidia) on MBR, BIOS, NVMe: working
- copy on ext4 (btrfs /home, with nvidia) on GPT, UEFI, NVMe: working
So this seems to be unrelated to the partition table type and the boot
mode. If it is related to NVMe this is just one factor. I have only
observed it with / on btrfs. On a new install the proprietary nvidia
driver is also needed to trigger this, but on my old install it also
occurred with the nvidia driver removed.
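
To make that reasoning easier to check, the tested setups can be tabulated
and counted per factor. The sketch below encodes each setup as one row (root
file system, nvidia driver, install type, result). The column layout and the
function name are just choices for this example, and "unstated" marks the
two setups where the list does not say whether the nvidia driver was
installed.

```shell
#!/bin/sh
# One row per tested setup: root-fs nvidia install result
SETUPS='ext4 unstated old working
ext4 unstated copy working
btrfs yes copy failing
btrfs yes copy failing
btrfs no copy failing
btrfs no new working
btrfs yes new failing
ext4 yes copy working
ext4 yes copy working'

# Count failing setups per distinct value of the given column (1-3).
failures_by_column() {
    printf '%s\n' "$SETUPS" | awk -v c="$1" '
        { total[$c]++; if ($4 == "failing") bad[$c]++ }
        END { for (k in total) printf "%s %d/%d failing\n", k, bad[k], total[k] }
    ' | LC_ALL=C sort
}

failures_by_column 1   # by root file system
failures_by_column 2   # by nvidia driver
```

Grouped by root file system this gives 4 of 5 failing for btrfs and 0 of 4
for ext4, while the nvidia column is mixed (3 of 5 failing with the driver,
1 of 2 without it).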
Things I have not tried yet (might try when I find the time again):
- older kernel version
- ext4 root but with separate boot partition (unlikely cause)
- non-nvidia graphics card (don't have one)
- logging kernel messages on different device using some serial output
(there is a way, right?) to see what really is failing
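
One approach that might help narrow this down, sketched here under
assumptions: kernels built with CONFIG_PM_DEBUG expose /sys/power/pm_test,
which lets a suspend stop at an intermediate stage (freezer, devices,
platform, ...) and resume by itself after a few seconds. Whether the Fedora
kernels above provide it is not verified here. PM_TEST_FILE and the function
names are made up for this example. Booting with no_console_suspend, or
sending kernel messages to another machine with netconsole, are other
options for getting at the messages from a failed resume.

```shell
#!/bin/sh
# Sketch only: PM_TEST_FILE can be overridden to point at a test file.
PM_TEST_FILE="${PM_TEST_FILE:-/sys/power/pm_test}"

# List the test levels the kernel offers (the active one is shown in brackets
# in the file itself; the brackets are stripped here).
pm_test_levels() {
    if [ ! -r "$PM_TEST_FILE" ]; then
        echo "pm_test not available (kernel without CONFIG_PM_DEBUG?)" >&2
        return 1
    fi
    tr -d '[]' < "$PM_TEST_FILE"
}

# Select a test level, refusing values the kernel does not list.
set_pm_test_level() {
    level=$1
    case " $(pm_test_levels 2>/dev/null) " in
        *" $level "*) echo "$level" > "$PM_TEST_FILE" ;;
        *) echo "unsupported pm_test level: $level" >&2; return 1 ;;
    esac
}

# Possible use as root, one level at a time:
#   set_pm_test_level freezer && systemctl suspend
#   set_pm_test_level devices && systemctl suspend
#   set_pm_test_level none    # back to normal suspend behaviour
```

The first level at which the machine no longer comes back would hint at
which stage of suspend or resume is failing.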
Has anybody else experienced something similar? Is there something I
might have missed in the btrfs conversion process?
This might become more relevant with F33, when there will be lots of new
btrfs systems.
Best regards,
Lukas