On Thu, Feb 20, 2020 at 9:28 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> Nope, it's a major clue! This comes up all the time with performance
> complaints about swap, oh hey throw this vm.swappiness=0 spaghetti at
> the wall. It's entirely plausible this is the origin of this 50%
> business. I'll ask about it on linux-mm@. And I think you're correct
> to point out that this needs to be a documented consequence of using
> this value.
> But how in the world did you get suspicious of your custom
> vm.swappiness value? :D
Once you told me it works for you, it had to be something about my system. And there are not that many places where I changed some default values related to swapping...
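For anyone retracing this, here is a rough sketch of how one might look for such overrides. The function names are made up for illustration, and the default search paths are the usual sysctl.d locations, which is an assumption about the system at hand:

```shell
# List every place a vm.swappiness override is configured.
# Arguments: files or directories to search. The defaults below are the
# usual sysctl configuration locations (an assumption, adjust as needed).
find_swappiness_overrides() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/sysctl.conf /etc/sysctl.d /run/sysctl.d /usr/lib/sysctl.d
    fi
    # -r: recurse into directories, -H: always print the file name,
    # -n: print the line number. Missing paths are silently skipped.
    # Commented-out settings do not match because of the ^ anchor.
    grep -rHnE '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$@" 2>/dev/null
}

# Show the value the running kernel is using right now.
# $1: file to read (defaults to the real procfs entry).
current_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}
```

Calling `current_swappiness` and then `find_swappiness_overrides` with no arguments should show the live value and which file, if any, sets it at boot.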
> Even in the idealized VM environment, I've had a couple failures just
> after hibernation entry where the VM hangs indefinitely and has to be
> killed off. I didn't have a serial console setup to see if the problem
> can be captured via virsh console.
I saw the same problem even on bare metal: the system takes no input, the screen is black or frozen, the CPU goes wild, and it never powers off. But it occurs extremely rarely. I assume it's some kind of firmware bug or a race condition in the kernel. Sometimes, after a hard poweroff and reboot, the system actually resumed! So the hibernation image was saved fine, and only the power-off got stuck. Other times, it did not resume. It's inconvenient, but very rare for me. (I had much more trouble with suspend-to-RAM on my Thinkpad T480s, which often resumed on its own a few minutes after suspend, until I figured out that I needed to disable XHC in /proc/acpi/wakeup. Hibernation woes were golden compared to this.)
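In case it helps someone with the same suspend problem, this is roughly how the XHC wakeup source can be checked and disabled. It is a sketch with hypothetical function names. Note that writing a device name to /proc/acpi/wakeup toggles the state rather than setting it, and the change does not survive a reboot:

```shell
# Print "enabled" or "disabled" for one device listed in /proc/acpi/wakeup.
# $1: device name (e.g. XHC); $2: file to read (defaults to the real one).
wakeup_status() {
    awk -v dev="$1" '$1 == dev { sub(/^\*/, "", $3); print $3 }' \
        "${2:-/proc/acpi/wakeup}"
}

# Writing a device name to /proc/acpi/wakeup flips its state, so only
# write when the device is currently enabled. Needs root.
disable_wakeup() {
    if [ "$(wakeup_status "$1")" = "enabled" ]; then
        echo "$1" > /proc/acpi/wakeup
    fi
}
```

With this, `wakeup_status XHC` shows the current state and `disable_wakeup XHC` (as root) turns it off without accidentally re-enabling it on a second run.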
On Thu, Feb 20, 2020 at 10:33 AM Kamil Paral <kparal@xxxxxxxxxx> wrote:
>
> On Wed, Feb 19, 2020 at 10:13 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>>
>> My test: Fedora Workstation 31, laptop with 8G RAM, 8G swap partition,
>> fill up memory using Firefox tabs pointed to various websites, and
>> then I followed [1] to issue two commands:
>>
>> # echo reboot > /sys/power/disk
>> # echo disk > /sys/power/state
>>
>> I experience twice as many failures as successes. Curiously, the
>> successes show pageout does happen. Before hibernate there is no swap
>> in-use, but after resume ~2GiB swap is in-use and RAM usage is about
>> 50%.
>
>
> Sigh. Turns out this is "my" mistake. 🤦 Hibernation apparently gets affected by sysctl value vm.swappiness, in my case vm.swappiness=0. When the value is zero, the hibernation never swaps out the extra memory over 50% and therefore can't hibernate. When I set it to any positive value (including 1), it works as you described. And all those people on kernel mailing lists probably also used vm.swappiness=0 and didn't realize. This might even be a kernel bug, because the documentation doesn't specify this should affect hibernation behavior, and I'd expect it _should_ affect only live system usage and not hibernation. But I can't really tell.
It can still fail with vm.swappiness=60
https://lore.kernel.org/linux-mm/CAA25o9Q=36fiYHtbpcPPmGEPnORm2ZM7MfqRcsvNxsO0Sys9ng@xxxxxxxxxxxxxx/T/#m68bc843d84284cbd1ea11b543b735ae33e0a8696
And I thought the mystery was resolved. Fun.
So, if you want to debug this further, here are my /proc/meminfo with vm.swappiness=0:
> I wonder if the fellow Fedora
> contributor's workload has a lot of file pages, so that discarding
> them is enough for the image allocator to succeed. In that case "sync;
> echo 1 > /proc/sys/vm/drop_caches" would be a better way of achieving
> the same result.
This is not the case: the scenario above works or fails the same way even if I drop caches beforehand.
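For reproducibility, the test sequence can be sketched as below. It combines the drop_caches suggestion with the two hibernation commands quoted earlier in the thread. The function name and the RUN switch are hypothetical, and it only prints the commands unless RUN=1 is set, since a real run needs root and hibernates the machine:

```shell
# Print (or, with RUN=1, execute) one hibernation test at a given swappiness.
# $1: vm.swappiness value to test.
# Dry run by default: a real run needs root and will hibernate the system.
hibernate_test() {
    for cmd in \
        "sysctl -w vm.swappiness=$1" \
        "sync" \
        "echo 1 > /proc/sys/vm/drop_caches" \
        "echo reboot > /sys/power/disk" \
        "echo disk > /sys/power/state"
    do
        if [ "${RUN:-0}" = "1" ]; then
            # Stop at the first failing step.
            sh -c "$cmd" || return 1
        else
            printf '%s\n' "$cmd"
        fi
    done
}
```

Running `hibernate_test 0` and `hibernate_test 1` prints the two variants being compared here, and `RUN=1 hibernate_test 1` as root would actually perform the test.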
Here are memory printouts with vm.swappiness=1:
I also tried hibernating with RAM almost completely full rather than just slightly over half (16 GB RAM mostly full, 16 GB swap empty), and it worked perfectly with vm.swappiness=1. After resume, half of the pages were in swap and the other half in RAM. So I don't really know why it doesn't work for Luigi with vm.swappiness=60.
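To compare setups before hibernating, a rough check like the following might help. It is only a sketch: the 50% threshold is the behavior observed in this thread, not something taken from kernel documentation, "in use" is approximated as MemTotal minus MemAvailable, and the function name is made up:

```shell
# Report whether memory in use exceeds half of RAM, and whether free swap
# could absorb the excess.
# $1: meminfo-style file to read (defaults to /proc/meminfo).
# Assumption: "in use" is approximated as MemTotal - MemAvailable.
hibernate_headroom() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        /^SwapFree:/     { swapfree = $2 }
        END {
            used = total - avail
            excess = used - total / 2
            if (excess <= 0)
                print "used memory is at or below half of RAM"
            else if (excess <= swapfree)
                printf "%d kB over half of RAM, fits in free swap\n", excess
            else
                printf "%d kB over half of RAM, exceeds free swap\n", excess
        }
    ' "${1:-/proc/meminfo}"
}
```

Running `hibernate_headroom` right before a hibernation attempt gives one number to put next to the success or failure result.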
_______________________________________________
desktop mailing list -- desktop@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to desktop-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/desktop@xxxxxxxxxxxxxxxxxxxxxxx