Re: [Bug 75101] New: [bisected] s2disk / hibernate blocks on "Saving 506031 image data pages () ..."

Rainer Fiebig <jrf@xxxxxxxxxxx> · Thu, 4 Apr 2019 12:48:52 +0200

On 03.04.19 at 22:05, Matheus Fillipe wrote:
> Okay I found a way to get it working and there was also a huge mistake
> on my last boot-config, the resume was commented :P
> I basically followed this: https://askubuntu.com/a/1064114
> but changed to:
> resume=/dev/disk/by-uuid/70d967e6-ad52-4c21-baf0-01a813ccc6ac (just
> the uuid wouldn't work) and this is probably the most important thing
> to do. It worked!
> I also set the resume variable in initramfs to my swap partition but
> this might not be so important anyway since it's automatically
> detected.
> 
> I tested both systemctl hibernate and pm-hibernate, I guess they call
> the same thing anyway. I attached a screenshot. Seems to be working
> fine without uswsusp and with nvidia proprietary drivers!
> 
> On Wed, Apr 3, 2019 at 2:55 PM Rainer Fiebig <jrf@xxxxxxxxxxx> wrote:
>>
>> On 03.04.19 at 18:59, Matheus Fillipe wrote:
>>> Yes I can sorta confirm the bug is in uswsusp. I removed the package
>>> and pm-utils
>>
>> Matheus,
>>
>> there is no need to uninstall pm-utils. You actually need this to have
>> comfortable suspend/hibernate.
>>
>> The only additional option you will get from uswsusp is true s2both
>> (which is nice, imo).
>>
>> pm-utils provides something similar called "suspend-hybrid" which means
>> that the computer suspends and after a configurable time wakes up again
>> to go into hibernation.
>>
>>> and used both "systemctl hibernate" and "echo disk > /sys/power/state"
>>> to hibernate. It seems to succeed and shuts down, I
>>> am just not able to resume from it, which seems to be a classical
>>> problem solved just by setting the resume swap file/partition on grub.
>>> (which i tried and didn't work even with nvidia disabled)
>>>
>>> Anyway uswsusp is still necessary because the default kernel
>>> hibernation doesn't work with the proprietary nvidia drivers as far
>>> as I know and have tested.
>>
>> What doesn't work: hibernating or resuming?
>> And /var/log/pm-suspend.log might give you a clue what causes the problem.
>>
>>>
>>> Is there any way I could get a workaround for this bug on my current
>>> OS by the way?
>>
>> *I* don't know, I don't use Ubuntu. But what I would do now is
>> re-install pm-utils *without* uswsusp and make sure that you have got
>> the swap-partition/file right in grub.cfg or menu.lst (grub legacy).
>>
>> Then do a few pm-hibernate/resume and tell us what happened.
>>
>> So long!
>>
>>>
>>> On Wed, Apr 3, 2019 at 7:04 AM Rainer Fiebig <jrf@xxxxxxxxxxx> wrote:
>>>>
>>>> On 03.04.19 at 11:34, Jan Kara wrote:
>>>>> On Tue 02-04-19 16:25:00, Andrew Morton wrote:
>>>>>>
>>>>>> I cc'ed a bunch of people from bugzilla.
>>>>>>
>>>>>> Folks, please please please remember to reply via emailed
>>>>>> reply-to-all.  Don't use the bugzilla interface!
>>>>>>
>>>>>> On Mon, 16 Jun 2014 18:29:26 +0200 "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx> wrote:
>>>>>>
>>>>>>> On 6/13/2014 6:55 AM, Johannes Weiner wrote:
>>>>>>>> On Fri, Jun 13, 2014 at 01:50:47AM +0200, Rafael J. Wysocki wrote:
>>>>>>>>> On 6/13/2014 12:02 AM, Johannes Weiner wrote:
>>>>>>>>>> On Tue, May 06, 2014 at 01:45:01AM +0200, Rafael J. Wysocki wrote:
>>>>>>>>>>> On 5/6/2014 1:33 AM, Johannes Weiner wrote:
>>>>>>>>>>>> Hi Oliver,
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, May 05, 2014 at 11:00:13PM +0200, Oliver Winker wrote:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) Attached a full function-trace log + other SysRq outputs, see [1]
>>>>>>>>>>>>> attached.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I saw bdi_...() calls in the s2disk paths, but didn't check in detail
>>>>>>>>>>>>> Probably more efficient when one of you guys looks directly.
>>>>>>>>>>>> Thanks, this looks interesting.  balance_dirty_pages() wakes up the
>>>>>>>>>>>> bdi_wq workqueue as it should:
>>>>>>>>>>>>
>>>>>>>>>>>> [  249.148009]   s2disk-3327    2.... 48550413us : global_dirty_limits <-balance_dirty_pages_ratelimited
>>>>>>>>>>>> [  249.148009]   s2disk-3327    2.... 48550414us : global_dirtyable_memory <-global_dirty_limits
>>>>>>>>>>>> [  249.148009]   s2disk-3327    2.... 48550414us : writeback_in_progress <-balance_dirty_pages_ratelimited
>>>>>>>>>>>> [  249.148009]   s2disk-3327    2.... 48550414us : bdi_start_background_writeback <-balance_dirty_pages_ratelimited
>>>>>>>>>>>> [  249.148009]   s2disk-3327    2.... 48550414us : mod_delayed_work_on <-balance_dirty_pages_ratelimited
>>>>>>>>>>>> but the worker wakeup doesn't actually do anything:
>>>>>>>>>>>> [  249.148009] kworker/-3466    2d... 48550431us : finish_task_switch <-__schedule
>>>>>>>>>>>> [  249.148009] kworker/-3466    2.... 48550431us : _raw_spin_lock_irq <-worker_thread
>>>>>>>>>>>> [  249.148009] kworker/-3466    2d... 48550431us : need_to_create_worker <-worker_thread
>>>>>>>>>>>> [  249.148009] kworker/-3466    2d... 48550432us : worker_enter_idle <-worker_thread
>>>>>>>>>>>> [  249.148009] kworker/-3466    2d... 48550432us : too_many_workers <-worker_enter_idle
>>>>>>>>>>>> [  249.148009] kworker/-3466    2.... 48550432us : schedule <-worker_thread
>>>>>>>>>>>> [  249.148009] kworker/-3466    2.... 48550432us : __schedule <-worker_thread
>>>>>>>>>>>>
>>>>>>>>>>>> My suspicion is that this fails because the bdi_wq is frozen at this
>>>>>>>>>>>> point and so the flush work never runs until resume, whereas before my
>>>>>>>>>>>> patch the effective dirty limit was high enough so that image could be
>>>>>>>>>>>> written in one go without being throttled; followed by an fsync() that
>>>>>>>>>>>> then writes the pages in the context of the unfrozen s2disk.
>>>>>>>>>>>>
>>>>>>>>>>>> Does this make sense?  Rafael?  Tejun?
>>>>>>>>>>> Well, it does seem to make sense to me.
>>>>>>>>>> From what I see, this is a deadlock in the userspace suspend model and
>>>>>>>>>> just happened to work by chance in the past.
>>>>>>>>> Well, it had been working for quite a while, so it was a rather large
>>>>>>>>> opportunity window it seems. :-)
>>>>>>>> No doubt about that, and I feel bad that it broke.  But it's still a
>>>>>>>> deadlock that can't reasonably be accommodated from dirty throttling.
>>>>>>>>
>>>>>>>> It can't just put the flushers to sleep and then issue a large amount
>>>>>>>> of buffered IO, hoping it doesn't hit the dirty limits.  Don't shoot
>>>>>>>> the messenger, this bug needs to be addressed, not get papered over.
>>>>>>>>
>>>>>>>>>> Can we patch suspend-utils as follows?
>>>>>>>>> Perhaps we can.  Let's ask the new maintainer.
>>>>>>>>>
>>>>>>>>> Rodolfo, do you think you can apply the patch below to suspend-utils?
>>>>>>>>>
>>>>>>>>>> Alternatively, suspend-utils
>>>>>>>>>> could clear the dirty limits before it starts writing and restore them
>>>>>>>>>> post-resume.
>>>>>>>>> That (and the patch too) doesn't seem to address the problem with
>>>>>>>>> existing suspend-utils binaries, however.
>>>>>>>> It's userspace that freezes the system before issuing buffered IO, so
>>>>>>>> my conclusion was that the bug is in there.  This is arguable.  I also
>>>>>>>> wouldn't be opposed to a patch that sets the dirty limits to infinity
>>>>>>>> from the ioctl that freezes the system or creates the image.
>>>>>>>
>>>>>>> OK, that sounds like a workable plan.
>>>>>>>
>>>>>>> How do I set those limits to infinity?
>>>>>>
>>>>>> Five years have passed and people are still hitting this.
>>>>>>
>>>>>> Killian described the workaround in comment 14 at
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=75101.
>>>>>>
>>>>>> People can use this workaround manually by hand or in scripts.  But we
>>>>>> really should find a proper solution.  Maybe special-case the freezing
>>>>>> of the flusher threads until all the writeout has completed.  Or
>>>>>> something else.
>>>>>
>>>>> I've refreshed my memory wrt this bug and I believe the bug is really on
>>>>> the side of suspend-utils (uswsusp or however it is called). They are low
>>>>> level system tools, they ask the kernel to freeze all processes
>>>>> (SNAPSHOT_FREEZE ioctl), and then they rely on buffered writeback (which is
>>>>> relatively heavyweight infrastructure) to work. That is wrong in my
>>>>> opinion.
>>>>>
>>>>> I can see Johannes was suggesting in comment 11 to use O_SYNC in
>>>>> suspend-utils which worked but was too slow. Indeed O_SYNC is a rather big
>>>>> hammer but using O_DIRECT should be what they need and get better
>>>>> performance - no additional buffering in the kernel, no dirty throttling,
>>>>> etc. They only need their buffer & device offsets sector aligned - they
>>>>> seem to be even page aligned in suspend-utils so they should be fine. And
>>>>> if the performance still sucks (currently they appear to do mostly random
>>>>> 4k writes so it probably would for rotating disks), they could use AIO DIO
>>>>> to get multiple pages in flight (as many as they dare to allocate buffers)
>>>>> and then the IO scheduler will reorder things as well as it can and they
>>>>> should get reasonable performance.
>>>>>
>>>>> Is there someone who works on suspend-utils these days? Because the repo
>>>>> I've found on kernel.org seems to be long dead (last commit in 2012).
>>>>>
>>>>>                                                               Honza
>>>>>
>>>>
>>>> Whether it's suspend-utils (or uswsusp) or not could be answered quickly
>>>> by de-installing this package and using the kernel-methods instead.
>>>>
>>>>
>>
>>

So you got hibernate working now with pm-utils *and* the proprietary
Nvidia drivers. That's good - although a bit contrary to what you said
in Comment 29:

> Anyway uswsusp is still necessary because the default kernel
> hibernation doesn't work with the proprietary nvidia drivers as far
> as I know and have tested

Never mind. Stick with it if you don't need s2both.

What still puzzles me is that while others are having problems,
suspend-utils/uswsusp work for me almost 100 % of the time, except for a
few extreme test-cases in the past. You also said that it worked
"flawlessly" for you until you upgraded your system.

So I'm wondering whether used-up swap space might play a role in this
matter, too. At least for the cases that I've seen on my system, I can't
rule this out. And when I look at the screenshot you provided in Comment
27 (https://launchpadlibrarian.net/417327528/i915.jpg), scarce swap
space could have been a factor in that case as well, because roughly
3.5 GB of free swap space doesn't seem like much for a box with 16 GB
of RAM.
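
To put a number on "not much", here is a minimal sketch that reads the
MemTotal and SwapFree fields from /proc/meminfo-style text and reports
free swap as a fraction of RAM. The function names and the 0.4
threshold are my own assumptions for illustration, not anything
suspend-utils or the kernel uses, and a low ratio is only a hint, not
proof that swap was the cause.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_headroom(info):
    """Return free swap as a fraction of total RAM."""
    return info["SwapFree"] / info["MemTotal"]


def swap_looks_tight(info, min_fraction=0.4):
    """Flag the system if free swap is below min_fraction of RAM.

    min_fraction is a hypothetical threshold chosen for this example,
    not a kernel constant.
    """
    return swap_headroom(info) < min_fraction
```

On a live system one would feed it open("/proc/meminfo").read() right
before calling pm-hibernate or s2disk. With the figures from the
screenshot (about 3.5 GB free swap, 16 GB RAM) the ratio comes out at
roughly 0.22.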
