Thank you all for the replies, and sorry for the delay (vacation + flu). This
has given me various ideas for experiments, and I will try to get to them in
the future. For now, the cgroup workaround (described in the first version of
my patch, but removed later) will do for us.

The purpose of my documentation patch was to make it clearer that hibernation
may fail in situations in which suspend-to-RAM works; for instance, when there
is no swap and anonymous pages exceed 50% of total RAM. I will send a new
version of the patch which hopefully makes this clearer.

From this discussion, it seems that it should be possible to set up swap and
hibernation in a way that increases the probability of successfully entering
hibernation (or maybe makes it a certainty?). It would be useful to include
such a setup in the documentation. I don't know how to do this (yet), but if
anybody does, it would be a great contribution.

Thanks!

On Wed, Jan 8, 2020 at 3:49 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Mon 06-01-20 11:08:56, Luigi Semenzato wrote:
> > On Mon, Jan 6, 2020 at 4:53 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Thu 26-12-19 14:02:04, Luigi Semenzato wrote:
> > > [...]
> > > > +Limitations of Hibernation
> > > > +==========================
> > > > +
> > > > +When entering hibernation, the kernel tries to allocate a chunk of memory large
> > > > +enough to contain a copy of all pages in use, to use it for the system
> > > > +snapshot. If the allocation fails, the system cannot hibernate and the
> > > > +operation fails with ENOMEM. This will happen, for instance, when the total
> > > > +amount of anonymous pages (process data) exceeds 1/2 of total RAM.
> > > > +
> > > > +One possible workaround (besides terminating enough processes) is to force
> > > > +excess anonymous pages out to swap before hibernating. This can be achieved
> > > > +with memory cgroups, by lowering the group's memory limit with ``echo <new limit> >
> > > > +/dev/cgroup/memory/<group>/memory.limit_in_bytes``. However, the latter
> > > > +operation is not guaranteed to succeed.
> > >
> > > I am not familiar with the hibernation process much. But what prevents
> > > those allocations from reclaiming memory and pushing the anonymous memory
> > > out to swap on demand during the hibernation's allocations?
> >
> > Good question, thanks.
> >
> > The hibernation image is stored into a swap device (or partition), so
> > I suppose one could set up two swap devices, giving a lower priority
> > to the hibernation device, so that it remains unused while the kernel
> > reclaims pages for the hibernation image.
>
> I do not think hibernation can choose which swap device to use. Having
> an additional swap device might help, though, because there will be more
> space to swap out to.
>
> > If that works, then it may be appropriate to describe this technique
> > in Documentation/power/swsusp.rst. There's a brief mention of this
> > situation in the Q/A section, but maybe this deserves more visibility.
> >
> > In my experience, the page allocator is prone to giving up in this
> > kind of situation. But my experience is up to 4.X kernels. Is this
> > guaranteed to work now?
>
> OK, I can see it now. I forgot about the ugly hack in the page allocator
> that hibernation is using. If there is no way to make forward progress
> for the allocation and we enter the allocator OOM path (__alloc_pages_may_oom),
> pm_suspended_storage() bails out early and the allocator gives up.
>
> That being said, the allocator would swap out processes, so it doesn't make
> much sense to do that pro-actively. It can still fail if the swap is
> depleted, though, and then the hibernation gives up. This makes some sense
> because you wouldn't like to have something killed by the OOM killer
> while hibernating, right?
> Graceful failure should be the preferable action, letting you decide what
> to do, IMHO.
> --
> Michal Hocko
> SUSE Labs
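For reference, the memcg workaround discussed above can be sketched as a small
shell snippet. This is only an illustration under assumptions: the cgroup name
"prehibernate" and the 256 MiB target are made up, and it assumes a cgroup v1
memory controller mounted at /sys/fs/cgroup/memory (the quoted patch uses a
/dev/cgroup mount point instead; adjust the path to your system).

```shell
# Sketch of the memcg workaround: lower a memory cgroup's limit so that
# excess anonymous pages are reclaimed to swap before hibernating.
# "prehibernate" is a hypothetical cgroup holding the target processes.
CG=/sys/fs/cgroup/memory/prehibernate
NEW_LIMIT=$((256 * 1024 * 1024))    # example target: 256 MiB

if [ -w "$CG/memory.limit_in_bytes" ]; then
    # Lowering the limit below current usage forces reclaim, pushing
    # anonymous pages out to swap.  As the thread notes, this is not
    # guaranteed to succeed (the write fails if reclaim cannot make
    # enough progress), so check the result.
    if echo "$NEW_LIMIT" > "$CG/memory.limit_in_bytes"; then
        echo "limit lowered to $NEW_LIMIT bytes"
    else
        echo "could not lower limit; not enough reclaimable memory?"
    fi
else
    echo "cgroup $CG not present or not writable; nothing to do"
fi
```

Luigi's two-swap-device idea would correspond to something like
``swapon -p <prio> <device>``, giving the hibernation device a lower priority
than the reclaim device; note, however, that Michal doubts hibernation can
choose which swap device it writes the image to.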